messages stringlengths 37.7k 1.08M | instance_id stringlengths 9 30 | rollout_patch stringlengths 171 15.1M | func_name stringlengths 1 81 | func_path stringlengths 10 88 | line_level_recall float64 0 1 | problem_statement stringlengths 129 13.6k | target_patch stringlengths 0 1.54M | docker_image stringclasses 121
values |
|---|---|---|---|---|---|---|---|---|
"[{\"role\": \"system\", \"content\": \"You are a helpful assistant that can interact with a compute(...TRUNCATED) | MONAI_14513 | "diff --git a/monai/transforms/intensity/dictionary.py b/monai/transforms/intensity/dictionary.py\ni(...TRUNCATED) | RandRicianNoised.__call__ | monai/transforms/intensity/dictionary.py | 0 | "# Fix modularity and reusability issues in RandRicianNoised.__call__\n\n## Description\n\nThe `Rand(...TRUNCATED) | "diff --git a/monai/transforms/intensity/dictionary.py b/monai/transforms/intensity/dictionary.py\ni(...TRUNCATED) | jyangballin/swesmith.x86_64.project-monai_1776_monai.a09c1f08 |
"[{\"role\": \"system\", \"content\": \"You are a helpful assistant that can interact with a compute(...TRUNCATED) | click_10676 | "diff --git a/src/click/termui.py b/src/click/termui.py\nindex d30dc19..281d4c7 100644\n--- a/src/cl(...TRUNCATED) | confirm | src/click/termui.py | 1 | "Fix reliability and deterministic behavior issues with click.confirm()\n\nDescription\n\t\t \n\t\t((...TRUNCATED) | "diff --git a/src/click/termui.py b/src/click/termui.py\nindex d30dc19..14db158 100644\n--- a/src/cl(...TRUNCATED) | jyangballin/swesmith.x86_64.pallets_1776_click.fde47b4b |
"[{\"role\": \"system\", \"content\": \"You are a helpful assistant that can interact with a compute(...TRUNCATED) | MONAI_18409 | "diff --git a/monai/transforms/inverse_batch_transform.py b/monai/transforms/inverse_batch_transform(...TRUNCATED) | BatchInverseTransform.__call__ | monai/transforms/inverse_batch_transform.py | 0.5 | "# Resource leak in BatchInverseTransform.__call__ when exceptions occur\n\n### Description\n\nThe `(...TRUNCATED) | "diff --git a/monai/transforms/inverse_batch_transform.py b/monai/transforms/inverse_batch_transform(...TRUNCATED) | jyangballin/swesmith.x86_64.project-monai_1776_monai.a09c1f08 |
"[{\"role\": \"system\", \"content\": \"You are a helpful assistant that can interact with a compute(...TRUNCATED) | paramiko_11400 | "diff --git a/paramiko/transport.py b/paramiko/transport.py\nindex f0fcb97..cbc43bc 100644\n--- a/pa(...TRUNCATED) | SSHClient.invoke_shell | paramiko/client.py | 0 | "SSHClient.invoke_shell side effects and purity issues\nDescription:\nThe SSHClient.invoke_shell() m(...TRUNCATED) | "diff --git a/paramiko/client.py b/paramiko/client.py\nindex d8be910..97fa24c 100644\n--- a/paramiko(...TRUNCATED) | jyangballin/swesmith.x86_64.paramiko_1776_paramiko.23f92003 |
"[{\"role\": \"system\", \"content\": \"You are a helpful assistant that can interact with a compute(...TRUNCATED) | scrapy_11482 | "diff --git a/scrapy/statscollectors.py b/scrapy/statscollectors.py\nindex f3dd0f8..be2d299 100644\n(...TRUNCATED) | StatsCollector.get_stats | scrapy/statscollectors.py | 0 | "[Bug]: StatsCollector.get_stats() backwards compatibility issues\n### Bug summary\n\nStarting with (...TRUNCATED) | "diff --git a/scrapy/statscollectors.py b/scrapy/statscollectors.py\nindex f3dd0f8..422a45f 100644\n(...TRUNCATED) | jyangballin/swesmith.x86_64.scrapy_1776_scrapy.35212ec5 |
"[{\"role\": \"system\", \"content\": \"You are a helpful assistant that can interact with a compute(...TRUNCATED) | sunpy_11863 | "diff --git a/sunpy/map/maputils.py b/sunpy/map/maputils.py\nindex 0f425b7..0ba1f4c 100644\n--- a/su(...TRUNCATED) | all_corner_coords_from_map | sunpy/map/maputils.py | 0 | "# Dependency management issues with all_corner_coords_from_map\n\n## Description\n\nThere are depen(...TRUNCATED) | "diff --git a/sunpy/map/maputils.py b/sunpy/map/maputils.py\nindex 0f425b7..1a15c51 100644\n--- a/su(...TRUNCATED) | jyangballin/swesmith.x86_64.sunpy_1776_sunpy.f8edfd5c |
"[{\"role\": \"system\", \"content\": \"You are a helpful assistant that can interact with a compute(...TRUNCATED) | trio_11346 | "diff --git a/src/trio/_socket.py b/src/trio/_socket.py\nindex 259992b..210f658 100644\n--- a/src/tr(...TRUNCATED) | _SocketType.sendto | src/trio/_socket.py | 0 | "Should `_SocketType.sendto()` be more testable through dependency injection?\n\n### Description\n\n(...TRUNCATED) | "diff --git a/src/trio/_socket.py b/src/trio/_socket.py\nindex 259992b..fa82cbd 100644\n--- a/src/tr(...TRUNCATED) | jyangballin/swesmith.x86_64.python-trio_1776_trio.cfbbe2c1 |
"[{\"role\": \"system\", \"content\": \"You are a helpful assistant that can interact with a compute(...TRUNCATED) | dvc_13075 | "diff --git a/dvc/parsing/context.py b/dvc/parsing/context.py\nindex af8d1b3..3264ef4 100644\n--- a/(...TRUNCATED) | KeyNotInContext.__init__ | dvc/parsing/context.py | 0.75 | "# KeyNotInContext Performance Issue\n\nThere appears to be an issue with performance and efficiency(...TRUNCATED) | "diff --git a/dvc/parsing/context.py b/dvc/parsing/context.py\nindex af8d1b3..3d98091 100644\n--- a/(...TRUNCATED) | jyangballin/swesmith.x86_64.iterative_1776_dvc.1d6ea681 |
"[{\"role\": \"system\", \"content\": \"You are a helpful assistant that can interact with a compute(...TRUNCATED) | pypika_11294 | "diff --git a/pypika/__init__.py b/pypika/__init__.py\nindex 66f564f..c00d467 100644\n--- a/pypika/_(...TRUNCATED) | Term.__truediv__ | pypika/terms.py | 0 | "# Reduce Add/RemoveIndex migration operations\n\nDescription\n\nWe need to optimize migration opera(...TRUNCATED) | "diff --git a/pypika/terms.py b/pypika/terms.py\nindex a277e1a..c4494d1 100644\n--- a/pypika/terms.p(...TRUNCATED) | jyangballin/swesmith.x86_64.kayak_1776_pypika.1c9646f0 |
"[{\"role\": \"system\", \"content\": \"You are a helpful assistant that can interact with a compute(...TRUNCATED) | trio_10018 | "diff --git a/src/trio/_core/_io_epoll.py b/src/trio/_core/_io_epoll.py\nindex 5e05f08..c35e5a4 1006(...TRUNCATED) | EpollIOManager.get_events | src/trio/_core/_io_epoll.py | 0.5 | "RuntimeWarning in EpollIOManager.get_events when using large file descriptors\n\n#### Description\n(...TRUNCATED) | "diff --git a/src/trio/_core/_io_epoll.py b/src/trio/_core/_io_epoll.py\nindex 5e05f08..ddffcd5 1006(...TRUNCATED) | jyangballin/swesmith.x86_64.python-trio_1776_trio.cfbbe2c1 |
YAML Metadata Warning:empty or missing yaml metadata in repo card
Check out the documentation for more information.
This dataset contains 66337 trajectories. Data was generated from the second rollout of SVG on 121 SWE-smith codebases using GLM-4.5-Air as teacher and includes three SVG runs per function. Sera-4.5-Lite-T2 is a subset of this dataset and was used to train SERA-32B-GA.
Schema:
messages: Generated trajectory
instance_id: ID of trajectory
rollout_patch: Created patch to the codebase from the current trajectory
func_name: Name of function sampled from codebase to start the pipeline
func_path: File path to the sampled function
line_level_recall: Minimum patch verification threshold that is satisfied
problem_statement: Problem statement provided to the model
target_patch: Ground truth patch (empty if T1)
docker_image: Docker image used
Verification:
Verification can be done on T2 trajectories by comparing generated rollout patches against the target ground truth patch from T1 trajectories.
We do not verify in our main experiments but provide the metadata to do so in target_patch and rollout_patch.
Note: Apply json.loads() to the messages column to load.
Sera-4.5A-Full-T2 is licensed under the Open Data Commons Attribution License v1.0 (ODC-By). It is intended for research and educational use. For more information, please see our Responsible Use Guidelines.
- Downloads last month
- 116