Dataset Viewer (First 5GB)
Auto-converted to Parquet Duplicate
messages
stringlengths
84.3k
915k
instance_id
stringlengths
9
30
rollout_patch
stringlengths
302
1.54M
func_name
stringlengths
1
81
func_path
stringlengths
10
88
problem_statement
stringlengths
190
851
target_patch
stringclasses
1 value
docker_image
stringclasses
121 values
"[{\"role\": \"system\", \"content\": \"You are a helpful assistant that can interact with a compute(...TRUNCATED)
feedparser_10156
"diff --git a/feedparser/namespaces/_base.py b/feedparser/namespaces/_base.py\nindex 6299b02..08bce1(...TRUNCATED)
Namespace._end_webmaster
feedparser/namespaces/_base.py
"There appears to be an issue with resource cleanup and lifecycle management related to Namespace._e(...TRUNCATED)
jyangballin/swesmith.x86_64.kurtmckee_1776_feedparser.cad965a3
"[{\"role\": \"system\", \"content\": \"You are a helpful assistant that can interact with a compute(...TRUNCATED)
python-pptx_10153
"diff --git a/src/pptx/oxml/__init__.py b/src/pptx/oxml/__init__.py\nindex 21afaa92..fe0d54c3 100644(...TRUNCATED)
register_element_cls
src/pptx/oxml/__init__.py
"There appears to be an issue with code efficiency and resource utilization related to register_elem(...TRUNCATED)
jyangballin/swesmith.x86_64.scanny_1776_python-pptx.278b47b1
"[{\"role\": \"system\", \"content\": \"You are a helpful assistant that can interact with a compute(...TRUNCATED)
conan_10109
"diff --git a/conan/tools/google/bazeldeps.py b/conan/tools/google/bazeldeps.py\nindex fa73a9f..ccc0(...TRUNCATED)
_get_includes
conan/tools/google/bazeldeps.py
"There appears to be an issue with code efficiency and resource utilization related to _get_includes(...TRUNCATED)
jyangballin/swesmith.x86_64.conan-io_1776_conan.86f29e13
"[{\"role\": \"system\", \"content\": \"You are a helpful assistant that can interact with a compute(...TRUNCATED)
MONAI_16593
"diff --git a/monai/transforms/smooth_field/array.py b/monai/transforms/smooth_field/array.py\nindex(...TRUNCATED)
RandSmoothDeform.randomize
monai/transforms/smooth_field/array.py
"There appears to be an issue with code input/output handling and data flow related to RandSmoothDef(...TRUNCATED)
jyangballin/swesmith.x86_64.project-monai_1776_monai.a09c1f08
"[{\"role\": \"system\", \"content\": \"You are a helpful assistant that can interact with a compute(...TRUNCATED)
dask_14923
"diff --git a/dask/array/routines.py b/dask/array/routines.py\nindex dc30e91..ecdb76a 100644\n--- a/(...TRUNCATED)
atleast_2d
dask/array/routines.py
"There appears to be an issue with code consistency and convention adherence related to atleast_2d i(...TRUNCATED)
jyangballin/swesmith.x86_64.dask_1776_dask.5f61e423
"[{\"role\": \"system\", \"content\": \"You are a helpful assistant that can interact with a compute(...TRUNCATED)
dvc_11523
"diff --git a/dvc/analytics.py b/dvc/analytics.py\nindex 6fbb7a9..f4dedec 100644\n--- a/dvc/analytic(...TRUNCATED)
send
dvc/analytics.py
"There appears to be an issue with testing and observability related to send in dvc/analytics.py.\nW(...TRUNCATED)
jyangballin/swesmith.x86_64.iterative_1776_dvc.1d6ea681
"[{\"role\": \"system\", \"content\": \"You are a helpful assistant that can interact with a compute(...TRUNCATED)
MONAI_14689
"diff --git a/monai/transforms/regularization/array.py b/monai/transforms/regularization/array.py\ni(...TRUNCATED)
CutOut.apply
monai/transforms/regularization/array.py
"There appears to be an issue with code scalability and extensibility related to CutOut.apply in mon(...TRUNCATED)
jyangballin/swesmith.x86_64.project-monai_1776_monai.a09c1f08
"[{\"role\": \"system\", \"content\": \"You are a helpful assistant that can interact with a compute(...TRUNCATED)
nikola_11875
"diff --git a/nikola/nikola.py b/nikola/nikola.py\nindex 6f80377..875ea06 100644\n--- a/nikola/nikol(...TRUNCATED)
Nikola.link
nikola/nikola.py
"There appears to be an issue with dependency management and module coupling related to Nikola.link (...TRUNCATED)
jyangballin/swesmith.x86_64.getnikola_1776_nikola.0f4c230e
"[{\"role\": \"system\", \"content\": \"You are a helpful assistant that can interact with a compute(...TRUNCATED)
pandas_14855
"diff --git a/pandas/core/frame.py b/pandas/core/frame.py\nindex 28485bf5fd..ceb9d4d7d7 100644\n--- (...TRUNCATED)
DataFrame.dot
pandas/core/frame.py
"There appears to be an issue with code side effects and function purity related to DataFrame.dot in(...TRUNCATED)
jyangballin/swesmith.x86_64.pandas-dev_1776_pandas.95280573
"[{\"role\": \"system\", \"content\": \"You are a helpful assistant that can interact with a compute(...TRUNCATED)
pydicom_11617
"diff --git a/src/pydicom/data/download.py b/src/pydicom/data/download.py\nindex 30d6c84..17c982a 10(...TRUNCATED)
data_path_with_download
src/pydicom/data/download.py
"There appears to be an issue with data transformation and processing logic related to data_path_wit(...TRUNCATED)
jyangballin/swesmith.x86_64.pydicom_1776_pydicom.7d361b3d
End of preview. Expand in Data Studio

YAML Metadata Warning:empty or missing yaml metadata in repo card

Check out the documentation for more information.

This dataset contains 36607 trajectories. Data was generated from the first rollout of SVG on 121 SWE-smith codebases using GLM-4.5-Air as teacher and includes one SVG runs per function.

Schema:

messages: Generated trajectory
instance_id: ID of trajectory
rollout_patch: Created patch to the codebase from the current trajectory
func_name: Name of function sampled from codebase to start the pipeline
func_path: File path to the sampled function
problem_statement: Problem statement provided to the model
target_patch: Ground truth patch (empty if T1) 
docker_image: Docker image used

Verification:
There is no verification for rollout one.

Sera-4.5A-Lite-T1 is licensed under the Open Data Commons Attribution License v1.0 (ODC-By). It is intended for research and educational use. For more information, please see our Responsible Use Guidelines.

Downloads last month
62

Collection including allenai/Sera-4.5A-Lite-T1