SWE-bench Lite — Eval Explorer

300 instances · UMAP on edit-operation distances · 4 agent models
Panels: Pass rate by fix type · Selected instance