Agentless: Demystifying LLM-based Software Engineering Agents
Abstract
The article discusses Agentless, an agentless approach to automatically solving software development problems. It highlights the limitations of complex autonomous agent-based approaches and proposes a simple two-phase process of localization followed by repair, in which the large language model (LLM) neither decides future actions nor operates complex tools. On the SWE-bench Lite benchmark, Agentless achieves the highest performance among open-source approaches while incurring the lowest cost. The article also analyzes the SWE-bench Lite dataset in detail, identifying problems whose issue descriptions contain the exact ground truth patch or are insufficient or misleading, and constructs a more rigorous benchmark, SWE-bench Lite-S, that excludes them.
Q&A
[01] Agentless Approach
1. What are the key components of the Agentless approach?
- Agentless follows a simple two-phase process: localization and repair.
- In the localization phase, Agentless uses a hierarchical process to first localize the fault to specific files, then to relevant classes or functions, and finally to fine-grained edit locations.
- In the repair phase, Agentless generates multiple candidate patches in a simple diff format, filters out patches that contain syntax errors or fail the repository's existing tests, and then selects the top-ranked patch by majority voting.
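The filter-then-vote step of the repair phase can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each candidate is the full patched Python file rather than a diff, takes the test check as a caller-supplied hypothetical function `passes_existing_tests`, and normalizes candidates through the AST so that patches differing only in comments or spacing count as the same vote.

```python
import ast
from collections import Counter
from typing import Callable, Optional


def is_valid_python(source: str) -> bool:
    """Return True if the patched source parses without a syntax error."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


def normalize(source: str) -> str:
    """Round-trip through the AST to drop comments and spacing differences."""
    return ast.unparse(ast.parse(source))


def select_patch(
    candidates: list[str],
    passes_existing_tests: Callable[[str], bool],  # hypothetical test runner
) -> Optional[str]:
    """Filter candidate patched files, then pick the most frequent one."""
    survivors = [
        c for c in candidates
        if is_valid_python(c) and passes_existing_tests(c)
    ]
    if not survivors:
        return None
    # Majority voting over the normalized form of each surviving candidate.
    votes = Counter(normalize(c) for c in survivors)
    winner, _ = votes.most_common(1)[0]
    # Return the first original candidate that matches the winning form.
    return next(c for c in survivors if normalize(c) == winner)
```

In practice `passes_existing_tests` would apply the candidate to a checkout and run the repository's test suite. It is left abstract here because the summary does not describe that machinery.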
2. How does Agentless differ from prior agent-based approaches?
- Agentless deliberately disallows the LLM from autonomous tool usage or decision planning, unlike prior agent-based approaches that equip the LLM with various tools and allow it to iteratively perform actions and plan future steps.
- Agentless has a simple, straightforward design that is easy to understand and avoids the limitations of LLM agents in software development, such as complex tool usage and design, lack of control in decision planning, and limited ability to self-reflect.
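To make the contrast with agent loops concrete, here is a minimal sketch of a fixed localization pipeline in which ordinary code, not the model, decides what happens next. Everything in it is an assumption for illustration: the `LLM` callable interface, the prompt wording, and the helper names are hypothetical, and the final step (fine-grained edit locations) is omitted for brevity.

```python
import ast
from typing import Callable

LLM = Callable[[str], str]  # hypothetical interface: prompt in, text out


def top_level_names(source: str) -> list[str]:
    """List the top-level class and function names in a Python file."""
    tree = ast.parse(source)
    return [
        node.name
        for node in tree.body
        if isinstance(node, (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef))
    ]


def ask_for_subset(llm: LLM, question: str, options: list[str]) -> list[str]:
    """Ask the model to choose from options; keep only answers that are real options."""
    prompt = question + "\nOptions:\n" + "\n".join(options)
    answers = [line.strip() for line in llm(prompt).splitlines()]
    return [a for a in answers if a in options]


def localize(llm: LLM, issue: str, repo: dict[str, str]) -> dict[str, list[str]]:
    """Fixed two-level localization: files first, then elements within each file."""
    files = ask_for_subset(
        llm, f"Issue: {issue}\nWhich files need edits?", sorted(repo)
    )
    return {
        path: ask_for_subset(
            llm,
            f"Issue: {issue}\nWhich elements of {path} need edits?",
            top_level_names(repo[path]),
        )
        for path in files
    }
```

The model only answers narrow questions. It never selects a tool or chooses the next step, and any answer that does not name an existing file or element is discarded.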
3. What are the key advantages of the Agentless approach?
- Agentless achieves the highest performance among all open-source approaches on the SWE-bench Lite benchmark, resolving 27.33% of the problems.
- Agentless incurs the lowest cost (an average of $0.34 per problem) compared to prior agent-based approaches.
- Agentless demonstrates the overlooked potential of a simple, interpretable technique in autonomous software development.
[02] Analysis of SWE-bench Lite
1. What issues did the authors identify in the SWE-bench Lite dataset?
- The authors found that 4.3% of SWE-bench Lite problems include the exact ground truth patch in the issue description, 9.3% are missing critical information needed to solve the issue, and 4.3% include misleading solutions in the issue description.
2. How did the authors address these issues?
- The authors constructed SWE-bench Lite-S, a more rigorous benchmark that excludes the problematic issues identified in the original SWE-bench Lite dataset.
- The authors hope that SWE-bench Lite-S can serve as a better evaluation platform for assessing the true capabilities of autonomous software development tools.
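Constructing a stricter subset like this amounts to dropping every problem that carries one of the three flags and recomputing the resolve rate on what remains. The sketch below assumes a hypothetical `Problem` record with one boolean per flag; the field names and the toy instances in the usage note are invented for illustration and are not taken from the benchmark.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Problem:
    """Hypothetical record for one benchmark problem and its manual labels."""
    instance_id: str
    has_exact_patch: bool = False
    missing_critical_info: bool = False
    has_misleading_solution: bool = False


def is_problematic(problem: Problem) -> bool:
    """A problem is excluded if any of the three flags is set."""
    return (
        problem.has_exact_patch
        or problem.missing_critical_info
        or problem.has_misleading_solution
    )


def filter_benchmark(problems: list[Problem]) -> list[Problem]:
    """Keep only the problems with none of the flags set."""
    return [p for p in problems if not is_problematic(p)]


def resolve_rate(resolved_ids: set[str], problems: list[Problem]) -> float:
    """Fraction of the given problems whose id appears in resolved_ids."""
    if not problems:
        return 0.0
    return sum(p.instance_id in resolved_ids for p in problems) / len(problems)
```

For example, given four toy problems `demo-1` to `demo-4` where `demo-2` contains the exact patch and `demo-3` lacks critical information, `filter_benchmark` keeps `demo-1` and `demo-4`. A tool that resolved `demo-1` and `demo-2` then scores 0.5 on both the full set and the filtered set, but only one of its two successes still counts after filtering.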
3. What insights did the authors gain from the problem classification?
- The authors found that prior agent-based approaches perform better on problems where the location information or solution steps are provided in the issue description, but struggle more on problems without such helpful information.
- In contrast, the simple Agentless approach performs comparably to the closed-source agent-based tools even on the more challenging problems without location or solution clues.