Working with AI (Part 2): Code Conversion
Abstract
The article discusses how Mantle, a software development company, leveraged large language models (LLMs) to streamline the process of converting a prototype project into a production-ready codebase. The key points covered include:
- The common challenge of converting a prototype project into a production-ready codebase, and why organizations undertake such conversions.
- Mantle's approach of using an LLM to translate the prototype code written in R into their standard production tech stack of Golang and ReactJS.
- The techniques used to provide the LLM with the necessary context, including inserting the existing prototype codebase, summarizing the target architecture and libraries, and incorporating screenshots of the existing application.
- The iterative, file-by-file generation process, working from backend to frontend and starting with leaf-node files, and the challenges posed by output token limits.
- The benefits of this approach, chiefly saving about two-thirds of the time the code conversion would otherwise have taken.
Q&A
[01] Mantle's Approach to Code Conversion
1. What was Mantle's goal in using an LLM for the code conversion task? Mantle's goal was not 100% perfectly crafted code, but to get 80% of the boilerplate and repeated patterns out of the way so that engineers could focus on the remaining 20%: the high-value polish needed to ship the project to customers.
2. How did Mantle provide context to the LLM to ensure the generated code aligned with their production code patterns and libraries? Mantle provided the LLM with several forms of context (see the sketch after this list), including:
- The existing prototype codebase
- A summary of the target architecture and code patterns
- The specific libraries available in their existing codebases (via go.mod and package.json files)
- Screenshots of the existing application to help with frontend code generation
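As an illustration of that kind of context assembly, here is a minimal Go sketch that stitches the prototype source, an architecture summary, and the dependency manifests into a single prompt. The file names, the summary text, and the prompt layout are hypothetical; the article does not show Mantle's actual prompts, and screenshots would be attached to the model request as images rather than inlined as text.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// buildPrompt assembles the kinds of context the article describes:
// a hand-written summary of the target architecture, the dependency
// manifests (go.mod, package.json), and the prototype source files.
func buildPrompt(prototypeFiles []string, architectureSummary string) (string, error) {
	var b strings.Builder

	b.WriteString("You are converting an R prototype to Go and ReactJS.\n\n")
	b.WriteString("## Target architecture and code patterns\n")
	b.WriteString(architectureSummary + "\n\n")

	// Include dependency manifests so the model prefers libraries
	// that already exist in the production codebases.
	for _, manifest := range []string{"go.mod", "package.json"} {
		data, err := os.ReadFile(manifest)
		if err != nil {
			return "", fmt.Errorf("reading %s: %w", manifest, err)
		}
		b.WriteString(fmt.Sprintf("## %s\n%s\n\n", manifest, data))
	}

	// Inline the prototype source the model should translate.
	b.WriteString("## Prototype source\n")
	for _, path := range prototypeFiles {
		data, err := os.ReadFile(path)
		if err != nil {
			return "", fmt.Errorf("reading %s: %w", path, err)
		}
		b.WriteString(fmt.Sprintf("### %s\n%s\n\n", path, data))
	}
	return b.String(), nil
}

func main() {
	// "analysis.R" and the summary string are placeholders.
	prompt, err := buildPrompt([]string{"analysis.R"},
		"Standard HTTP handlers on the backend; ReactJS components on the frontend.")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("prompt is %d characters\n", len(prompt))
}
```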
3. What was Mantle's approach to generating the code file-by-file? Mantle found the best approach was to generate code from backend to frontend, starting with leaf-node files (a bottom-up traversal of the dependency tree) such as utilities, libraries, and the database layer, then moving up to more connected files such as API interfaces and routing. This ordering helped the LLM understand the relationships between files; a sketch of the ordering follows.
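To make the ordering concrete, here is a small Go sketch of a leaf-first (post-order) traversal over a made-up file dependency graph. The file names and dependencies are illustrative, not Mantle's actual project structure.

```go
package main

import "fmt"

// deps maps each file to the files it depends on; a miniature
// stand-in for the kind of project the article describes.
var deps = map[string][]string{
	"util.go":   {},
	"db.go":     {"util.go"},
	"api.go":    {"db.go", "util.go"},
	"routes.go": {"api.go"},
}

// leafFirst returns files in dependency order: leaf nodes first,
// so each file is generated only after everything it depends on
// has already been generated.
func leafFirst(deps map[string][]string) []string {
	var order []string
	visited := map[string]bool{}
	var visit func(string)
	visit = func(f string) {
		if visited[f] {
			return
		}
		visited[f] = true
		for _, d := range deps[f] {
			visit(d) // emit dependencies before dependents
		}
		order = append(order, f)
	}
	for f := range deps {
		visit(f)
	}
	return order
}

func main() {
	fmt.Println(leafFirst(deps)) // e.g. [util.go db.go api.go routes.go]
}
```

Generating in this order means each file's already-generated dependencies can be fed back into the context for the next file.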
[02] Challenges and Lessons Learned
1. What were some of the challenges Mantle faced with the output token limitations? Because output token limits (often around 8,192 tokens) capped how much could be produced per request, Mantle had to generate files one by one, or sometimes section by section. For files that exceeded the limit, they adjusted their approach to generate partial segments that could be concatenated together, as in the sketch below.
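A minimal Go sketch of that concatenation loop, assuming a placeholder generateSegment function in place of a real model call (the article does not name a specific API):

```go
package main

import (
	"fmt"
	"strings"
)

// generateSegment stands in for an LLM API call; the real call and its
// parameters are not specified in the article, so this is a stub. It
// returns the next chunk of the file and whether the model signaled
// that the file is complete.
func generateSegment(prompt, generatedSoFar string) (segment string, done bool) {
	// ... call your model of choice here, passing generatedSoFar
	// and asking it to continue from that point ...
	return "// generated code\n", true
}

// generateFile requests a file in pieces and concatenates the
// segments. maxSegments guards against a model that never signals
// completion.
func generateFile(prompt string, maxSegments int) string {
	var b strings.Builder
	for i := 0; i < maxSegments; i++ {
		segment, done := generateSegment(prompt, b.String())
		b.WriteString(segment)
		if done {
			break
		}
	}
	return b.String()
}

func main() {
	fmt.Print(generateFile("Translate server.R to Go.", 8))
}
```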
2. What was the importance of reviewing and adjusting the generated code output? Mantle emphasized reviewing and adjusting the generated output, since it was crucial to catch problematic patterns early, before they compounded and multiplied across later files. This manual review step helped ensure the quality of the generated code.
3. How does Mantle expect the approach to evolve as token windows and model capabilities improve? The article suggests that as token windows continue to grow and models become more adept at understanding and generating code, the conversion process will see even greater efficiencies and improved code quality, likely leading to more rapid and cost-effective software development.