Ex-Meta scientists debut gigantic AI protein design model
Abstract
The article discusses how AI tools are being used to design entirely new proteins that could transform medicine. It focuses on EvolutionaryScale, a company that has developed a powerful protein language model called ESM3, which generates new proteins to user-provided specifications.
Q&A
[01] AI Tools Designing New Proteins
1. What is the key capability of the AI tool ESM3 developed by EvolutionaryScale?
- ESM3 is a protein language model that was trained on over 2.7 billion protein sequences and structures, as well as information about their functions.
- It can be used to create new proteins to specifications provided by users, similar to how chatbots like ChatGPT generate text.
2. What are some examples of how ESM3 has been used so far?
- The team used ESM3 to create new fluorescent proteins that resemble, yet are distinct from, the well-known green fluorescent protein (GFP).
- Other teams have used earlier versions of the ESM model to design improved antibodies and re-engineer anti-CRISPR proteins.
3. What are the potential applications of AI-designed proteins?
- The article mentions potential applications in drug development, including antibodies and other protein-based drugs, and in sustainability (e.g. designing plastic-eating enzymes).
[02] Concerns and Considerations
1. What concerns were raised about the way the AI-designed proteins are described?
- Computational biologist Anthony Gitter expressed concern that describing the AI-designed proteins as equivalent to "over 500 million years of evolution" is an unhelpful and potentially misleading comparison that could "hurt the field and be dangerous for the public."
2. What safety measures have been taken with the largest version of the ESM3 model?
- The largest version of ESM3, comprising nearly 100 billion parameters, is not publicly available.
- For the smaller open-source version, certain sequences from viruses and a U.S. government list of concerning pathogens and toxins were excluded from the training data.
- EvolutionaryScale has also been in touch with the U.S. Office of Science and Technology Policy regarding the large model, as required by a 2023 presidential executive order.
3. What are the limitations in replicating the largest ESM3 model?
- Structural biologist Martin Pacesa noted that the largest ESM3 model would require immense computing resources to develop independently, and that no academic lab would be able to replicate it.