Written by: Heart of the Metaverse

EvolutionaryScale, a cutting-edge artificial intelligence research laboratory in biology, recently announced that it has received over $142 million in seed round financing and released the milestone AI model ESM3. What unique ideas does this company, which was founded only a year ago, have in the field of AI life sciences? What technological breakthroughs does the new protein model have?

A week ago, while Meta was making waves in the text-to-video race, EvolutionaryScale, the company formed by the protein team that Meta had disbanded, raised more than $142 million in a seed round, an exceptionally large sum by the standards of the biotechnology field.

Last August, Meta officially announced the dissolution of the protein folding team within its FAIR department. A pure "science + AI" project like this could not bring Meta quick profits, so the decision to focus on commercial AI seemed reasonable.

However, in just one year this underappreciated team has proved Meta wrong. Their latest model, ESM3, is considered a landmark generative AI model in the field of biology, opening up new possibilities for programming biology.

01. One-minute project overview

1. Project Name: EvolutionaryScale

2. Establishment date: July 2023

3. Product Introduction:

Developing ESM, a family of large language models for creating new proteins and other biological systems, which has now been iterated to ESM3.

4. Founding team:

  • Chief Scientist: Alexander Rives (PhD in Computer Science from New York University, former Facebook AI scientist)

  • Tom Sercu

  • Sal Candido

5. Financing situation:

On June 25, 2024, a seed round of more than US$142 million was completed. The round was led by Nat Friedman, Daniel Gross and Lux Capital, with participation from Amazon, NVentures (NVIDIA's venture capital arm) and angel investors.

02. Teamwork and consistent pursuit of ideas

Advances in artificial intelligence have created unprecedented opportunities for biological research, including the design of functional biomolecules, especially proteins. Applying artificial intelligence to protein design can not only improve its efficiency and success rate, but also help address pressing challenges, such as responding quickly to infectious disease outbreaks.

Alexander Rives and his colleagues saw this gap in protein design and decided to develop a large deep learning model to move industrial-scale protein design into the "era of fully automatic intelligent generation."

Thus, EvolutionaryScale was born. It is a cutting-edge AI research laboratory focusing on the field of biological sciences, dedicated to launching large language models at the forefront of biology.

Interestingly, all eight members of the company's founding team came from Meta's FAIR (Fundamental Artificial Intelligence Research) department. Despite being let go by the social media giant, the core members did not give up. They quickly moved to a new battlefield and began developing the next generation of models, building on the results of the original team.

EvolutionaryScale's large models support research and development in fields such as health and environmental science, continually exploring how biology can be modeled at scale and powering breakthrough scientific research. One of the most significant results is a breakthrough in protein structure prediction: ESM models have predicted the structures of hundreds of millions of metagenomic proteins, helping scientists around the world model and understand proteins.

EvolutionaryScale aims to guide the development of artificial intelligence technologies in the field of protein design through open and secure research methods.

On this basis, the company, as a signatory, led more than 160 global stakeholders from academia, government and civil society to jointly develop this technology, ensure its safety and reliability, and thus achieve the vision of benefiting human health and society.

It is this sense of responsibility for leading advanced AI technology in the biology community that keeps Alexander Rives and his team moving forward.

Previously, the team released ESM1, which is considered the first transformer language model for proteins and was built by EvolutionaryScale's founders while they worked in Meta's FAIR department. Its successor, ESM2, scales up to 15 billion parameters and performs better than the older ESM1b (which has 650 million parameters).

Last week, EvolutionaryScale released its latest ESM3 AI model, a big step toward the future of biology. With the capabilities of this model, it is possible to accelerate discoveries with broad applications, from creating proteins that help capture carbon to developing new cancer treatments.

03. Pioneer in the application of AI in biology

ESM3 is a generative AI model whose main function is to generate new proteins. The model uses deep learning technology and a large amount of protein data for training to learn the relationship between protein sequence, structure and function.

ESM3 was trained using more than 1 trillion teraFLOPs of compute, the most compute known to have been used to train a biological model. It was trained on a dataset of 2.78 billion proteins drawn from Earth’s natural diversity, enabling it to reason simultaneously about a protein’s sequence, structure, and function.
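For a sense of scale, the compute figure can be converted into raw floating-point operations. The line below is plain unit arithmetic and assumes the figure is a total count of operations rather than a rate per second, which the announcement as quoted here does not spell out:

```latex
\underbrace{10^{12}}_{\text{one trillion}} \ \text{teraFLOPs}
\times 10^{12}\ \frac{\text{FLOPs}}{\text{teraFLOP}}
= 10^{24}\ \text{FLOPs}
```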

The main workflow of ESM3 can be summarized into the following four steps:

  • Data collection and processing: EvolutionaryScale first collects a large amount of biological data from various sources, including gene sequences, protein structures and functional annotations. The data is then cleaned, standardized and formatted for subsequent analysis and use.

  • Model training: Using deep learning algorithms and massive computing resources, EvolutionaryScale trains large language models on the processed data so that they can capture and predict biological patterns. These models are not only highly accurate, but also capable of handling complex biological problems.

  • Generating new proteins: Through interactive prompts, ESM3 is able to generate new proteins that would have taken hundreds of millions of years to evolve in nature.

  • Scientific Validation: The generated novel proteins will be validated through scientific experiments to determine their functions and potential applications.
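The four steps above can be pictured as a toy pipeline. This is only a minimal sketch for intuition, not ESM3 or its API: the function names (`clean`, `train`, `generate`, `validate`), the per-position frequency "model", the `_` mask convention and the short example sequences are all assumptions invented for this illustration. In particular, the real validation step is a wet-lab experiment, which code can only stand in for.

```python
import random
from collections import Counter

AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")  # the 20 standard residues

def clean(raw_sequences):
    """Step 1: normalize case and drop sequences with non-standard characters."""
    cleaned = []
    for seq in raw_sequences:
        seq = seq.strip().upper()
        if seq and set(seq) <= AMINO_ACIDS:
            cleaned.append(seq)
    return cleaned

def train(sequences):
    """Step 2 (toy stand-in for deep learning): count residues per position."""
    if not sequences:
        raise ValueError("need at least one training sequence")
    length = min(len(s) for s in sequences)
    return [Counter(s[i] for s in sequences) for i in range(length)]

def generate(model, prompt, rng):
    """Step 3: fill every masked position ('_') by sampling from the model."""
    if len(prompt) > len(model):
        raise ValueError("prompt is longer than the model's positions")
    out = []
    for i, ch in enumerate(prompt):
        if ch == "_":
            residues, weights = zip(*sorted(model[i].items()))
            ch = rng.choices(residues, weights=weights, k=1)[0]
        out.append(ch)
    return "".join(out)

def validate(candidate, known):
    """Step 4 (placeholder for lab work): keep well-formed, novel candidates."""
    return set(candidate) <= AMINO_ACIDS and candidate not in known

raw = ["mkvla", "MKTLA", "mrvia ", "MKX1A"]  # invented toy sequences
cleaned = clean(raw)
model = train(cleaned)
candidate = generate(model, "M__LA", random.Random(0))
print(candidate, validate(candidate, set(cleaned)))
```

The prompt `"M__LA"` fixes three positions and leaves two for the model to fill, which loosely mirrors the idea of prompting a model with partial information about the protein one wants.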

Currently, one of the most notable use cases of ESM3 is the generation of a new green fluorescent protein (GFP).

GFP is one of the most beautiful and distinctive proteins in nature, responsible for the glow of jellyfish and the bright fluorescent colors of corals. ESM3 generated its new fluorescent protein through a chain of reasoning steps. The result is so far removed from known fluorescent proteins that natural evolution would have needed an estimated 500 million years or more to produce it, whereas ESM3 made the leap computationally.

The release of ESM3 also has major implications for drug discovery and synthetic biology.

In terms of drug discovery, ESM3 can generate new proteins with specific biological activities, providing more candidate molecules for drug screening and optimization. At the same time, ESM3 can also predict and optimize the interaction mechanism between drugs and targets, providing a more scientific basis for drug design and development.

In terms of synthetic biology, ESM3 can generate biological systems with specific functions, providing new solutions for fields such as biomanufacturing and bioenergy. For example, ESM3 can generate enzyme systems that efficiently convert carbon dioxide into organic matter, providing a new way for carbon capture and utilization.

EvolutionaryScale's ESM3 model represents a new milestone in AI in biology. Through its powerful generative capabilities and collaboration with industry leaders, ESM3 is expected to accelerate the discovery of novel proteins and the design of biological systems, bringing revolutionary impacts to future drug development, materials science, environmental science and other fields.

04. Innovation journey in the field of biology

Synthetic biology: programming life

Synthetic biology is an important direction for EvolutionaryScale's future development. By designing and synthesizing new genetic circuits and biological pathways, scientists can create organisms with specific functions.

  • Genetic circuits are similar to electronic circuits, but they control biological processes in cells.

Genetic circuits enable precise control of the expression of specific genes within cells. For example, a genetic circuit can be designed to turn the expression of a specific gene on or off when a cell detects a specific signal, such as a chemical or environmental change.

  • Synthetic biology pathways involve the combination of multiple enzymes and metabolic pathways to produce valuable compounds.

Through AI analysis and design, scientists can create new metabolic pathways that enable organisms to synthesize compounds that cannot be produced under natural conditions. For example, by redesigning the metabolic pathways of microorganisms, microorganisms can produce pharmaceutical intermediates, biofuels or industrial chemicals.

  • A cell factory is a biological system that uses genetic engineering to modify microorganisms so that they can efficiently produce target products under industrial conditions.

Through AI-assisted design, scientists can modify the genomes of microorganisms so that they perform well in production under specific conditions. For example, by editing the genes of yeast or bacteria, scientists can make these microorganisms efficiently produce antibiotics, enzymes or other biological products.

If this technology continues to develop, it will not only advance the frontiers of scientific research, but also open up important applications in fields such as medicine, environmental protection and agriculture.
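The on/off behavior of a genetic circuit described above, where a gene switches on once the cell detects enough of a signal, is often modeled with a Hill activation function. The sketch below uses that textbook model as a stand-in. It is not taken from EvolutionaryScale's work, and the function names, parameter names and default values (`max_rate`, `k_half`, `n`, `threshold`) are assumptions chosen for illustration.

```python
def hill_activation(signal, max_rate=1.0, k_half=1.0, n=2.0):
    """Expression rate of a gene as a function of inducer concentration.

    k_half is the concentration giving half-maximal expression and
    n controls how switch-like the response is.
    """
    if signal < 0:
        raise ValueError("signal concentration must be non-negative")
    if k_half <= 0:
        raise ValueError("k_half must be positive")
    return max_rate * signal**n / (k_half**n + signal**n)

def gene_is_on(signal, threshold=0.5, max_rate=1.0, k_half=1.0, n=2.0):
    """Treat the gene as 'on' once expression reaches a fraction of max_rate."""
    rate = hill_activation(signal, max_rate, k_half, n)
    return rate >= threshold * max_rate

# Sweep a few hypothetical signal concentrations (arbitrary units).
for s in (0.0, 0.1, 1.0, 10.0):
    print(s, round(hill_activation(s), 3), gene_is_on(s))
```

A larger `n` makes the transition sharper, which is the property circuit designers look for when they want a clean switch rather than a gradual dial.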

Data-driven personalized medicine

EvolutionaryScale is driving advancements in personalized medicine through AI and big data analytics, providing patients with more accurate and efficient medical services.

Personalized medicine is about tailoring the most appropriate treatment plan based on each patient's unique biological information and clinical data. One key area is genomic analysis. By fully sequencing and analyzing a patient's genome, scientists can identify genetic variants associated with disease.

EvolutionaryScale uses AI technology to quickly and accurately analyze large amounts of genomic data to identify potential disease risk factors.

This method can help doctors make diagnoses at an early stage of disease and take preventive measures. For example, analyzing mutations in the BRCA1 and BRCA2 genes can help predict a person's risk of breast cancer, allowing for early screening and intervention.
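As a rough picture of the screening idea in the BRCA example, the sketch below filters a list of variant records down to those that fall in a risk gene and were already annotated as harmful. It is a toy under stated assumptions: the `Variant` record, the `pathogenic` flag (assumed to come from some upstream annotation step), the placeholder labels and the gene name `OTHER_GENE` are all hypothetical, and no real patient data or clinical rule is implied.

```python
from dataclasses import dataclass

RISK_GENES = {"BRCA1", "BRCA2"}  # genes named in the example above

@dataclass(frozen=True)
class Variant:
    gene: str
    label: str        # placeholder identifier, not a real variant name
    pathogenic: bool  # assumed output of an upstream annotation step

def flag_for_screening(variants):
    """Return variants that lie in a risk gene and are annotated pathogenic."""
    return [v for v in variants if v.gene in RISK_GENES and v.pathogenic]

# Entirely made-up records for illustration.
patient = [
    Variant("BRCA1", "variant_A", True),
    Variant("BRCA2", "variant_B", False),
    Variant("OTHER_GENE", "variant_C", True),
]
flagged = flag_for_screening(patient)
print([v.label for v in flagged])
```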

Today, EvolutionaryScale is at the forefront of the integration of biology and artificial intelligence, and is committed to programming and optimizing biological systems through continuous innovation and exploration. More technological breakthroughs may be achieved in the future, creating a smarter and healthier future for mankind.