Protein engineering has wide-ranging applications in chemistry, energy, and medicine. However, current methods for engineering proteins with improved or novel functions are slow, labor-intensive, and inefficient, which limits how fully that potential can be exploited.
Protein engineering follows a discovery-driven process: researchers generate hypotheses, design and run experiments, and interpret the resulting data to improve their understanding of biological systems. Because this cycle is iterative, it often takes years to complete. To speed it up, autonomous systems such as robot scientists and self-driving laboratories have been applied in areas including gene identification, chemical synthesis methodology, and materials discovery. These systems can learn from diverse data sources, make decisions under uncertainty, and generate reproducible data, which makes them promising for protein engineering and synthetic biology.
A team of researchers at the University of Wisconsin–Madison has developed the Self-driving Autonomous Machines for Protein Landscape Exploration (SAMPLE) platform, which represents an innovative approach to autonomous protein engineering. The SAMPLE platform consists of an intelligent agent and a fully automated robotic system that collaborate to enhance protein engineering. The agent designs new proteins and learns protein sequence-function relationships, while the robotic system conducts experiments and provides feedback.
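The division of labor between agent and robot can be pictured as a simple closed loop. The sketch below is only an illustration of that loop, not the SAMPLE implementation: `run_closed_loop`, `random_agent`, and `simulated_robot` are hypothetical names, the sequences are toy strings, and the "experiment" is a made-up scoring function standing in for a real assay.

```python
import random
from typing import Callable, Dict, List

# Hypothetical interfaces: the agent picks a sequence, the robot measures it.
Design = Callable[[List[str], Dict[str, float]], str]
Experiment = Callable[[str], float]


def run_closed_loop(
    candidates: List[str],
    design: Design,
    experiment: Experiment,
    rounds: int,
) -> Dict[str, float]:
    """Alternate between design (agent) and measurement (robot)."""
    observations: Dict[str, float] = {}
    for _ in range(rounds):
        untested = [s for s in candidates if s not in observations]
        if not untested:
            break  # every candidate has been measured
        chosen = design(untested, observations)
        # The measurement is fed back so the next design step can use it.
        observations[chosen] = experiment(chosen)
    return observations


rng = random.Random(0)


def random_agent(untested: List[str], observations: Dict[str, float]) -> str:
    """Placeholder agent that ignores the data and picks at random."""
    return rng.choice(untested)


def simulated_robot(sequence: str) -> float:
    """Toy stand-in for an assay: an invented score, not real data."""
    return 50.0 + 2.0 * sequence.count("G")


candidates = ["AAA", "AAG", "AGG", "GGG", "GAG"]
results = run_closed_loop(candidates, random_agent, simulated_robot, rounds=3)
print(results)
```

Swapping `random_agent` for a model-based design function changes the agent without touching the loop, which is the separation the platform description implies.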
To evaluate the SAMPLE platform, the researchers ran 10,000 simulated protein engineering trials using cytochrome P450 data. They compared several Bayesian optimization (BO) strategies for selecting which protein sequences to test (UCB positive, Expected UCB, and standard UCB) against random selection, using the thermostability of the resulting proteins as the measure of performance. The study also explored batch testing and found a slight advantage for smaller batches. The SAMPLE agent relies on a Gaussian Process (GP) model that is trained on sequence-function data and guides its design decisions. For robustness, the platform includes multiple layers of exception handling and data quality control to deal with failed experimental steps.
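To make the GP-plus-UCB idea concrete, here is a minimal sketch of standard upper confidence bound selection on top of a Gaussian Process. It is written under several assumptions that do not come from the study: a one-hot sequence encoding, an RBF kernel with unit prior variance, a noise level of 0.01, an exploration weight `beta` of 2, and invented toy measurements. The "UCB positive" and "Expected UCB" variants are not shown, since the article does not define them.

```python
import numpy as np


def one_hot(seq: str, alphabet: str) -> np.ndarray:
    """Flattened one-hot encoding of a sequence (an assumed representation)."""
    index = {ch: i for i, ch in enumerate(alphabet)}
    x = np.zeros((len(seq), len(alphabet)))
    for pos, ch in enumerate(seq):
        x[pos, index[ch]] = 1.0
    return x.ravel()


def rbf_kernel(A: np.ndarray, B: np.ndarray, length_scale: float = 1.0) -> np.ndarray:
    """RBF kernel with unit prior variance between the rows of A and B."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-0.5 * np.maximum(sq, 0.0) / length_scale**2)


def gp_posterior(X_train, y_train, X_test, noise=1e-2, length_scale=1.0):
    """Posterior mean and standard deviation of a GP with a constant mean."""
    y_mean = y_train.mean()
    K = rbf_kernel(X_train, X_train, length_scale) + noise * np.eye(len(X_train))
    K_s = rbf_kernel(X_train, X_test, length_scale)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train - y_mean))
    mean = K_s.T @ alpha + y_mean
    v = np.linalg.solve(L, K_s)
    var = 1.0 - np.sum(v**2, axis=0)  # prior variance is 1 for this kernel
    return mean, np.sqrt(np.maximum(var, 0.0))


def ucb_select(mean: np.ndarray, std: np.ndarray, beta: float = 2.0) -> int:
    """Index of the candidate with the highest mean + beta * std."""
    return int(np.argmax(mean + beta * std))


alphabet = "AG"
tested = ["AAA", "AAG"]
y_train = np.array([50.0, 55.0])  # invented stability values, for illustration only
untested = ["AGG", "GGG", "GAG"]

X_train = np.array([one_hot(s, alphabet) for s in tested])
X_test = np.array([one_hot(s, alphabet) for s in untested])

mean, std = gp_posterior(X_train, y_train, X_test)
next_sequence = untested[ucb_select(mean, std)]
print(next_sequence)
```

The `beta` term controls the trade-off: larger values favor sequences the model is uncertain about, smaller values favor sequences it already predicts to be stable.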
The SAMPLE platform identified glycoside hydrolase enzymes that were significantly more stable than the initial sequences, with at least a 12°C increase in thermal tolerance. The agents searched less than 2% of the full combinatorial landscape before converging on the most stable designs. Although the top sequence identified by each agent was unique, all of them fell in the same region of the fitness landscape, suggesting that the agents had reached the global fitness peak. Characterization of these machine-designed proteins by human researchers confirmed their enhanced thermostability and showed that they retained catalytic activity.
In conclusion, the SAMPLE platform represents a significant advance in protein engineering and demonstrates the potential of self-driving laboratories to automate and accelerate scientific discovery. Its full autonomy, which integrates learning, decision-making, and experimentation in a single system, marks a major leap over previous semi-autonomous systems. The work underscores how intelligent computational design, automated experimentation, and careful data management can combine to make protein engineering more efficient.
All credit for this research goes to the researchers of this project.
Nikhil is an intern consultant at Marktechpost. He is pursuing an integrated dual degree in Materials at the Indian Institute of Technology, Kharagpur. Nikhil is an AI/ML enthusiast who is always researching applications in fields like biomaterials and biomedical science. With a strong background in Material Science, he is exploring new advancements and creating opportunities to contribute.