Proteomics 101: A primer on an emerging life sciences sector
Proteomics is the study of proteins (the proteome) in a cell, tissue, or organism. This emerging sector of the life sciences is helping scientists gain a deeper understanding of molecular biology. Proteins are large, complex molecules present in all living organisms. They’re created from DNA instructions and have a wide range of functions in cells. Just as genomics (the study of all the DNA in an organism) has been instrumental in helping scientists understand the link between genes and disease, proteomics can do the same at the level of the protein.
The recent decline in the cost of sequencing DNA, combined with the power of cloud computing, has led to an explosion of new tools for analyzing DNA and, by extension, the proteins it encodes. This analysis yields a tremendous amount of data, which in turn requires computational tools to interpret. Multiple companies have now developed tools for analyzing both the biology and the resulting data, a capability that is critical for discovering new drug targets for disease.
“If we can build computational tools that help increase humanity's understanding of biology, it could have a massive positive impact on drug discovery and bio-engineering,” says Lucas Siow, Co-founder and CEO at ProteinQure, a startup that is building a computational platform for protein drug discovery.
The scientific and medical communities have long known that proteins provide important information about human biology and disease, but until recently it has been impossible to measure the proteome. Proteins have already been used successfully in the detection and treatment of certain diseases. In diagnostics, a test looks for a specific protein that serves as a biomarker for a disease (such as the antigen tests that detect specific proteins from the coronavirus). In treatment, proteins themselves can be the medication, as with insulin for diabetes. These applications are just the beginning of what we can do with proteins. We are now on the verge of unlocking the potential of proteins as a tool.
“Roughly 10 percent of human diseases are targetable by small molecules, but for the vast majority, you actually need a protein,” Siow says. “Their large size makes them much more complicated in shape, which is what makes them hard to make, but that allows them to impact biology in so many more unique ways.”
This burgeoning industry is gaining momentum, and The Business Research Company included proteomics on its list of the major markets expected to emerge after COVID-19. The next decade could be a crucial turning point. The Business Research Company expects the global proteomics market to grow from $21.34 billion in 2020 to $39.80 billion in 2025, a compound annual growth rate (CAGR) of 13.3%. The market is then expected to reach $73.25 billion in 2030.
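The growth figures above can be sanity-checked with a few lines of arithmetic. The sketch below assumes the quoted 13.3% is a compound annual growth rate over the five years from 2020 to 2025; the function name `cagr` is ours for illustration and does not come from the report.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate implied by a start value, an end value, and a span in years."""
    return (end_value / start_value) ** (1 / years) - 1


# Market sizes in billions of US dollars, as quoted in the article.
rate_2020_2025 = cagr(21.34, 39.80, 5)
rate_2025_2030 = cagr(39.80, 73.25, 5)

print(f"2020-2025: {rate_2020_2025:.1%}")  # about 13.3%, matching the quoted rate
print(f"2025-2030: {rate_2025_2030:.1%}")  # implied rate for the second five-year span
```

Under that assumption, the 2030 projection implies a growth rate close to, but slightly below, the rate quoted for the first five years.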
General schema showing the relationships of the genome, transcriptome, proteome, and metabolome (lipidome). Source: Wikipedia.
Over the last 20 years, the time and cost of genomic analysis have dropped dramatically. (Read our Synbio Primer for more on synthesizing DNA.) Proteomics, on the other hand, is progressing at a slower pace. The quantity, variability, and complexity of proteins make proteomics an even greater challenge than genomics.
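DNA encodes the building blocks of proteins, called amino acids. Each triplet of DNA bases (a codon) encodes one amino acid, but several different codons can encode the same amino acid, so the mapping from DNA to protein is not one-to-one. Chains of amino acids then fold into proteins, and this is only the first of many layers that contribute to the dynamism and complexity of the proteome.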
The codon wheel above can be used to translate DNA codons into amino acids. Image credit: Genome Research Limited
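The lookup a codon wheel performs can also be sketched in code. The minimal example below assumes the standard genetic code and a DNA coding sequence read from its first base. The names `CODON_TABLE`, `translate`, and `synonymous_codons` are illustrative and not taken from any particular library. Amino acids are written as one-letter codes, and `*` marks a stop codon.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G base order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each of the 64 three-base codons to its one-letter amino acid code.
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence codon by codon, stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":  # stop codon: translation ends here
            break
        protein.append(amino_acid)
    return "".join(protein)


def synonymous_codons(amino_acid: str) -> list[str]:
    """Return every codon that encodes the given one-letter amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


# Two different DNA sequences that encode the same short peptide.
print(translate("ATGGCCTTATAA"))  # MAL
print(translate("ATGGCGCTGTAG"))  # MAL
print(synonymous_codons("L"))     # the six codons for leucine
```

The two sequences differ in their second, third, and stop codons yet yield the same peptide, which shows the redundancy of the genetic code. It also shows why a protein sequence alone does not determine the DNA that produced it.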
DNA is also very static: it is faithfully replicated from cell to cell over the lifetime of an organism, and its composition barely changes, except in the notable case of a genetic mutation, which can cause disease. Proteins, on the other hand, are much more dynamic, and this is the main reason it has been so challenging to map the proteome the way we have mapped the genome. Once they are produced, proteins can be continually modified through “post-translational modifications” (PTMs) in response to environmental factors. Thanks to PTMs, any given protein’s structure and function are fluid, differing over time and from cell to cell, in stark contrast to the stability of DNA.
Because of this inherent complexity, it has been challenging to map the proteome and to make sense of the resulting data. But this complexity is exactly why proteomics is worth pursuing. It is a technologically challenging task with tremendous potential for diagnosing disease and identifying targets for treatment.
Computational tools for the life sciences are now far more widely available. Recent advances in genomics, together with access to computational power and cloud computing, are making it easier to measure, engineer, and synthesize the large molecules that could treat many diseases.
The applications for proteomics are just getting started. In addition to the current protein-based diagnostics (such as finding proteins that identify or quantify a disease), we’re now starting to unlock the ability to find a much wider variety of proteins. Proteomics-based tests could replace older diagnostic tests that are often invasive and expensive. Much like the way genomics has improved both fundamental research and personalized medicine, mapping the proteome could provide new insights into health and disease for individual patients and the overall population.
This deeper understanding of proteins will open new opportunities for drug development. Proteomic techniques enable us to better understand the nature of protein-drug interactions and even to create novel proteins that target specific diseases, helping scientists develop better drugs for unmet medical needs. Beyond therapeutics, the ability to identify and engineer proteins could also improve food, agriculture, and animal health.
Proteomics is a new industry with many potential applications. The early players are all taking different technological approaches to solve the problem of how to analyze the proteome. The answer could involve machine learning, computer simulations, and a variety of algorithms.
“The problem is so massive and valuable and we don’t know the right path yet,” says Siow. “I think different approaches will make sense for different cases and there will be many winners.”