Mass spectrometry and drug development – how the two come together
From the identification of a single protein to whole systems biology approaches, mass spectrometry (MS) is one of the most valuable tools available. It has also been used to examine protein-protein interactions1,2 and post-translational modifications, and to study interactions with DNA, RNA and small molecules3.
One major advantage of MS over traditional methods is that it enables multiplexing: many different proteins or metabolites can be identified and quantified in one analysis. With these capabilities, MS has changed the way biologists look at the complexity of their systems. With the availability of record numbers of genome sequences, including many individual human genomes, experiments can be carried out to look at natural variation and also the background of diseases. But this work is still at its very beginning; while hundreds of thousands of publications introduce new biomarkers for disease, the success rate for these biomarkers is still extremely low, with only a few new ones approved yearly4-8. To overcome this, all disciplines, from basic biological research to medicine, and from bench to bedside, have to communicate and work together closely.
Another big challenge is the application of MS in drug development. For a long time, MS in the early stages of drug development was mainly associated with chemistry: the integrity of compounds and whole libraries, drug metabolism, and preclinical or clinical pharmacokinetic studies. Use of MS to identify potential drug targets, i.e., proteins directly associated with a disease, via biological experiments, was primarily carried out in academic research environments. The main dogma was ‘one-target-one-drug’: one defined protein inhibited by one compound, generating the desired response in humans. In the industrial environment these targets are used to generate assays for drug development.
These in vitro assays are still state-of-the-art, but they neglect the complexity of biological systems. A major drawback of this simplified approach is that side effects are not observed early on. The aim of drug development has to be to produce a safe drug with no or only minor side effects and no toxicity, and preferably one that is personalised so that only those patients who will respond in a positive, desired way receive it. This requires knowledge of the target proteins and of the response of the whole biological system to the drug treatment. Genome sequencing will allow the response of an individual patient to the drug to be determined, but this still requires a lot of basic academic and clinical investigation.
Work carried out over the last few years, and still ongoing today, offers a much more global use of MS in drug development. One major issue has been to address, early on, the potential side effects of drug molecules under development. A drug may pass the in vitro assay but fail the subsequent tests in animals or even humans, which contributes to the high attrition rate. Early identification of potentially severe side effects will help to reduce this attrition rate, as compounds that interact with ‘unwanted’ targets are dealt with before the next preclinical steps.
One example of how early identification of a severe side effect could have prevented a human catastrophe, had the technique been available 50 years ago, is the famous case of Contergan (thalidomide). It was marketed by the German company Chemie Grünenthal from 1957 as a treatment for morning sickness in expectant mothers and became notorious for causing phocomelia in over 10,000 children. Only half of these children survived. It was only in 2010 that a Japanese group, using the chemical proteomics approach, discovered the reason for this severe side effect9,10. Today, thalidomide is back as the state-of-the-art anti-leprosy drug11,12, but it still carries the high risk of phocomelia for unborn children.
Emergence of chemical proteomics
Proteomics is the general approach to identifying and quantifying all proteins of a cell, a tissue, an organ or the whole organism in a defined state. Marc Wilkins coined the term ‘proteome’, from which proteomics takes its name, in 1994, almost 10 years before the first human genome was sequenced. By enabling the identification and quantification of proteins in the absence or presence of drugs, proteomics helps improve our understanding of the interactions of such drugs with given proteins. With the availability of human genome sequences, as well as those of many other species, proteomics has become the state-of-the-art method to study biological systems, with proteins identified based on their unique amino acid sequences.
To understand the actions of drugs and other small molecules on a biological system, the targeted proteins have to be identified. Chemical proteomics is an unbiased method for identifying the proteins that bind with medium to strong affinity to any small molecule of interest with biological action13,14. In this approach, the drug molecule is covalently bound to a matrix and lysates of biological sources are passed over it. Proteins binding to the drug molecule are retained, the non- or weakly-binding proteins are washed away, and the bound proteins are then identified by MS. This approach identifies not only the proteins the drug is intended to target, but also all other proteins that bind the drug, with or without biological consequences.
The major challenge with this method is adequately immobilising the small molecule. This can be achieved via a linker, a linear molecule that should not interfere with binding to the target protein. Linkers have to be attached at different positions of the compound, which requires skilled chemistry resources and which can fail in the case of natural compounds. However, linkers at different positions can also reveal binding to different proteins, as was shown for two Parkinson’s disease drugs15. This simple chemical proteomics approach can also be expanded to competition experiments, which identify new compounds binding to the proteins. The lysate is pre-incubated with the unmodified compound of interest and then applied to immobilised compounds targeting specific protein classes, for example kinases16. If the soluble compound competes with the immobilised compound for binding, the protein is no longer captured. This competition can be measured by MS, yielding quantitative information about the binding affinities. This approach has been used to study drugs in clinical use, such as major clinical kinase inhibitors for the treatment of cancer.
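How a competition experiment leads to affinity information can be illustrated with a minimal Python sketch. It assumes a simple one-site competition model, in which the fraction of protein still captured on the matrix is r = 1 / (1 + c / IC50) at competitor concentration c. The model, the function names and all intensity values are illustrative assumptions and are not taken from the cited studies; the result is an apparent IC50, not a true dissociation constant.

```python
import math


def residual_binding(intensity_with_competitor, intensity_vehicle):
    """Fraction of protein still captured, relative to the vehicle control."""
    return intensity_with_competitor / intensity_vehicle


def estimate_ic50(concentrations, residuals):
    """Apparent IC50 under the assumed model r = 1 / (1 + c / IC50).

    Each point with partial competition gives IC50 = c * r / (1 - r);
    the geometric mean over all such points is returned.
    """
    logs = []
    for c, r in zip(concentrations, residuals):
        # Points without partial competition carry no usable information.
        if c > 0 and 0.0 < r < 1.0:
            logs.append(math.log(c * r / (1.0 - r)))
    if not logs:
        raise ValueError("no data points with partial competition")
    return math.exp(sum(logs) / len(logs))


# Hypothetical MS intensities for one kinase (arbitrary units)
concentrations_nm = [10.0, 100.0, 1000.0, 10000.0]
vehicle_intensity = 1.0e6
intensities = [9.1e5, 5.0e5, 9.0e4, 1.0e4]

residuals = [residual_binding(i, vehicle_intensity) for i in intensities]
ic50_nm = estimate_ic50(concentrations_nm, residuals)
print(f"apparent IC50: {ic50_nm:.0f} nM")
```

In practice a full dose-response curve would be fitted for every protein quantified in the experiment; the point-wise estimate here is only meant to show where the quantitative information comes from.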
Numerous studies have been published on chemical proteomics, making it a major tool in drug development and in the early identification of potential off-targets causing side effects and toxicity. The interest of the pharmaceutical industry is reflected by the acquisition of Cellzome (Heidelberg, Germany), a company based on chemical proteomics, by GlaxoSmithKline.
To overcome the limitations of chemical proteomics, in particular the need to immobilise the compound, developments have been made within the area of thermal shift assays. Here, cell lysates or intact cells are incubated with the compounds of interest and then heated. Because ligand binding typically stabilises a protein against heat-induced denaturation, binding proteins are identified as those that remain soluble at elevated temperature17,18. Simple centrifugation sediments the denatured proteins and MS allows for the identification of the remaining soluble proteins. This approach is unbiased; whatever protein binds to the small molecule can be identified. It opens new avenues for drug development and makes it possible to identify the targets of natural compounds.
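The read-out of such a thermal shift experiment can be sketched in a few lines of Python. The sketch assumes that, for each protein, the MS signal of the soluble fraction has been normalised to the lowest temperature, and it takes the melting temperature as the point where the curve drops to 0.5 by linear interpolation. The function names and all data values are hypothetical; published workflows usually fit sigmoidal melting curves instead.

```python
def melting_temperature(temperatures, soluble_fractions):
    """Temperature at which the soluble fraction drops to 0.5.

    Uses linear interpolation between the two neighbouring measurements.
    """
    points = list(zip(temperatures, soluble_fractions))
    for (t_lo, f_lo), (t_hi, f_hi) in zip(points, points[1:]):
        if f_lo >= 0.5 > f_hi:
            return t_lo + (f_lo - 0.5) * (t_hi - t_lo) / (f_lo - f_hi)
    raise ValueError("curve does not cross 0.5 in the measured range")


def thermal_shift(temperatures, vehicle_curve, treated_curve):
    """Shift in melting temperature caused by the compound (treated - vehicle)."""
    return (melting_temperature(temperatures, treated_curve)
            - melting_temperature(temperatures, vehicle_curve))


# Hypothetical normalised soluble fractions for one protein
temps_c = [37, 41, 45, 49, 53, 57, 61]
vehicle = [1.00, 0.95, 0.80, 0.50, 0.20, 0.05, 0.00]
treated = [1.00, 0.98, 0.92, 0.80, 0.50, 0.20, 0.05]

delta_tm = thermal_shift(temps_c, vehicle, treated)
print(f"melting temperature shift: {delta_tm:+.1f} degrees C")
```

A protein whose melting temperature rises in the presence of the compound is a candidate binder; proteins with no shift are treated as non-binders under this simplified criterion.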
Fragment-based drug discovery (FBDD) is a relatively new approach to developing new drugs. High-throughput screening approaches have often been used to identify new drugs, testing millions of compounds with molecular weights of around 500 Da and micro- to nanomolar affinities, but weakly binding molecules are not found by most of these approaches. In FBDD a very limited number of molecules with molecular weights up to 200 Da and millimolar affinities are docked onto the protein to identify weak binders, often using computational approaches or biological assays. These fragments are then combined into larger molecules and finally into the active drug. This approach requires a defined protein target, unlike the unbiased approaches described above.
The group of S. Ohlson in Sweden has developed a method to identify weakly binding fragments: weak-affinity chromatography19,20. Here, the target protein is immobilised onto a matrix in a high-performance liquid chromatography column. A mixture of fragments is injected onto this column; non-binders elute in the void volume and weak binders are retained. Using high-resolution MS as the detector, the retained fragments can be identified. This is a multiplexed method; more than 50 fragments can be analysed together.
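The link between retention and affinity can be made concrete with a short Python sketch. It assumes the standard linear-chromatography relation K_D = B_tot / (V_R - V_0), which holds when the fragment concentration is far below K_D; here B_tot is the amount of active immobilised protein, V_R the retention volume and V_0 the void volume. The relation is a textbook assumption rather than a description of the cited method, and the function name, fragment names and numbers are hypothetical.

```python
def dissociation_constant(retention_volume_l, void_volume_l, active_sites_mol):
    """Estimate K_D in mol/L from K_D = B_tot / (V_R - V_0).

    Assumes linear conditions, i.e. fragment concentration well below K_D.
    """
    adjusted_volume = retention_volume_l - void_volume_l
    if adjusted_volume <= 0:
        raise ValueError("fragment elutes in the void volume: no retention")
    return active_sites_mol / adjusted_volume


# Hypothetical column: 0.1 mL void volume, 20 nmol active immobilised protein
VOID_VOLUME_L = 1.0e-4
ACTIVE_SITES_MOL = 2.0e-8

# Hypothetical retention volumes (in litres) for two fragments
retention_volumes = {"fragment_A": 1.2e-4, "fragment_B": 3.0e-4}

for name, v_r in retention_volumes.items():
    kd = dissociation_constant(v_r, VOID_VOLUME_L, ACTIVE_SITES_MOL)
    print(f"{name}: K_D ~ {kd * 1e3:.2f} mM")
```

Under these assumptions a fragment that elutes only slightly after the void volume has a millimolar K_D, while stronger retention corresponds to a lower K_D.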
These are only a few examples of how modern MS can be involved in drug discovery and development. The major challenge is still the complexity of all biological systems. The human genome reveals that there are around 22,000 genes coding for proteins. But the final active proteoforms, i.e., the proteins with a defined amino acid sequence and a specific set of post-translational modifications or isoforms, mutations and polymorphisms, may number between 1,000,000 and 1,000,000,000 different forms, and only very few examples have been evaluated21,22. Nobody yet knows what the final number is. Out of the many proteoforms based on one protein sequence, only a few may bind to a drug and show the desired response. This still requires basic and applied research in which MS will be a major tool, provided that the correct questions are asked, the experiments are well designed and the required controls are included.
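The scale of these numbers can be illustrated with a toy calculation in Python. It assumes, purely for illustration, that every protein carries the same number of independent sites that are each either modified or unmodified, so that a protein with n such sites has 2^n possible forms. This model is not from the cited work and ignores isoforms, mutations and the fact that not all combinations occur in nature.

```python
import math

PROTEIN_CODING_GENES = 22_000  # figure quoted in the text


def proteoform_count(genes, binary_sites_per_protein):
    """Total forms if every protein has the same number of on/off sites."""
    return genes * 2 ** binary_sites_per_protein


def sites_needed(genes, target_forms):
    """Average number of on/off sites per protein needed to reach a total."""
    return math.log2(target_forms / genes)


print(proteoform_count(PROTEIN_CODING_GENES, 6))
print(round(sites_needed(PROTEIN_CODING_GENES, 1_000_000), 1))
print(round(sites_needed(PROTEIN_CODING_GENES, 1_000_000_000), 1))
```

Under this toy model, about six variable sites per protein are enough to pass one million forms and fewer than sixteen are enough to reach one billion, which shows how quickly the combinatorial space grows.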