By Ivan Korendovych

Directed evolution is a powerful tool for improving existing properties and imparting completely new functionalities to proteins. Nonetheless, its potential in even small proteins is inherently limited by the astronomical number of possible amino acid sequences. Sampling the complete sequence space of a 100-residue protein would require testing 20^100 combinations, which is beyond any existing experimental approach. In practice, selective modification of relatively few residues is sufficient for efficient improvement, functional enhancement and repurposing of existing proteins. Moreover, computational methods have been developed to predict the locations and, in certain cases, identities of potentially productive mutations. Importantly, all current approaches for prediction of hot spots and productive mutations rely heavily on structural and/or bioinformatic information, which is not always available for proteins of interest. They also offer a limited ability to identify beneficial mutations far from the active site, even though such changes may markedly improve the catalytic properties of an enzyme. Machine learning methods have recently shown promise in predicting productive mutations, but they frequently require large, high-quality training datasets, which are difficult to obtain in directed evolution experiments. Here we show that mutagenic hot spots in enzymes can be identified using NMR spectroscopy. In a proof-of-concept study, we converted myoglobin, a non-enzymatic oxygen storage protein, into a highly efficient Kemp eliminase using only three mutations. The observed levels of catalytic efficiency exceed those of proteins designed using current approaches and are similar to those of natural enzymes for the reactions that they have evolved to catalyze.
Given the simplicity of this experimental approach, which requires no a priori structural or bioinformatic knowledge, we expect it to be widely applicable and to enable the full potential of directed enzyme evolution.