By Arvin Moser
A key question for natural products chemists is whether a newly isolated compound is truly novel or already known. Answering it, a process commonly referred to as dereplication, is also part of competitive product analysis, drug counterfeit analysis, reaction discovery, and similar work. Several methods are used for dereplication, including comparison of the retention time and observed molecular ion from LC-MS, and comparison of the NMR spectrum against spectral/structure libraries. When NMR data are used, 13C NMR spectra are preferred [1] in most cases, as they provide a very clear fingerprint of the compound’s carbon skeleton. We have previously explored the benefits of using databases of predicted spectra. Several such options are currently available, both commercial [2] and free [3-4]. In this poster, we explore the capabilities and requirements of such systems.
To establish a starting point, we selected 56 compounds from the Aldrich library of FT-NMR spectra with molecular weights (MW) in the range 150-800, where most pharmaceutically active compounds are found. We then searched a library containing the predicted 13C spectra of these compounds, using the experimentally observed peak frequencies together with the MW.
We explored two search options: including or excluding the MW information, and accepting or rejecting hits that have extra or missing peaks in the experimental spectrum. We found that the MW information is essential, as it provides a very clear starting point and deals effectively with symmetric compounds. We also found that as the MW increases, the uncertainty and the number of missing or extra peaks increase as well. However, with careful adjustment of the search parameters, the correct result can be identified within a few seconds.
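As an illustration of the search options described above, the following is a minimal sketch, not the software used in this work. It assumes that a library entry holds a MW and a list of predicted 13C shifts, that "extra" means an experimental peak with no predicted counterpart, and that "missing" means a predicted peak not observed experimentally. All names, tolerances, and the toy shift values are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class LibraryEntry:
    name: str
    mw: float
    predicted_shifts: List[float]  # predicted 13C shifts in ppm


def match_peaks(observed: List[float], predicted: List[float],
                tol_ppm: float) -> Tuple[List[float], int, int]:
    """Greedy one-to-one matching of two shift lists.

    Returns (deviations of matched pairs, missing count, extra count).
    """
    obs, pred = sorted(observed), sorted(predicted)
    i = j = 0
    deviations: List[float] = []
    missing = extra = 0
    while i < len(obs) and j < len(pred):
        diff = abs(obs[i] - pred[j])
        if diff <= tol_ppm:
            deviations.append(diff)
            i += 1
            j += 1
        elif obs[i] < pred[j]:
            extra += 1    # experimental peak without a predicted partner
            i += 1
        else:
            missing += 1  # predicted peak not found in the experiment
            j += 1
    extra += len(obs) - i
    missing += len(pred) - j
    return deviations, missing, extra


def search(library: List[LibraryEntry], observed: List[float],
           mw: Optional[float] = None, mw_tol: float = 0.5,
           shift_tol: float = 2.0, max_missing: int = 0,
           max_extra: int = 0) -> List[Tuple[int, float, str]]:
    """Return hits as (unmatched peaks, mean deviation, name), best first."""
    hits = []
    for entry in library:
        # Optional MW filter: skipped entirely when mw is None.
        if mw is not None and abs(entry.mw - mw) > mw_tol:
            continue
        deviations, missing, extra = match_peaks(
            observed, entry.predicted_shifts, shift_tol)
        if missing > max_missing or extra > max_extra:
            continue
        mean_dev = (sum(deviations) / len(deviations)
                    if deviations else float("inf"))
        hits.append((missing + extra, mean_dev, entry.name))
    return sorted(hits)


# Toy data with made-up values, for illustration only.
library = [
    LibraryEntry("A", 180.16, [20.9, 122.0, 126.1, 131.5, 134.3, 151.2, 169.7, 170.3]),
    LibraryEntry("B", 180.20, [14.2, 60.9, 115.3, 122.1, 131.9, 160.5, 166.8]),
    LibraryEntry("C", 310.40, [20.5, 121.0, 126.5, 131.0, 134.0, 151.0, 169.0, 170.0]),
]
observed = [21.0, 122.3, 126.0, 132.1, 134.5, 150.9, 169.9, 170.4]

print([name for _, _, name in search(library, observed, mw=180.16)])
print([name for _, _, name in search(library, observed)])
```

In this toy case the search with the MW returns only entry A, while the search without it also returns entry C, whose predicted shifts are similar but whose MW is very different. Raising `max_missing` or `max_extra` above zero corresponds to accepting hits with missing or extra peaks.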
Detailed results will be presented, alongside an optimized workflow that allows one to unambiguously find the correct structures in the database in all cases.
1. R.B. Williams, M. O’Neill-Johnson, A.J. Williams, P. Wheeler, R. Pol, A. Moser, Org. Biomol. Chem. 2015, 13, 9957–9962.
2. D. Argyropoulos, S. Golotvin, R. Pol, A. Moser, N. Ortel, S. Breinlinger, T. Chilzuk, T. Niedermeyer, “Efficient Dereplication of Natural Products Using Predicted 13C Spectra”, presented at the 60th ENC, Poster 004, 2019.
3. H. Kalchhauser, W. Robien, J. Chem. Inf. Comput. Sci. 1985, 25 (2), 103–108.
4. R. Reher, H. W. Kim, C. Zhang et al., J. Am. Chem. Soc. 2020, 142 (9), 4114–4120.