Benchmarking of different software for 2D NMR spectra automatic integration for metabolomic approaches
Comparison of different software tools for the automatic integration of 2D NMR spectra for metabolomic approaches
Abstract
NMR-based metabolomic studies are mostly performed with proton 1D liquid-state NMR, using well-established protocols for (bio)fluids or extracts. Proton 1D NMR is rapid and robust but can be limited by extensive signal overlap, which may impair the accurate identification and quantification of biomarkers.
To overcome this limitation, 2D NMR experiments have proved useful for improving resolution and metabolite identification. Fast 2D NMR pulse sequences (ultrafast and non-uniform sampling approaches) have already shown great potential for metabolomic studies when combined with bucketing approaches and statistical analysis [1][2].
The manual integration currently used in 2D NMR is not feasible for large-scale metabolomics, because it is time-consuming and compromises reproducibility.
Although optimized software tools already exist for bucketing 1D NMR spectra, the available tools for automatic integration of 2D NMR spectra have limitations (format, size of spectra, number of peaks detected, false positives, etc.) and must be optimized for large-scale metabolomics.
In order to choose and optimize the most suitable tool for metabolomics, six tools available in the literature were selected and tested on experimental 2D NMR spectra against several criteria (ease of use, format of NMR data, time, number of peaks detected, interactive interface, etc.):
• DEEP Picker [3]: C/C++ online tool.
• rNMR [4]: R package with an interactive user interface.
• Specmine [5]: R package.
• MVAPACK [6][7]: GNU Octave tool.
• JASON (commercial tool from JEOL).
• PeakViewer: home-built MATLAB program with an interactive interface.
Our analysis focused mainly on the peak-picking step. For each tool, we tested and examined how noise is handled (for example with a minimal peak-height cutoff or a Savitzky-Golay filter) and which peak-picking method is used: local maxima, deep learning, or less common approaches such as wavelet-based detection.
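To make the simplest of these strategies concrete, the following is a minimal sketch of local-maxima peak picking with a minimal peak-height cutoff. It is an illustration only and not the implementation of any of the six tools: the function names (`pick_local_maxima`, `synthetic_spectrum`), the synthetic two-peak spectrum with a small deterministic ripple standing in for noise, and the cutoff value are all assumptions made for the example.

```python
import math


def pick_local_maxima(spectrum, min_height):
    """Return (row, col, intensity) for every point that is at least
    min_height and strictly greater than all of its existing neighbours
    (8-connected neighbourhood, fewer at the edges)."""
    n_rows, n_cols = len(spectrum), len(spectrum[0])
    peaks = []
    for i in range(n_rows):
        for j in range(n_cols):
            value = spectrum[i][j]
            if value < min_height:
                continue  # below the peak-height cutoff: treated as noise
            neighbours = [
                spectrum[i + di][j + dj]
                for di in (-1, 0, 1)
                for dj in (-1, 0, 1)
                if (di or dj)
                and 0 <= i + di < n_rows
                and 0 <= j + dj < n_cols
            ]
            if all(value > nb for nb in neighbours):
                peaks.append((i, j, value))
    return peaks


def synthetic_spectrum(n_rows=32, n_cols=32):
    """Hypothetical test spectrum: two Gaussian peaks plus a small
    deterministic ripple that plays the role of noise."""
    centres = [((8, 10), 100.0), ((20, 24), 60.0)]
    sigma = 2.0
    grid = []
    for i in range(n_rows):
        row = []
        for j in range(n_cols):
            signal = sum(
                height
                * math.exp(-((i - ci) ** 2 + (j - cj) ** 2) / (2 * sigma ** 2))
                for (ci, cj), height in centres
            )
            ripple = 0.5 * math.sin(1.7 * i) * math.cos(2.3 * j)
            row.append(signal + ripple)
        grid.append(row)
    return grid


spectrum = synthetic_spectrum()
with_cutoff = pick_local_maxima(spectrum, min_height=10.0)
without_cutoff = pick_local_maxima(spectrum, min_height=0.0)
print([(i, j) for i, j, _ in with_cutoff])
print(len(without_cutoff), "candidate peaks without a useful cutoff")
```

With the cutoff, only the two simulated peak positions are returned; with the cutoff set to zero, ripple maxima are reported as additional peaks, which illustrates why the noise-management step matters as much as the detection rule itself.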
Comparing the time needed for each method to run on small datasets highlighted the simplicity of the local maxima approach. However, methods based on deep learning or wavelet detection appeared to give more accurate results when compared with manual peak picking performed by an expert.
This benchmark helps to select the most suitable methods for peak picking and integration of 2D NMR spectra, and to guide the development of an optimized tool for these processing steps.
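One possible way to quantify such a comparison with an expert's manual peak list is sketched below. It assumes that peaks are given as (F2 ppm, F1 ppm) pairs and that a picked peak counts as correct when it falls within a tolerance window of a not-yet-matched reference peak. The function name `match_peaks`, the tolerance values, and the peak coordinates are hypothetical and chosen only for illustration; the abstract does not specify how agreement was scored.

```python
def match_peaks(picked, reference, tol_f2=0.02, tol_f1=0.2):
    """Greedy one-to-one matching of picked peaks to reference peaks.

    Peaks are (f2_ppm, f1_ppm) tuples. Returns the number of matched
    peaks, the precision (matched / picked) and the recall
    (matched / reference)."""
    unmatched = list(reference)
    matched = 0
    for f2, f1 in picked:
        candidates = [
            ref
            for ref in unmatched
            if abs(ref[0] - f2) <= tol_f2 and abs(ref[1] - f1) <= tol_f1
        ]
        if candidates:
            # Keep the closest reference peak, scaled by the tolerances.
            best = min(
                candidates,
                key=lambda ref: ((ref[0] - f2) / tol_f2) ** 2
                + ((ref[1] - f1) / tol_f1) ** 2,
            )
            unmatched.remove(best)
            matched += 1
    precision = matched / len(picked) if picked else 0.0
    recall = matched / len(reference) if reference else 0.0
    return matched, precision, recall


# Hypothetical coordinates, for illustration only.
expert_peaks = [(1.33, 22.9), (3.05, 56.8), (4.11, 71.2)]
auto_peaks = [(1.335, 22.95), (3.04, 56.7), (5.20, 95.0), (0.90, 15.0)]

matched, precision, recall = match_peaks(auto_peaks, expert_peaks)
print(matched, precision, round(recall, 3))
```

Precision penalizes false positives and recall penalizes missed peaks, so reporting both separates a method that over-detects from one that is too conservative.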
Domains
Computer Science [cs]
Origin: Files produced by the author(s)