TY - JOUR
T1 - Unraveling the Linkages between Molecular Abundance and Stable Carbon Isotope Ratio in Dissolved Organic Matter Using Machine Learning
AU - Yi, Yuanbi
AU - Liu, Tongcun
AU - Merder, Julian
AU - He, Chen
AU - Bao, Hongyan
AU - Li, Penghui
AU - Li, Siliang
AU - Shi, Quan
AU - He, Ding
PY - 2023
Y1 - 2023/11/21
N2 - Dissolved organic matter (DOM) is a complex mixture of molecules that constitutes one of the largest reservoirs of organic matter on Earth. While stable carbon isotope values (δ13C) provide valuable insights into DOM transformations from land to ocean, it remains unclear how individual molecules respond to changes in DOM properties such as δ13C. To address this, we employed Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR MS) to characterize the molecular composition of DOM in 510 samples from the China Coastal Environments, with 320 samples having δ13C measurements. Utilizing a machine learning model based on 5199 molecular formulas, we predicted δ13C values with a mean absolute error (MAE) of 0.30‰ on the training data set, surpassing traditional linear regression methods (MAE 0.85‰). Our findings suggest that degradation processes, microbial activities, and primary production regulate DOM along the river-to-ocean continuum. Additionally, the machine learning model accurately predicted δ13C values in samples without known δ13C values and in other published data sets, reflecting the δ13C trend along the land-to-ocean continuum. This study demonstrates the potential of machine learning to capture the complex relationships between DOM composition and bulk parameters, particularly as learning data sets grow and molecular research increases in the future. © 2023 American Chemical Society.
KW - DOM
KW - FT-ICR MS
KW - machine learning
KW - stable carbon isotope
KW - the China Coastal Environments
UR - http://www.scopus.com/inward/record.url?scp=85154043307&partnerID=8YFLogxK
UR - https://www.scopus.com/record/pubmetrics.uri?eid=2-s2.0-85154043307&origin=recordpage
U2 - 10.1021/acs.est.3c00221
DO - 10.1021/acs.est.3c00221
M3 - RGC 21 - Publication in refereed journal
C2 - 37079797
SN - 0013-936X
VL - 57
SP - 17900
EP - 17909
JO - Environmental Science and Technology
JF - Environmental Science and Technology
IS - 46
ER -