Marine Science Faculty Publications

A Machine Learning Approach to Estimate Surface Ocean pCO2 from Satellite Measurements

Keywords

Surface pCO2, SST, SSS, Chlorophyll, Kd, Satellite remote sensing, Gulf of Mexico

Surface seawater partial pressure of CO2 (pCO2) is a critical parameter in quantifying air-sea CO2 flux, which in turn plays an important role in quantifying the global carbon budget and understanding ocean acidification. Yet remote estimation of pCO2 in coastal waters, which are influenced by multiple processes, has been difficult because of the complex relationships between environmental variables and surface pCO2. To date, there is no unified model for remotely estimating surface pCO2 in oceanic regions that are dominated by different oceanic processes. In our study area, the Gulf of Mexico (GOM), this challenge is addressed by evaluating different approaches, including multi-linear regression (MLR), multi-nonlinear regression (MNR), principal component regression (PCR), decision tree, support vector machines (SVMs), multilayer perceptron neural network (MPNN), and random forest based regression ensemble (RFRE). After modeling, validation, and extensive tests using independent cruise datasets, the RFRE model proved to be the best approach. The RFRE model was trained using extensive pCO2 datasets (collected over 16 years by many groups) together with MODIS (Moderate Resolution Imaging Spectroradiometer) estimates of sea surface temperature (SST), sea surface salinity (SSS), surface chlorophyll concentration (Chl), and the diffuse attenuation coefficient of downwelling irradiance (Kd). This RFRE-based pCO2 model allows surface pCO2 to be estimated from satellites at a spatial resolution of ~1 km. For pCO2 ranging between 145 and 550 μatm, the model achieved an overall root mean square difference (RMSD) of 9.1 μatm, a coefficient of determination (R²) of 0.95, a mean bias (MB) of −0.03 μatm, a mean ratio (MR) of 1.00, an unbiased percentage difference (UPD) of 0.07%, and a mean ratio difference (MRD) of 0.12%.
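As an illustration of how summary statistics of this kind can be computed, the following is a minimal Python sketch, not the authors' code. The abstract does not give the formulas, so the definitions here are assumptions: RMSD as the root of the mean squared difference, MB as the mean of (estimated minus observed), MR as the mean of (estimated divided by observed), and R² as the squared Pearson correlation. UPD and MRD are omitted because their exact definitions are not stated. The function names and sample values are hypothetical.

```python
import math


def rmsd(est, obs):
    """Root mean square difference between estimated and observed values."""
    return math.sqrt(sum((e - o) ** 2 for e, o in zip(est, obs)) / len(obs))


def mean_bias(est, obs):
    """Mean of (estimated - observed); assumed definition of MB."""
    return sum(e - o for e, o in zip(est, obs)) / len(obs)


def mean_ratio(est, obs):
    """Mean of (estimated / observed); assumed definition of MR."""
    return sum(e / o for e, o in zip(est, obs)) / len(obs)


def r_squared(est, obs):
    """Squared Pearson correlation; one common (assumed) definition of R^2."""
    n = len(obs)
    mean_e = sum(est) / n
    mean_o = sum(obs) / n
    cov = sum((e - mean_e) * (o - mean_o) for e, o in zip(est, obs))
    var_e = sum((e - mean_e) ** 2 for e in est)
    var_o = sum((o - mean_o) ** 2 for o in obs)
    return (cov * cov) / (var_e * var_o)


# Hypothetical values in micro-atmospheres, for illustration only.
observed = [350.0, 400.0, 450.0]
estimated = [355.0, 395.0, 450.0]

print("RMSD:", rmsd(estimated, observed))
print("MB:  ", mean_bias(estimated, observed))
print("MR:  ", mean_ratio(estimated, observed))
print("R^2: ", r_squared(estimated, observed))
```

Whether R² is taken as the squared correlation or as one minus the ratio of residual to total variance can change the reported value, so the definition should be checked against the full paper before comparing numbers.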
The model, with its original parameterization, has been tested with independent datasets collected over the entire GOM, with satisfactory performance in each case (RMSD of ≤~10 μatm for open GOM waters and RMSD of ≤~25 μatm for coastal and river-dominated waters). The sensitivity of the RFRE-based pCO2 model to uncertainties in each input environmental variable was also thoroughly examined. The results showed that all induced uncertainties were close to, or within, the uncertainty of the model itself, with higher sensitivity to uncertainties in SST and SSS than to uncertainties in Chl and Kd. The extensive validation, evaluation, and sensitivity analysis indicate the robustness of the RFRE model in estimating surface pCO2 over the range of 145–550 μatm in most GOM waters. The RFRE approach was also applied, with local model training, to the Gulf of Maine, an oceanic region that contrasts with the GOM. The results showed significant improvement over other models, suggesting that the RFRE may serve as a robust approach for other regions once sufficient field-measured pCO2 data are available for model training.
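The kind of input-sensitivity test described above can be sketched as follows. This is a hedged illustration only: the abstract does not describe the perturbation scheme, so the Gaussian additive noise, the function `sensitivity_rmsd`, the sample inputs, and the linear `toy_pco2_model` (a hypothetical stand-in with made-up coefficients, not the RFRE model) are all assumptions.

```python
import math
import random


def toy_pco2_model(sst, sss, chl, kd):
    """Hypothetical linear stand-in for a trained pCO2 model (NOT the RFRE)."""
    return 380.0 + 4.0 * (sst - 25.0) + 3.0 * (sss - 35.0) - 10.0 * chl - 20.0 * kd


def sensitivity_rmsd(model, samples, var_index, sigma, n_trials=200, seed=0):
    """RMSD between baseline predictions and predictions made after adding
    zero-mean Gaussian noise (std = sigma) to one input variable."""
    rng = random.Random(seed)
    squared_sum = 0.0
    count = 0
    for sample in samples:
        baseline = model(*sample)
        for _ in range(n_trials):
            perturbed = list(sample)
            perturbed[var_index] += rng.gauss(0.0, sigma)
            squared_sum += (model(*perturbed) - baseline) ** 2
            count += 1
    return math.sqrt(squared_sum / count)


# Hypothetical (SST in deg C, SSS, Chl in mg m^-3, Kd in m^-1) input tuples.
samples = [(24.0, 36.0, 0.2, 0.05), (28.0, 34.5, 0.5, 0.08), (30.0, 33.0, 1.0, 0.12)]

# Assumed input uncertainties, chosen only to exercise the function.
for name, index, sigma in [("SST", 0, 1.0), ("SSS", 1, 1.0), ("Chl", 2, 0.1), ("Kd", 3, 0.01)]:
    induced = sensitivity_rmsd(toy_pco2_model, samples, index, sigma)
    print(f"{name}: induced RMSD = {induced:.2f} uatm")
```

With a real model, the induced RMSD for each variable would be compared against the model's own RMSD to judge whether input uncertainty dominates the overall error.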

Citation / Publisher Attribution

Remote Sensing of Environment, v. 228, p. 203-226