In [1]:
# Import dependencies
import pandas as pd
import numpy as np
%matplotlib inline
import matplotlib.pyplot as plt
from IPython.display import Markdown, display
from stats_helper import *
def dis_res(x):
    display(Markdown('___\n##### **Result**: \n\n' + x + '\n___'))

Estimating the global mass of Rubisco

In the notebook 02_leaf_mass_estimate.ipynb, we estimated the total dry mass of leaves at ≈30 Gt. In this notebook we estimate the total mass of Rubisco from the total mass of leaves and the mass fraction of Rubisco in leaves. We rely on a recent meta-analysis by Onoda et al., which reports the leaf nitrogen content per leaf mass (Nmass) and the fraction of leaf nitrogen found in Rubisco (Nrub/N) for about a hundred plant species. Here is a sample of the data:

In [2]:
onoda = pd.read_excel('../data/literature_data.xlsx','Onoda')
onoda.head()
Out[2]:
| | no | Dataset | Species | Family | Growth condition | Pot/Field | Location/treatment | GF | EveDec | Aarea | ... | Rubisco_area | Nrub/N | CWarea | CWmass | Nconc_in_CW | Ncw_area | Ncw/N | CW extraction | CW-N method | Ref |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | Feng et al. (2009) | Ageratina adenophora | Asteraceae | Outdoor | Field | Mexico | H | D | 12.72 | ... | NaN | NaN | NaN | NaN | NaN | 0.110 | 0.093 | SDS | Ninhidrin? | Feng, Y.L., Lei, Y.B., Wang, R.F., Callaway, R... |
| 1 | 2 | Feng et al. (2009) | Ageratina adenophora | Asteraceae | Outdoor | Field | China | H | D | 14.91 | ... | NaN | NaN | NaN | NaN | NaN | 0.037 | 0.035 | SDS | Ninhidrin? | Feng, Y.L., Lei, Y.B., Wang, R.F., Callaway, R... |
| 2 | 3 | Feng et al. (2009) | Ageratina adenophora | Asteraceae | Outdoor | Field | India | H | D | 16.63 | ... | NaN | NaN | NaN | NaN | NaN | 0.086 | 0.064 | SDS | Ninhidrin? | Feng, Y.L., Lei, Y.B., Wang, R.F., Callaway, R... |
| 3 | 4 | Funk et al (2013) | Acacia koa | Fabaceae | Outdoor | Field | Hawaii (native) | W | E | 14.30 | ... | NaN | NaN | NaN | NaN | NaN | 0.086 | 0.036 | SDS | Ninhidrin | Funk, J.L., Glenwinkel, L.A. & Sack, L. (2013)... |
| 4 | 5 | Funk et al (2013) | Dodonaea viscosa | Sapindaceae | Outdoor | Field | Hawaii (native) | W | E | 10.90 | ... | NaN | NaN | NaN | NaN | NaN | 0.232 | 0.100 | SDS | Ninhidrin | Funk, J.L., Glenwinkel, L.A. & Sack, L. (2013)... |

5 rows × 31 columns

The dataset contains species of several growth forms, which we group into woody plants and herbaceous C3 plants. For each group, we calculate the mass fraction of Rubisco out of the total leaf mass in two steps. First, we multiply the nitrogen content per leaf mass by the fraction of leaf nitrogen in Rubisco, which gives the mass of Rubisco nitrogen per leaf mass. Second, we convert Rubisco nitrogen to Rubisco mass using the fact that nitrogen accounts for about a sixth of the mass of Rubisco, i.e. we multiply by 6. We plot below the distribution of the mass fraction of Rubisco per leaf mass for each growth form:

In [3]:
# Drop rows that lack either Nrub/N or Nmass; copy so that new columns can be added safely
filt_onoda = onoda.dropna(subset=['Nrub/N','Nmass']).copy()

# Calculate the mass fraction of rubisco per leaf mass
filt_onoda['Rub_frac'] = filt_onoda['Nmass']/100*filt_onoda['Nrub/N']*6

# Convert mass fraction to log scale for plotting histograms
filt_onoda['log_rub_frac'] = np.log10(filt_onoda['Rub_frac'])

# Merge the 'G' growth form into the herbaceous ('H') group
filt_onoda.loc[filt_onoda['GF'] == 'G','GF'] = 'H'

fig, ax = plt.subplots(nrows=1, ncols=2, sharex=True, sharey=True, figsize=(12, 6))
ticks = [0.25,0.5,1,2,4,8,16]


ax1 = plt.subplot(1,2,1)

ax1.tick_params(labelsize=12)
filt_onoda.loc[filt_onoda['GF'] == 'W','log_rub_frac'].hist(bins=np.log10([1e-100,0.0025,0.005,0.01,0.02,0.04,0.08,0.16,1e100]),color='#234d20',ax=ax1)
plt.vlines(filt_onoda.loc[filt_onoda['GF'] == 'W','log_rub_frac'].mean(),0,30,color='k',linestyles='dashed')
ax1.set_xticks(np.log10([0.0025,0.005,0.01,0.02,0.04,0.08,0.16]))
ax1.set_xticklabels(ticks)
ax1.set_title('Woody plants',fontsize=15,color='#234d20')
plt.grid(False)
ax2 = plt.subplot(1,2,2,sharex=ax1,sharey=ax1)
ax2.tick_params(labelsize=12)
filt_onoda.loc[filt_onoda['GF'] == 'H','log_rub_frac'].hist(bins=np.log10([1e-100,0.0025,0.005,0.01,0.02,0.04,0.08,0.16,1e100]),color='#299E4A',ax=ax2)
plt.vlines(filt_onoda.loc[filt_onoda['GF'] == 'H','log_rub_frac'].mean(),0,30,color='k',linestyles='dashed')
ax2.set_title('C3 Herbaceous plants',fontsize=15,color='#299E4A')

plt.ylim([0,30])
plt.xlim(np.log10([0.002,0.2]))
fig.text(0.5, 0.04, '% mass of rubisco out of leaf mass', ha='center',fontsize=15)
fig.text(0.07, 0.5, 'Number of species', va='center', rotation='vertical',fontsize=15)
plt.grid(False)
plt.savefig('../figures/figure2_20180927.png',dpi=600)
plt.savefig('../figures/figure2_20180927b.pdf',dpi=600)
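
The conversion used in the cell above can be checked by hand. The following is a minimal sketch for a single hypothetical leaf; the nitrogen content and the Rubisco share of nitrogen are illustrative values, not rows of the Onoda et al. dataset, and the variable names are introduced here for the example only.

```python
# Hypothetical leaf, for illustration only (not a row of the Onoda dataset)
n_mass_percent = 2.0            # leaf nitrogen, % of leaf dry mass
n_rub_over_n = 0.20             # fraction of leaf nitrogen found in Rubisco
N_FRACTION_OF_RUBISCO = 1 / 6   # nitrogen is about a sixth of the mass of Rubisco

# g Rubisco nitrogen per g leaf dry mass
rubisco_n_per_leaf_mass = n_mass_percent / 100 * n_rub_over_n

# g Rubisco per g leaf dry mass (dividing by 1/6 is the "*6" in the cell above)
rub_frac_example = rubisco_n_per_leaf_mass / N_FRACTION_OF_RUBISCO

print(f'{rub_frac_example * 100:.1f}% of leaf dry mass is Rubisco')
```

For these inputs the Rubisco mass fraction comes out at a few percent of leaf dry mass, which is the range covered by the histogram ticks.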

In addition, we calculate the characteristic Rubisco content in the leaves of C4 plants. We calculate the geometric mean of the nitrogen content per unit leaf mass in C4 species from the Glopnet database (http://bio.mq.edu.au/~iwright/glopian.htm), and multiply it by the Rubisco content per unit leaf nitrogen reported in several published sources. We then take the geometric mean of the resulting values:

In [4]:
# Load the glopnet database
glop = pd.read_excel('../data/literature_data.xlsx','glopnet_data',skiprows=1)

# Filter only C4 species
mask =  (glop.C3C4=='C4')

# Calculate the geometric mean nitrogen content (the column stores log10 values)
mean_N_C4 = 10**(glop.loc[mask,'log Nmass'].mean())

# Load data from the literature
rub_frac_C4 = pd.read_excel('../data/literature_data.xlsx','C4 rubisco content')

# Rubisco mass per leaf dry mass, in percent (converted to a fraction further below)
rub_frac_C4_mean = gmean(rub_frac_C4['Rubisco N/leaf N ']*6/100*mean_N_C4)
print('Our best estimate for the mass fraction of rubisco out of the total leaf dry mass in C4 plants is ≈%.1f%%' % rub_frac_C4_mean)
Our best estimate for the mass fraction of rubisco out of the total leaf dry mass in C4 plants is ≈0.8%
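
The `gmean` function used above comes from the imports at the top of the notebook, and its implementation is not shown here. Assuming it is a standard geometric mean (as in `scipy.stats.gmean`), it can be sketched in NumPy as follows; `geometric_mean` is a hypothetical name used only for this illustration.

```python
import numpy as np

def geometric_mean(values):
    """Geometric mean: exponentiate the arithmetic mean of the logs."""
    values = np.asarray(values, dtype=float)
    return float(np.exp(np.log(values).mean()))

# Illustrative values only
print(geometric_mean([1.0, 4.0, 16.0]))   # 4.0 up to floating-point rounding
```

The geometric mean is less sensitive to a few very high values than the arithmetic mean, which suits quantities that vary over a wide multiplicative range.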

To estimate the mean fraction of Rubisco out of the dry leaf mass, we calculate the geometric mean of the Rubisco content per dry leaf mass separately for woody plants, C3 herbs and C4 herbs. In the notebook 02_leaf_mass_estimate.ipynb we estimated that leaves of C3 herbaceous plants account for about 25% of the total leaf mass and leaves of C4 herbs for ≈9%, which leaves ≈66% for woody plants. Our best estimate for the fraction of Rubisco out of the dry leaf mass is therefore the average of the three group values, weighted by these shares.

In [5]:
# Calculate the geometric mean fraction of Rubisco in each growth form
rub_frac_GF_mean = filt_onoda.groupby('GF')['Rub_frac'].apply(gmean)

# Add the mean Rubisco fraction of C4 plants, converted from percent to a fraction
rub_frac_GF_mean.loc['C4'] = rub_frac_C4_mean/100

# Calculate the weighted average of the Rubisco fraction across groups.
# The weights follow the order of the index: 'H' (C3 herbs), 'W' (woody), 'C4'
best_rub_frac = np.average(rub_frac_GF_mean, weights=[0.25,0.66,0.09])
dis_res('Our best estimate for the average fraction of Rubisco out of the total leaf mass is %.1f percent' %(best_rub_frac*100))

Result:

Our best estimate for the average fraction of Rubisco out of the total leaf mass is 2.3 percent
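
`np.average` pairs values and weights by position, so the weights in the cell above depend on the order of the groups in `rub_frac_GF_mean`. A minimal sketch of the same weighted average with the pairing made explicit through index labels is shown below. The per-group Rubisco fractions are placeholders rather than the values computed in this notebook, and the variable names are hypothetical.

```python
import pandas as pd

# Placeholder Rubisco mass fractions per group (illustrative, not the notebook's values)
rub_frac_by_group = pd.Series({'H': 0.03, 'W': 0.02, 'C4': 0.008})

# Share of total leaf mass per group: C3 herbs 25%, C4 herbs 9%, woody plants the rest
leaf_mass_share = pd.Series({'W': 0.66, 'C4': 0.09, 'H': 0.25})

# Multiplication aligns on the index labels, so the ordering no longer matters
weighted_rub_frac = (rub_frac_by_group * leaf_mass_share).sum() / leaf_mass_share.sum()
print(f'{weighted_rub_frac * 100:.2f}% of leaf dry mass')
```

Label-based alignment guards against silently assigning a weight to the wrong group if the group order ever changes.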


We now use this mean fraction to estimate the total mass of rubisco:

In [6]:
# Our best estimate for the total leaf mass is ≈32 Gt, expressed here in grams (1 Gt = 1e15 g)
best_leaf_mass = 32e15

# Calculate the total mass of rubisco
tot_rub_mass = best_rub_frac*best_leaf_mass

dis_res('Our best estimate for the total mass of rubisco is ≈%.1f Gt' %(tot_rub_mass/1e15))

Result:

Our best estimate for the total mass of rubisco is ≈0.7 Gt


Estimating the total mass of marine rubisco

To estimate the total mass of Rubisco proteins in the marine environment, we rely on the estimate made in Bar-On et al. for the total mass of marine producers. Bar-On et al. estimate the total mass of marine producers at ≈1 Gt C. We assume carbon accounts for ≈50% of the dry weight of marine producers, and that proteins also account for ≈50% of the dry weight of marine producers, so we estimate the total mass of proteins in marine producers at ≈1 Gt. To estimate the mass of Rubisco out of the total mass of proteins in marine producers, we use data from several different sources on the fraction of Rubisco out of the total proteome of several different marine producer species. Here is a sample of the data:

In [7]:
marine_proteome_mass = 1e15 #Our estimate for the total mass of proteins in marine producers is 1 Gt

marine_rubisco_frac = pd.read_excel('../data/literature_data.xlsx','marine_rubisco_content')
marine_rubisco_frac.head()
Out[7]:
| | Species | Taxonomic group | Method | Group | Nutrients | CO2 | Growth rate [d^-1] | Rubisco fraction [%] | Reference | Remarks |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Thalassiosira weissflogii | Diatoms | Quantitative Western blot | Morel | Exponential | Ambient | 1.2 | 2.5 | https://doi.org/10.1111/nph.12143 | Table 1 |
| 1 | Thalassiosira oceanica | Diatoms | Quantitative Western blot | Morel | Exponential | Ambient | 1.4 | 2.0 | https://doi.org/10.1111/nph.12143 | Table 1 |
| 2 | Skeletonema costatum | Diatoms | Quantitative Western blot | Morel | Exponential | Ambient | 1.5 | 1.4 | https://doi.org/10.1111/nph.12143 | Table 1 |
| 3 | Chaetoceros muelleri | Diatoms | Quantitative Western blot | Morel | Exponential | Ambient | 1.3 | 3.7 | https://doi.org/10.1111/nph.12143 | Table 1 |
| 4 | Phaeodactylum tricornutum | Diatoms | Quantitative Western blot | Morel | Exponential | Ambient | 1.5 | 3.2 | https://doi.org/10.1111/nph.12143 | Table 1 |

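We first calculate the geometric mean fraction of Rubisco for each group of phytoplankton, using only measurements made at ambient CO2 levels. Haptophytes and dinoflagellates are then pooled into a single group by averaging their two group means: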

In [8]:
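# Keep only measurements made at ambient (or near-ambient) CO2
marine_rub_frac_filt = marine_rubisco_frac[marine_rubisco_frac['CO2'].isin(['Ambient',390,396,380])]
# Geometric mean of the Rubisco proteome fraction within each taxonomic group
mean_marine_rub_frac = marine_rub_frac_filt.groupby('Taxonomic group').apply(lambda x: gmean(x['Rubisco fraction [%]']))
# Pool haptophytes and dinoflagellates using the arithmetic mean of the two group means
mean_marine_rub_frac.loc['Haptophyte & Dinoflagellate'] = mean_marine_rub_frac.loc[['Haptophyte', 'Dinoflagellate']].mean()
mean_marine_rub_frac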
Out[8]:
Taxonomic group
Chlorophyte                    2.179256
Cyanobacteria                  1.117836
Diatoms                        1.970404
Dinoflagellate                 0.671100
Euglenozoa                     5.600000
Haptophyte                     4.152815
Haptophyte & Dinoflagellate    2.411957
dtype: float64

We rely on data from Bar-On et al. for the biomass of each taxonomic group. We calculate the mean fraction of Rubisco as the average of the group means, weighted by the relative biomass of each group.

In [9]:
# Load data on the biomass of different marine autotrophs
phyto_biomass = pd.read_excel('../data/literature_data.xlsx','marine_phytoplankton_biomass',skiprows=1)

# Calculate the biomass fraction of each group of autotrophs that we calculated a mean Rubisco proteome fraction for 
biomass_frac = phyto_biomass.groupby('Rubisco group')['Biomass [Gt C]'].sum()
biomass_frac = biomass_frac/biomass_frac.sum()

# Calculate the average Rubisco proteome fraction using the biomass fraction of each group of autotrophs
best_marine_rubisco_frac = biomass_frac.mul(mean_marine_rub_frac).sum()
dis_res('Our best estimate for the fraction of Rubisco out of the proteome of marine producers is %.0f' %best_marine_rubisco_frac + '%')

Result:

Our best estimate for the fraction of Rubisco out of the proteome of marine producers is 3%
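
The weighted mean above relies on `Series.mul` aligning the two series by their index labels. A minimal sketch of that behaviour follows. The group means are rounded from Out[8], while the biomass shares are hypothetical and chosen only for illustration, so the printed total is not the notebook's estimate.

```python
import pandas as pd

# Group means rounded from Out[8] (% of the proteome)
mean_frac = pd.Series({'Cyanobacteria': 1.12, 'Diatoms': 1.97,
                       'Haptophyte & Dinoflagellate': 2.41, 'Euglenozoa': 5.60})

# Hypothetical biomass shares, for illustration only
biomass_share = pd.Series({'Cyanobacteria': 0.3, 'Diatoms': 0.4,
                           'Haptophyte & Dinoflagellate': 0.3})

product = biomass_share.mul(mean_frac)
print(product)          # 'Euglenozoa' has no biomass share, so its product is NaN
print(product.sum())    # sum() skips NaN by default
```

Groups present in only one of the two series drop out of the sum. If a group with a biomass share had no Rubisco measurement, the remaining weights would no longer sum to one and the weighted mean would be biased low, so it is worth checking that the labels match.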


We multiply the total mass of proteins in marine producers by the characteristic fraction of Rubisco out of the total mass of proteins to estimate the total mass of marine Rubisco:

In [10]:
tot_marine_rub_mass = marine_proteome_mass*best_marine_rubisco_frac/100
dis_res('Our best estimate for the total mass of marine Rubisco proteins is ≈%.2f Gt' %(tot_marine_rub_mass/1e15))

Result:

Our best estimate for the total mass of marine Rubisco proteins is ≈0.03 Gt
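
The chain of conversions behind this number can be written out step by step. The sketch below uses the rounded values quoted in the text (≈1 Gt C, ≈50% carbon and ≈50% protein per dry weight, ≈3% Rubisco in the proteome); the variable names are introduced here for illustration.

```python
GT_IN_GRAMS = 1e15

marine_carbon_gt = 1.0        # Gt C of marine producers (Bar-On et al., as quoted above)
carbon_per_dry_mass = 0.5     # assumed carbon fraction of dry weight
protein_per_dry_mass = 0.5    # assumed protein fraction of dry weight
rubisco_per_protein = 0.03    # ≈3% of the proteome, rounded from the result above

dry_mass_gt = marine_carbon_gt / carbon_per_dry_mass      # Gt dry weight
protein_gt = dry_mass_gt * protein_per_dry_mass           # Gt protein
rubisco_gt = protein_gt * rubisco_per_protein             # Gt Rubisco

print(f'protein: {protein_gt:.1f} Gt, Rubisco: {rubisco_gt:.2f} Gt '
      f'({rubisco_gt * GT_IN_GRAMS:.1e} g)')
```

Because the carbon and protein fractions are both taken as ≈50%, they cancel, and the protein mass in Gt equals the carbon mass in Gt C.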


Uncertainty analysis

Terrestrial Rubisco

First, we project the uncertainty associated with our estimate of the mass fraction of Rubisco out of the total leaf dry mass. We then combine it with the uncertainty associated with our estimate of the total mass of leaves to arrive at our best projection of the uncertainty associated with our estimate of the total mass of Rubisco:

In [11]:
rub_frac_CI = mul_CI(filt_onoda['Rub_frac'])
dis_res('Our projection for the uncertainty associated with our estimate for the mass fraction of Rubisco out of the total leaf dry weight is ≈%.1f-fold' %rub_frac_CI)

Result:

Our projection for the uncertainty associated with our estimate for the mass fraction of Rubisco out of the total leaf dry weight is ≈2.5-fold


In the notebook 02_leaf_mass_estimate.ipynb we project an uncertainty of ≈2.2-fold associated with our estimate for the total mass of leaves. We combine this uncertainty with the uncertainty associated with our estimate of the fraction of Rubisco out of the dry mass of leaves:

In [12]:
rub_mass_CI = CI_prod_prop([rub_frac_CI,2.2])
dis_res('Our projection for the uncertainty associated with our estimate for the total mass of terrestrial Rubisco is ≈%.1f-fold' %rub_mass_CI)

Result:

Our projection for the uncertainty associated with our estimate for the total mass of terrestrial Rubisco is ≈3.4-fold
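
The helper `CI_prod_prop` is imported at the top of the notebook and its implementation is not shown here. One common way to combine multiplicative (fold) uncertainties of independent factors in a product is to add their logarithms in quadrature. The sketch below implements that approach under this assumption; `combine_fold_uncertainties` is a hypothetical name and may differ from what the helper actually does.

```python
import numpy as np

def combine_fold_uncertainties(fold_cis):
    """Combine fold uncertainties of independent factors in a product.

    Assumes the log-uncertainties add in quadrature (illustrative sketch only).
    """
    log_cis = np.log(np.asarray(fold_cis, dtype=float))
    return float(np.exp(np.sqrt(np.sum(log_cis ** 2))))

# Rounded inputs from the text: ≈2.5-fold (Rubisco fraction) and ≈2.2-fold (leaf mass)
print(f'{combine_fold_uncertainties([2.5, 2.2]):.2f}-fold')
```

With these rounded inputs the sketch gives a value close to the ≈3.4-fold reported above; the small difference is expected because the notebook uses the unrounded uncertainty of the Rubisco fraction.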


Marine Rubisco

To project the uncertainty associated with our estimate of the total mass of marine Rubisco, we combine the uncertainties associated with the estimates of the total protein mass of marine autotrophs and of the proteome fraction of Rubisco. Bar-On et al. estimated a total of about 1.3 Gt C of marine autotrophs. This value was compared against estimates based on remote sensing from Antoine et al. and Behrenfeld & Falkowski, who estimate ≈0.3-0.75 Gt C of phytoplankton. We use these three values (1.3, 0.3 and 0.75 Gt C) to derive our uncertainty projection for the protein mass of marine autotrophs. We expect the uncertainty of the conversion between carbon mass and protein mass to be smaller than the uncertainty of the biomass estimate itself, and therefore neglect it.

In [13]:
marine_prot_CI = mul_CI([1.3,0.3,0.75])
dis_res('Our projection for the uncertainty associated with our estimate for the total biomass of marine autotrophs is ≈%.1f-fold' %marine_prot_CI)

Result:

Our projection for the uncertainty associated with our estimate for the total biomass of marine autotrophs is ≈2.5-fold


To project the uncertainty associated with our estimate of the proteome fraction of Rubisco in marine autotrophs, we use the variability among the measurements within each taxonomic group to calculate the uncertainty of each group mean. We then propagate these uncertainties to the biomass-weighted mean fraction of Rubisco out of the proteome of marine autotrophs:

In [14]:
# Calculate the uncertainty of each group of marine autotrophs
mean_marine_rub_frac_CI = marine_rub_frac_filt.groupby('Taxonomic group').apply(lambda x: mul_CI(x['Rubisco fraction [%]']))

# Keep only the groups that have both a biomass fraction and an uncertainty estimate
group_CI = mean_marine_rub_frac_CI[biomass_frac.index].dropna()

# Propagate to the global mean of the proteome fraction of Rubisco
marine_proteome_frac_CI = CI_sum_prop(biomass_frac.loc[group_CI.index], group_CI)

dis_res('Our projection for the uncertainty associated with our estimate for the proteome fraction of Rubisco in marine autotrophs is ≈%.1f-fold' % marine_proteome_frac_CI)

Result:

Our projection for the uncertainty associated with our estimate for the proteome fraction of Rubisco in marine autotrophs is ≈2.4-fold


We combine the two uncertainties to generate our projection for the uncertainty associated with our estimate of the total mass of marine Rubisco:

In [15]:
marine_mass_CI = CI_prod_prop([marine_prot_CI,marine_proteome_frac_CI])
dis_res('Our projection for the uncertainty associated with our estimate for the total mass of marine Rubisco is ≈%.1f-fold' % marine_mass_CI)

Result:

Our projection for the uncertainty associated with our estimate for the total mass of marine Rubisco is ≈3.6-fold