1 Introduction

Life cycle assessment (LCA) calculates the environmental impact of a product or production process along the entire chain. Input parameters required to describe the production chain can be uncertain due to e.g. temporal variability or unknowns about the true value of emission factors. Uncertainty in the input parameters will cause an uncertainty around the outcome of an LCA. In this paper, uncertainty can refer to variability or epistemic uncertainty (Chen and Corson 2014; Clavreul et al. 2013) of the input parameters. Variability (e.g. natural, temporal, geographical) is inherent to natural systems and cannot be reduced. Epistemic uncertainty refers to unknowns in the system and can be reduced by gaining more knowledge about the system. Analysing this uncertainty can be done by means of a sensitivity analysis and can help to gain more insight into the robustness of the result, to prioritize data collection or to simplify an LCA model. Many LCA studies have been performed over the last decade, and interest in addressing uncertainty propagation is increasing (Groen et al. 2014; Heijungs and Lenzen 2014; Lloyd and Ries 2007). However, few studies apply a systematic sensitivity analysis to address the effect of input uncertainties on the output (Mutel et al. 2013). An explanation might be that ISO 14044 recommends a sensitivity analysis as part of the LCA framework to identify the importance of the input uncertainties, but does not recommend a specific technique.

A sensitivity analysis can be performed by varying an input parameter and, as such, determining the effect on the result. Furthermore, if the distribution function of the input parameters is known, it is possible to calculate the contribution to the output variance. The first approach belongs to the area of local sensitivity analysis. A local sensitivity analysis determines the effect of a (small) change in one of the input parameters at a time. The second approach belongs to the area of global sensitivity analysis. A global sensitivity analysis can be seen as an extension of uncertainty propagation: it determines how much each input parameter contributes to the output variance. While we acknowledge that global sensitivity analysis can be done in several ways, e.g. by density-based methods (Borgonovo and Plischke 2016), we focus in this article on methods that address the contribution to variance (CTV, see Saltelli et al. 2008) and not e.g. on moment-independent methods (as used by Cucurachi 2014). The main differences between a local and global sensitivity analysis are illustrated in Table 1. In this paper, we focus on global sensitivity analysis, which requires a case study of which the distribution functions of the input parameters are known.

Table 1 Main differences between local and global sensitivity analysis, described by differences in input data requirements and results

In Fig. 1, the procedure of a global sensitivity analysis is illustrated with a schematic LCA model, containing four input parameters. First, the input parameters and their uncertainties are represented by probability density functions (step 1). Second, uncertainty propagation is performed with e.g. Monte Carlo simulation, which propagates uncertainty through the LCA model (step 2) to obtain a distribution function of the output. Third, the variance of the output is calculated (step 3). After the uncertainty propagation is performed, a method for global sensitivity analysis is selected (step 4), which determines how much each input parameter contributes to the output variance (step 5). In the example of Fig. 1, the sensitivity analysis shows that parameter 1 and to a lesser extent parameter 2 are the ones that contribute most to the output variance.

Fig. 1 Illustration of global sensitivity analysis in LCA (based on Saltelli et al. 1999)

In LCA literature, five methods for global sensitivity analysis have been mentioned that quantify the contribution to output variance: (1) (squared standardized) regression coefficients, as was suggested by Huijbregts et al. (2001) and applied in LCA by e.g. Aktas and Bilec (2012), Basset-Mens et al. (2009), Sugiyama et al. (2005) and Vigne et al. (2012); (2) the (squared) Pearson correlation coefficient (Heijungs and Lenzen 2014; Onat et al. 2014); (3) the (squared) Spearman (rank) correlation coefficient (Chen and Corson 2014; Geisler et al. 2005; Heijungs and Lenzen 2014; Mattila et al. 2012; Mattinen et al. 2014; Sonnemann et al. 2003; Wang and Shen 2013); (4) key issue analysis, which applies a first-order Taylor expansion of the LCA model to estimate the output variance, thus avoiding sampling, and which was developed for LCA by Heijungs (1996) and applied by e.g. Heijungs et al. (2005) and Jung et al. (2014); and (5) the Fourier amplitude sensitivity test, which has been applied by de Koning et al. (2010).

Outside the LCA domain, a much wider set of approaches has been developed and applied, such as random balance design and the Sobol’ method, that also quantify the contribution to output variance (Saltelli et al. 2008; Sobol’ 2001; Tarantola et al. 2012; Tarantola et al. 2006). The random balance design is closely related to the Fourier amplitude sensitivity test, but the sample size does not depend on the number of parameters (Saltelli et al. 2008), and it is therefore considered to be more suitable for LCA (of which a typical case study contains at least a few hundred input parameters). To our knowledge, random balance design has not yet been applied in LCA. The application of the Sobol’ method in LCA has been limited to a few studies (see for example Wei et al. 2014) and to a characterization model that can be applied in LCA (Cucurachi 2014). We selected the global sensitivity methods that are used most often in LCA (i.e. the squared standardized regression coefficient, squared Spearman (rank) correlation coefficient and key issue analysis). Because these methods were all moment-dependent, we added two moment-dependent methods from outside the LCA field that are currently recommended in the sensitivity analysis field: random balance design (and its corresponding sensitivity index) and the Sobol’ indices (Saltelli et al. 2008), which allowed for a quantitative comparison between the selected methods.

For most of the methods, it is not known under which conditions they perform optimally or if there is a method that performs better than the other methods in LCA. The aim of this study is twofold: (1) to study the applicability of a number of previously suggested methods for global sensitivity analysis to LCA, with a focus on the inventory stage, and (2) to compare the methods based on e.g. their ability to explain the output variance. To be able to compare the performance of global sensitivity methods, two case studies were constructed: one small hypothetical case study describing electricity production that was sensitive to a small change in the input parameters and a large case study describing a production system of a northeast Atlantic fishery.

2 Methods for global sensitivity analysis in LCA

2.1 Sampling procedure with matrix-based LCA

In this paper, we use matrix formulation for LCA (for an explanation, see Heijungs and Suh 2002). A matrix formulation of the LCA model will facilitate the use and discussion of the global sensitivity methods. Matrix-based LCA quantifies the total emissions and resource use (g) of a product over its entire life cycle by:

$$ \mathbf{g}=\mathbf{B}{\mathbf{A}}^{-1}\mathbf{f} $$

The production processes are represented by the columns (v = 1 to y) of the square technology matrix A (size x × y); each row (u = 1 to x) represents a specific product flow. For example, if electricity is produced in one column, other production processes given in other columns can use it as input. The inventory matrix B (size z × y) contains the resource use and emissions (w = 1 to z) of each production process. Using the final demand vector f, the production processes are scaled to produce the desired amount. In this paper, we will only consider CO2 emissions from each production process (so z = 1), transforming B into a row vector b (size y). The main LCA equation in this paper is therefore:

$$ g=\mathbf{b}{\mathbf{A}}^{-1}\mathbf{f} $$
(1)
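As a minimal illustration of Eq. (1), the sketch below evaluates the matrix expression for a two-process system using only the Python standard library. All numbers and the helper names `solve_2x2` and `lca_result` are hypothetical and serve only to show the calculation; they are not the case study data.

```python
def solve_2x2(A, f):
    """Solve A s = f for a 2 x 2 technology matrix using Cramer's rule."""
    (a11, a12), (a21, a22) = A
    det = a11 * a22 - a12 * a21
    return [(a22 * f[0] - a12 * f[1]) / det,
            (a11 * f[1] - a21 * f[0]) / det]


def lca_result(A, b, f):
    """Return g = b A^-1 f: scale the processes, then sum their emissions."""
    s = solve_2x2(A, f)
    return sum(b_v * s_v for b_v, s_v in zip(b, s))


# Hypothetical numbers; rows are product flows, columns are processes.
A = [[10.0, -100.0],   # electricity (kWh): made by process 1, used by process 2
     [-2.0, 100.0]]    # fuel (L): used by process 1, made by process 2
b = [1.0, 10.0]        # kg CO2 emitted per unit of each process
f = [1000.0, 0.0]      # final demand: 1 MWh of electricity, no fuel

g = lca_result(A, b, f)
print(g)  # 150.0 (kg CO2)
```

Solving the linear system instead of forming the inverse explicitly gives the same result; for larger systems a linear algebra library would normally take the place of the hand-written 2 × 2 solver.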

An overview of the symbols introduced in this section can be found in Table 2.

Table 2 Meaning of symbols

Because elements of A and b will be uncertain, we developed general formulas based on a row vector p that contains all elements of A and b. Thus,

$$ {p}_{v+\left(u-1\right)y}={a}_{uv} $$

and

$$ {p}_{xy+v}={b}_v $$

We may choose to restrict p to contain uncertain elements of A and b only, to save memory. Using this notation, Eq. (1) can be conceived as

$$ g=\boldsymbol{\gamma}\left(\mathbf{p}\right)\mathbf{f} $$

where γ(p) is a function based on combining the underlying matrices A and b.
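The index mapping between A, b and the parameter vector p can be sketched as follows. The function names `flatten` and `unflatten` and the numbers are hypothetical; the code uses 0-based Python indices, so the 1-based position v + (u − 1)y in the text corresponds to list index v + (u − 1)y − 1.

```python
def flatten(A, b):
    """Store A row by row, followed by b, so that (1-based)
    p[v + (u - 1) * y] = a_uv and p[x * y + v] = b_v."""
    return [a for row in A for a in row] + list(b)


def unflatten(p, x, y):
    """Rebuild A (x rows, y columns) and b (y entries) from p."""
    A = [list(p[u * y:(u + 1) * y]) for u in range(x)]
    b = list(p[x * y:x * y + y])
    return A, b


# Hypothetical two-process system
A = [[10.0, -100.0],
     [-2.0, 100.0]]
b = [1.0, 10.0]
p = flatten(A, b)

# Element a_21 (u = 2, v = 1, y = 2) sits at 1-based position 1 + (2 - 1) * 2 = 3
a_21 = p[3 - 1]
```

Restricting p to the uncertain elements only, as mentioned in the text, would amount to keeping a list of the relevant positions next to the fixed values.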

All global sensitivity methods applied in this paper, except for key issue analysis, require sampling for uncertainty propagation. In this paper, we used Monte Carlo sampling to generate random numbers and a random balance design to generate equi-distributed numbers, from the distribution functions of the input parameters to generate an output distribution (Fig. 2). The sampling matrix P (size N × k) contains i = 1 to N random numbers drawn for each input parameter j = 1 to k of matrix A and b. For example, Monte Carlo sampling could lead to drawing the following random numbers: 1.04, 0.96 and 0.92 for the first three parameters (Fig. 2). Combining these values and the realizations for the other parameters in Eq. (1) will lead to the first realization of 5.1 kg CO2 (Fig. 2). This procedure is repeated N times; the whole simulation is repeated 50 times.
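A minimal sketch of the Monte Carlo step is given below. It assumes a hypothetical two-process model with six positive parameters, log-normal inputs specified by their arithmetic mean and coefficient of variation, and illustrative central values; none of these numbers are taken from the case studies, and the names `lca` and `sample_matrix` are introduced here for illustration only.

```python
import math
import random
import statistics


def lca(p, demand=1000.0):
    """Hypothetical two-process model: g = b A^-1 f written out for a 2 x 2 A."""
    p1, p2, p3, p4, p5, p6 = p
    det = p1 * p4 - p2 * p3
    return demand * (p6 * p4 + p5 * p2) / det


def sample_matrix(means, cv, n, rng):
    """Sampling matrix P (n rows, k columns) of independent log-normal draws."""
    s2 = math.log(1.0 + cv ** 2)          # variance of ln(p)
    sigma = math.sqrt(s2)
    return [[rng.lognormvariate(math.log(m) - 0.5 * s2, sigma) for m in means]
            for _ in range(n)]


rng = random.Random(1)                        # fixed seed for reproducibility
means = [10.0, 2.0, 100.0, 100.0, 10.0, 1.0]  # hypothetical central values p1..p6
P = sample_matrix(means, cv=0.05, n=4096, rng=rng)

g = [lca(row) for row in P]                   # one model realization per row of P
g_mean = statistics.mean(g)
g_var = statistics.variance(g)                # uses the 1 / (N - 1) normalization
```

Repeating the whole procedure with different seeds gives an impression of how much the estimated mean and variance fluctuate between simulations.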

Fig. 2 Monte Carlo sampling approach for matrix-based calculations in LCA. EF emission factor

In this section, six measures, also called sensitivity indices, that quantify the contribution to output variance are introduced. The mathematical notations in the case of matrix-based LCA are given; the full derivation can be found in the Electronic Supplementary Material. All sensitivity methods are programmed in MATLAB and are available at evelynegroen.github.io.

When calculating sensitivity indices, four aspects can differ per method. The comparison of the sensitivity methods will be based on these four aspects:

  I. The sampling design (i.e. how the rows of P are constructed)

  II. The total output variance

  III. The total output variance (II) that is explained by the method (this is ideally 100%)

  IV. The contribution to (III) of the individual input parameters

The relation between the output variance (II), explained variance (III) and the contribution to variance (IV) is visualized in Fig. 3.

Fig. 3 Relation between total output variance, explained output variance and contribution to variance by the individual input parameters

In general, the variance of the model output in Eq. (1) is given by the sum of the conditional variance of parameter \( {p}_j \) and a residual term (or error term):

$$ var(g)=var\left(E\left(g|{p}_j\right)\right)+E\left(var\left(g|{p}_j\right)\right) $$
(2)

The conditional variance \( var\left(E\left(g|{p}_j\right)\right) \) is the “expected reduction in variance that would be obtained if parameter \( {p}_j \) could be fixed” (Saltelli et al. 2010). E is the expected value, and \( var(g)=\frac{1}{N-1}{\sum}_i{\left({g}_i-\overset{-}{g}\right)}^2 \) and \( \overset{-}{g}=\frac{1}{N}{\sum}_i{g}_i \). The variance explained by each parameter can be given by the correlation ratio (McKay et al. 1999; Saltelli et al. 2008, Eq. (1.25)):

$$ {S}_j=\frac{var\left(E\left(g|{p}_j\right)\right)\ }{var(g)} $$
(3)

where the ratio \( {S}_j \) is the (main) sensitivity index. A derivation of the sensitivity index can be found in the Electronic Supplementary Material (I), Eqs. (A.1) to (A.4). The expressions for the sensitivity indices for each method are found in the boxed equations in the next subsections.

2.2 Regression- or correlation-based methods for sensitivity analysis

The contribution to variance can be quantified using regression or correlation. First, the general framework of a regression model is introduced. According to the theory of multiple linear regressions, g can be described by

$$ {g}_i={c}_0+{\sum}_{j=1}^k{c}_j{p}_{ij}+{e}_i $$
(4)

where the constant \( {c}_0 \) represents the intercept, \( {c}_j \) the slope (or regression coefficient) and \( {e}_i \) the error term, which is assumed to be normally distributed with a constant variance. The sensitivity index using the squared standardized regression coefficients (SRCs) is equal to

$$ \boxed{S_j^{SRC}=\frac{{\left(\sum_i\left({p}_{ij}-{\overset{-}{p}}_j\right)\left(\ {g}_i-\overset{-}{g}\right)\right)}^2}{\sum_i{\left({p}_{ij}-{\overset{-}{p}}_j\right)}^2{\sum}_i{\left({g}_i-\overset{-}{g}\right)}^2}=\frac{var\left({p}_j\right)}{var(g)}{\left({c}_j\right)}^2} $$
(5)

where \( var\left({p}_j\right)=\frac{1}{N-1}{\sum}_i{\left({p}_{ij}-\overset{-}{p_j}\right)}^2 \) and \( \overset{-}{p_j}=\frac{1}{N}{\sum}_i{p}_{ij} \). The full description of Eq. (5) is given in the Electronic Supplementary Material (II) in Eqs. (A.5)–(A.9).
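The left-hand expression of Eq. (5) can be evaluated directly from the sampling matrix and the model output. The sketch below does this for a hypothetical two-process model with independent log-normal inputs; the central values, the seed and the helper names `lca`, `draw` and `squared_correlation` are assumptions made for illustration.

```python
import math
import random


def lca(p, demand=1000.0):
    """Hypothetical two-process model: g = b A^-1 f written out for a 2 x 2 A."""
    p1, p2, p3, p4, p5, p6 = p
    det = p1 * p4 - p2 * p3
    return demand * (p6 * p4 + p5 * p2) / det


def draw(means, cv, n, rng):
    """N x k sampling matrix of independent log-normal draws (mean m, given cv)."""
    s2 = math.log(1.0 + cv ** 2)
    return [[rng.lognormvariate(math.log(m) - 0.5 * s2, math.sqrt(s2)) for m in means]
            for _ in range(n)]


def squared_correlation(x, y):
    """Squared Pearson correlation, i.e. the left-hand expression of Eq. (5)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (c - my) for a, c in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((c - my) ** 2 for c in y)
    return sxy ** 2 / (sxx * syy)


rng = random.Random(42)
means = [10.0, 2.0, 100.0, 100.0, 10.0, 1.0]   # hypothetical central values p1..p6
P = draw(means, 0.05, 4096, rng)
g = [lca(row) for row in P]

# One index per input parameter (column of P)
S_src = [squared_correlation([row[j] for row in P], g) for j in range(len(means))]
explained = sum(S_src)   # close to 1 when the model behaves almost linearly
```

For this toy model and a cv of 5%, the indices should add up to roughly 100%, which is the behaviour expected of a nearly linear model with uncorrelated inputs.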

The \( {S}_j^{SRC} \), similar to the Pearson correlation coefficient squared, is not robust to outliers (Hamby 1994; Saltelli and Sobol 1995). An alternative to the Pearson correlation coefficient is using its rank-transformed counterpart, in the form of the Spearman rank correlation coefficient. The squared Spearman correlation coefficient (SCC) calculates the linear dependence between the rank-transformed input and output parameters. Each draw of input parameter \( {p}_{ij} \) is rank-transformed to \( {p}_{(i)j} \), and \( {g}_i \) is rank-transformed to \( {g}_{(i)} \). The Spearman correlation coefficient is calculated as follows:

$$ {r}_j^{SCC}=\frac{\sum_i\left({p}_{(i)j}-{\overset{-}{p}}_j\right)\left({g}_{(i)}-\overset{-}{g}\right)}{\sqrt{\sum_i{\left({p}_{(i)j}-{\overset{-}{p}}_j\right)}^2{\sum}_i{\left({g}_{(i)}-\overset{-}{g}\right)}^2}} $$
(6)

The sensitivity index using SCC is equal to

$$ \boxed{S_j^{SCC}={\left({r}_j^{SCC}\right)}^2} $$
(7)

The full description of Eq. (7) is given in the Electronic Supplementary Material in Eq. (A.10). In this paper, sensitivity indices based on SRC \( \left({S}_j^{SRC}\right) \) and SCC \( \left({S}_j^{SCC}\right) \) are calculated from the same simulations.
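Eqs. (6) and (7) amount to rank-transforming each column of the sampling matrix and the output and then computing a squared Pearson correlation on the ranks. The sketch below assumes a hypothetical two-process model and continuous inputs, so ties are not handled; the helper names `lca`, `draw`, `ranks` and `squared_spearman` are illustrative only.

```python
import math
import random


def lca(p, demand=1000.0):
    """Hypothetical two-process model: g = b A^-1 f written out for a 2 x 2 A."""
    p1, p2, p3, p4, p5, p6 = p
    det = p1 * p4 - p2 * p3
    return demand * (p6 * p4 + p5 * p2) / det


def draw(means, cv, n, rng):
    """N x k sampling matrix of independent log-normal draws (mean m, given cv)."""
    s2 = math.log(1.0 + cv ** 2)
    return [[rng.lognormvariate(math.log(m) - 0.5 * s2, math.sqrt(s2)) for m in means]
            for _ in range(n)]


def ranks(values):
    """Rank transform (1 = smallest); ties are not averaged in this sketch."""
    order = sorted(range(len(values)), key=values.__getitem__)
    r = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        r[idx] = rank
    return r


def squared_spearman(x, y):
    """Eq. (7): squared Pearson correlation of the rank-transformed data."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    sxy = sum((a - mx) * (c - my) for a, c in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((c - my) ** 2 for c in ry)
    return sxy ** 2 / (sxx * syy)


rng = random.Random(42)
means = [10.0, 2.0, 100.0, 100.0, 10.0, 1.0]   # hypothetical central values p1..p6
P = draw(means, 0.05, 4096, rng)
g = [lca(row) for row in P]
S_scc = [squared_spearman([row[j] for row in P], g) for j in range(len(means))]
```

Because only the ordering of the draws enters the calculation, a single extreme output value changes the ranks very little, which is the robustness property referred to in the text.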

2.3 Key issue analysis using a first-order Taylor expansion

Key issue analysis (KIA) is a method for analytically determining the contribution to variance (or variance decomposition) by means of a first-order Taylor expansion. The first-order Taylor expansion around the central values (\( {\overset{-}{p}}_j \)) of Eq. (1) results in

$$ {g}_j=g\left({p}_j\right)=g\left({\overset{-}{p}}_j\right)+\left(\frac{\partial g\left({\overset{-}{p}}_j\right)}{\partial {p}_j}\right)\left({p}_j-{\overset{-}{p}}_j\right) $$
(8)

Because the total output variance var(g) is estimated by the first-order Taylor expansion, the variance explained by the individual parameters will always be equal to 100% (Fig. 3). The variance according to KIA, therefore, may be of a different magnitude than the output variance obtained by sampling. The sensitivity index using KIA is equal to

$$ \boxed{S_j^{KIA}=\frac{var\left({p}_j\right)}{var(g)}{\left(\frac{\partial g}{\partial {p}_j}\right)}^2} $$
(9)

The full derivation of Eq. (9) is given in the Electronic Supplementary Material (III), in Eqs. (A.11)–(A.13).
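A sketch of Eq. (9) is given below. It approximates the partial derivatives by central finite differences instead of analytical expressions, which is an assumption of this example and not necessarily how key issue analysis is implemented elsewhere; the model, the central values and the helper names are hypothetical.

```python
def lca(p, demand=1000.0):
    """Hypothetical two-process model: g = b A^-1 f written out for a 2 x 2 A."""
    p1, p2, p3, p4, p5, p6 = p
    det = p1 * p4 - p2 * p3
    return demand * (p6 * p4 + p5 * p2) / det


def partial_derivative(model, p, j, rel_step=1e-6):
    """Central finite-difference estimate of dg/dp_j at the central values p."""
    h = rel_step * p[j]
    up, down = list(p), list(p)
    up[j] += h
    down[j] -= h
    return (model(up) - model(down)) / (2.0 * h)


means = [10.0, 2.0, 100.0, 100.0, 10.0, 1.0]   # hypothetical central values p1..p6
cv = 0.05
var_p = [(cv * m) ** 2 for m in means]          # var(p_j) = (cv * mean)^2

gradient = [partial_derivative(lca, means, j) for j in range(len(means))]
terms = [v * d ** 2 for v, d in zip(var_p, gradient)]

var_kia = sum(terms)                    # first-order estimate of var(g)
S_kia = [t / var_kia for t in terms]    # Eq. (9); sums to 100% by construction
```

No sampling is involved, so the result is deterministic; the shape of the input distributions enters only through var(p_j).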

2.4 Variance-based methods for sensitivity analysis

2.4.1 Sobol’ indices

In the case of variance-based methods for sensitivity analysis, the variance of Eq. (1) is rewritten as the sum of all first-order conditional variances and higher-order terms (Electronic Supplementary Material IV, Eqs. (A.14)–(A.15)):

$$ var(g)={\sum}_jvar\left(E\left(g|{p}_j\right)\right)+{\sum}_l{\sum}_{j>l}\left(var\left(E\left(g|{p}_j,{p}_l\right)\right)-var\left(E\left(g|{p}_j\right)\right)-var\left(E\left(g|{p}_l\right)\right)\right)+\dots $$
(10)

To calculate the conditional variances of Eq. (3), we have adopted the sampling algorithm described by Saltelli et al. (2010). The sampling algorithm fixes one parameter to calculate the variance reduction in the output. The sampling algorithm requires two sampling matrices. In addition to the sampling matrix P, a second sampling matrix Q is generated in the same way, independent of P. From P and Q, a third sampling matrix \( {\boldsymbol{R}}^j \) is derived, in which column j comes from Q and all other k − 1 columns come from P. For each matrix P, Q and \( {\boldsymbol{R}}^j \), the output of the model is calculated using Eq. (1), resulting in g(P), g(Q) and \( g\left({\boldsymbol{R}}^j\right) \). The variance is calculated through the identity \( var(g)=E\left({g}^2\right)-{E}^2(g) \). The variance equals

$$ var(g)=\frac{1}{N}{\sum}_i{\left(g{\left(\boldsymbol{P}\right)}_i\right)}^2-{\left(\frac{1}{N}{\sum}_ig{\left(\boldsymbol{P}\right)}_i\right)}^2 $$
(11)

Likewise, the conditional variance is given by

$$ var\left(E\left(g|{p}_j\right)\right)=\frac{1}{N}{\sum}_ig{\left(\boldsymbol{Q}\right)}_i\left(g{\left({\boldsymbol{R}}^j\right)}_i-g{\left(\boldsymbol{P}\right)}_i\right) $$
(12)

The sensitivity index, applying Sobol’s main effect (SME) index (Electronic Supplementary Material IV, Eq. (A.17)), is equal to

$$ \boxed{{\displaystyle {S}_j^{SME}=\frac{\frac{1}{N}{\sum}_ig{\left(\boldsymbol{Q}\right)}_i\left(g{\left({\boldsymbol{R}}^j\right)}_i-g{\left(\boldsymbol{P}\right)}_i\right)}{\frac{1}{N}{\sum}_i{\left(g{\left(\boldsymbol{P}\right)}_i\right)}^2-{\left(\frac{1}{N}{\sum}_ig{\left(\boldsymbol{P}\right)}_i\right)}^2}}} $$
(13)

The Sobol’ total effect index (STE) calculates how much input parameter j explains of the output variance, including all possible interactions with other parameters:

$$ {S}_j^{STE}={S}_j+{S}_{jl}+{S}_{jm}+\dots +{S}_{jlm}+\dots {S}_{jlm\dots k} $$
(14)

The total effect index equals the “expected variance that would be left if all [parameters] but [parameter \( {p}_j \)] could be fixed” (Saltelli et al. 2010) and is based on the quantification of the residual term in Eq. (2):

$$ E\left(var\left(g|{p}_{\sim j}\right)\right)=\frac{1}{2N}{\sum}_i{\left(g{\left(\boldsymbol{P}\right)}_i-g{\left({\boldsymbol{R}}^j\right)}_i\right)}^2 $$
(15)

where \( {p}_{\sim j} \) refers to fixing all parameters except for parameter j. The Sobol’ total effect index (Electronic Supplementary Material IV, Eq. (A.19)) is equal to

$$ \boxed{S_j^{STE}=\frac{\frac{1}{2N}{\sum}_i{\left(g{\left(\boldsymbol{P}\right)}_i-g{\left({\boldsymbol{R}}^j\right)}_i\right)}^2}{\frac{1}{N}{\sum}_i{\left(g{\left(\boldsymbol{P}\right)}_i\right)}^2-{\left(\frac{1}{N}{\sum}_ig{\left(\boldsymbol{P}\right)}_i\right)}^2}} $$
(16)

In the case of an LCA model that behaves approximately linearly, all interaction terms (e.g. \( {S}_{jl} \) and other higher-order terms in Eq. (14)) are approximately zero, so \( {S}_j^{STE}\approx {S}_j^{SME} \). In the case of models containing outliers or interaction terms, \( {S}_j^{STE}>{S}_j^{SME} \). This also means that the sum of the total sensitivity indices for an LCA model containing outliers (which could be seen as a non-linear effect) or interaction terms can be larger than 100% (Electronic Supplementary Material IV, Eqs. (A.20)–(A.21)).
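The sampling scheme with P, Q and R^j and the estimators for the main and total effect can be sketched as follows for a hypothetical two-process model. One deliberate deviation is made and should be read as an assumption of this sketch: the model outputs are centred on the mean of g(P) before the main-effect sum is taken. This leaves the expected value of the estimator unchanged, because g(R^j) and g(P) have the same mean, but it reduces the sampling noise considerably when the output mean is large compared with its standard deviation.

```python
import math
import random


def lca(p, demand=1000.0):
    """Hypothetical two-process model: g = b A^-1 f written out for a 2 x 2 A."""
    p1, p2, p3, p4, p5, p6 = p
    det = p1 * p4 - p2 * p3
    return demand * (p6 * p4 + p5 * p2) / det


def draw(means, cv, n, rng):
    """N x k sampling matrix of independent log-normal draws (mean m, given cv)."""
    s2 = math.log(1.0 + cv ** 2)
    return [[rng.lognormvariate(math.log(m) - 0.5 * s2, math.sqrt(s2)) for m in means]
            for _ in range(n)]


rng = random.Random(7)
means = [10.0, 2.0, 100.0, 100.0, 10.0, 1.0]   # hypothetical central values p1..p6
n, k = 4096, len(means)

P = draw(means, 0.05, n, rng)
Q = draw(means, 0.05, n, rng)                   # generated independently of P
g_P = [lca(row) for row in P]
g_Q = [lca(row) for row in Q]

mean_P = sum(g_P) / n
var_g = sum(v * v for v in g_P) / n - mean_P ** 2    # Eq. (11)

# Centring (assumption of this sketch, see the paragraph above)
c_P = [v - mean_P for v in g_P]
c_Q = [v - mean_P for v in g_Q]

S_sme, S_ste = [], []
for j in range(k):
    # R^j: column j taken from Q, all other columns taken from P
    R_j = [rp[:j] + [rq[j]] + rp[j + 1:] for rp, rq in zip(P, Q)]
    c_R = [lca(row) - mean_P for row in R_j]
    main = sum(q * (r - p) for q, r, p in zip(c_Q, c_R, c_P)) / n       # Eq. (12)
    total = sum((p - r) ** 2 for p, r in zip(c_P, c_R)) / (2.0 * n)     # Eq. (15)
    S_sme.append(main / var_g)                                          # Eq. (13)
    S_ste.append(total / var_g)                                         # Eq. (16)
```

Individual main-effect estimates of weakly contributing parameters can come out slightly negative because of sampling noise; the total-effect estimates are non-negative by construction.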

2.4.2 Random balance design

The theory of Fourier series states that any (periodic) function can be written as a sum of wave functions. Random balance designs (RBDs) calculate the conditional variance by rewriting the LCA model in Eq. (1) in terms of sums of sine and cosine functions. We use complex numbers to facilitate notation of sine and cosine, thus using \( {e}^{\sqrt{-1}\omega } \), where we prefer to write \( \sqrt{-1} \) over i, allowing us to keep using i as an index variable. For this method, we use the discrete Fourier transformation to convert an equally spaced periodic function of size N. The model output of Eq. (1) in terms of Fourier coefficients is given in the Electronic Supplementary Material (V), Eq. (A.24). The Fourier coefficients are given by

$$ g\left({p}_{\omega}\right)=\frac{1}{N}{\sum}_{i=0}^{N-1}g\left({p}_{ij}\right){e}^{-\sqrt{-1}\pi \omega i/N} $$
(17)

where omega (ω) represents the frequency domain, which is divided in equally spaced segments: ω = 1 to N − 1. The parameters that contribute most to the output variance will resemble the wave-like shape of the input parameter. This means that the most sensitive parameters have the highest amplitude and that the amplitude of the wave of the output is a measure of the conditional variance of input parameter j. The total variance is given by

$$ var(g)={\left(\frac{1}{N}{\sum}_{\omega =1}^{N-1}\left|g\left({p}_{\omega}\right)\right|\right)}^2 $$
(18)

A similar expression is found for the conditional variance of each input parameter (Eq. A.28). The sensitivity index using RBD is equal to

$$ \boxed{S_j^{RBD}=\frac{2{\left({\sum}_{\omega =1}^M\left|{g}^j\left({p}_{\omega}\right)\right|\right)}^2}{{\left({\sum}_{\omega =1}^{N-1}\left|g\left({p}_{\omega}\right)\right|\right)}^2}} $$
(19)

where M is equal to the maximum oscillation frequency and \( {g}^j\left({p}_{\omega}\right) \) is the reordered model output for parameter j. The derivation of the sensitivity index in Eq. (19) can be found in the Electronic Supplementary Material (V) and in Xu and Gertner (2011).
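The random balance design procedure can be sketched as follows for a hypothetical two-process model: equally spaced points are permuted independently for each parameter, mapped to the input distributions, and the model output is reordered per parameter before its spectrum is taken. The sketch follows the common power-spectrum formulation, in which the variance attributed to parameter j is the sum of squared moduli of the first M harmonics and the total variance follows from Parseval's identity; the normalization may therefore differ from the notation of Eqs. (17)–(19). The choice M = 6, the sample size and all names and numbers are assumptions. A plain discrete Fourier sum is used instead of a fast Fourier transformation to stay within the standard library.

```python
import cmath
import math
import random
from statistics import NormalDist


def lca(p, demand=1000.0):
    """Hypothetical two-process model: g = b A^-1 f written out for a 2 x 2 A."""
    p1, p2, p3, p4, p5, p6 = p
    det = p1 * p4 - p2 * p3
    return demand * (p6 * p4 + p5 * p2) / det


def rbd_indices(means, cv, n, m_harm, rng):
    k = len(means)
    s2 = math.log(1.0 + cv ** 2)
    sigma = math.sqrt(s2)
    std_normal = NormalDist()
    s0 = [-math.pi + 2.0 * math.pi * (i + 0.5) / n for i in range(n)]  # equally spaced
    perms = []
    for _ in range(k):
        perm = list(range(n))
        rng.shuffle(perm)               # independent permutation per parameter
        perms.append(perm)

    g = []
    for i in range(n):
        row = []
        for j, m in enumerate(means):
            # Triangle wave in s, uniform on (0, 1), then log-normal inverse CDF
            u = 0.5 + math.asin(math.sin(s0[perms[j][i]])) / math.pi
            u = min(max(u, 1e-12), 1.0 - 1e-12)
            row.append(math.exp(math.log(m) - 0.5 * s2 + sigma * std_normal.inv_cdf(u)))
        g.append(lca(row))

    g_mean = sum(g) / n
    var_g = sum((v - g_mean) ** 2 for v in g) / n

    indices = []
    for j in range(k):
        reordered = [0.0] * n
        for i in range(n):
            reordered[perms[j][i]] = g[i]          # sort output by the points of p_j
        power = 0.0
        for w in range(1, m_harm + 1):
            coeff = sum(reordered[t] * cmath.exp(complex(0.0, -2.0 * math.pi * w * t / n))
                        for t in range(n)) / n
            power += 2.0 * abs(coeff) ** 2         # positive and negative frequency
        indices.append(power / var_g)
    return indices


rng = random.Random(3)
means = [10.0, 2.0, 100.0, 100.0, 10.0, 1.0]       # hypothetical central values p1..p6
S_rbd = rbd_indices(means, cv=0.05, n=2048, m_harm=6, rng=rng)
```

Only N model runs are needed regardless of the number of parameters, because the same outputs are reused, in a different order, for every parameter.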

2.5 Case studies

A hypothetical case study describing the production of 1 MWh of electricity was selected (the original version of the case study appeared in Heijungs and Suh 2002). The case study consisted of two processes: fuel production and electricity production (Fig. 4). In Fig. 4, parameter 1 equals the electricity production, parameter 2 equals fuel use for electricity production, parameter 3 equals electricity use of fuel production, parameter 4 equals fuel production, parameter 5 equals CO2 emissions during fuel production and parameter 6 equals CO2 emissions during electricity production. The case study is set up in such a way that a small change in one of the input parameters results in a large change of the output.

Fig. 4 Case study 1: production of 1 MWh electricity (Groen et al. 2014)

We assumed that the input parameters were log-normally distributed and the relative standard deviation (i.e. coefficient of variation: cv = σ/μ) equalled 5 or 30% for two different scenarios. All input parameters are assumed log-normally distributed to avoid drawing random numbers with an incorrect sign. This is admittedly a weak argument, but our main purpose is to construct a toy example to study the sensitivity indices, not to build a realistic system. We selected a relatively small and a relatively large coefficient of variation because we wanted to explore if the Sobol’ total sensitivity indices and the Spearman correlation coefficients would explain more of the output variation in case of outliers.
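As a worked illustration, and assuming that μ denotes the arithmetic mean of an input parameter, a log-normal distribution with a given mean and coefficient of variation can be specified through the mean and standard deviation of the logarithm of the parameter. The symbols \( {\mu}_{\ln } \) and \( {\sigma}_{\ln } \) are introduced here for this purpose only:

$$ {\sigma}_{\ln}^2=\ln \left(1+c{v}^2\right),\kern2em {\mu}_{\ln }=\ln \left(\mu \right)-\frac{1}{2}{\sigma}_{\ln}^2 $$

For cv = 5%, this gives \( {\sigma}_{\ln}\approx 0.0500 \), and for cv = 30%, \( {\sigma}_{\ln}\approx 0.2936 \). The larger value of \( {\sigma}_{\ln } \) corresponds to a more strongly skewed distribution and hence to a larger chance of drawing extreme values.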

The second case study describes a whitefish fishery in the northeast Atlantic. The functional unit equalled 1 kg landed whitefish. A flow diagram is shown in Fig. 5. Five input parameters we wish to highlight are as follows: parameter α, total amount of landed fish; parameter β, emission factor fuel combustion; parameter γ, fuel production; parameter δ, emission factor fuel production and parameter ε, fuel use. The fishery consists of a single vessel, making trips of approximately 2 weeks and landing its fish in Tromsø, Norway. Data comprised annual averages of the vessel and gear, fuel, lubricants, anti-fouling, detergents, cooling agents and total catch and were collected by the vessel owner. Background data, such as the CO2 emissions during steel production for the vessel, came from the ecoinvent database v2.2 (ecoinvent 2007). In total, 115 input parameters were considered. In this case study, we also assumed that all input parameters were log-normally distributed with a cv of 5 or 30%.

Fig. 5 Case study 2: production of 1 kg of landed whitefish from the northeast Atlantic (Groen et al. 2014)

3 Results

In this section, we will discuss the sampling design (I), total output variance (II), the explained variance (III) and the contribution to the output variance of the individual input parameters as given by the sensitivity indices S j (IV).

3.1 Sampling design

The differences in uncertainty propagation methods require differences in sample designs and, therefore, in computational effort between methods (Table 3). SRC, SCC and RBD all require N runs, but for the Sobol’ indices (SME and STE), 2N runs are needed to calculate the indices. This means that the Sobol’ method is more computationally demanding than the other sampling methods. Although KIA requires only a single calculation, it does not produce a distribution function of the output, making it more difficult to compare two or more studies. RBD uses the discrete Fourier series, which allowed us to use the fast Fourier transformation algorithm, which is computationally fast (Frigo and Johnson 2005).

Table 3 Sample design and calculation of the output variance for the six sensitivity indices

3.2 Output variance and explained variance

Table 4 shows the mean, total output variance (II) and variance explained by the global sensitivity method (III) of case study 1, in the case of a parameter of dispersion of cv = 5% and cv = 30%, for a sample size of N = 4096 and 50 repetitions. The sample size was chosen to align with Groen et al. (2014). In order to make a proper comparison, we ran an additional Monte Carlo simulation where we calculated the output variance based on \( N={10}^6 \), and we considered this as the best approximation of the output variance.

Table 4 Mean and variance of LCA model output for different sensitivity methods for case study 1 (N = 4096; 50 repetitions), using squared standardized regression coefficients (SRC), squared Spearman correlation coefficients (SCC), key issue analysis (KIA), Sobol’ main effect (SME), Sobol’ total effect (STE), and random balance design (RBD)

For cv = 5%, all methods produced approximately the same mean and output variance for this case study. Variance explained by most methods added up to approximately 100%, suggesting a linear behaviour. For cv = 30%, most methods produced approximately the same mean and output variance. However, KIA estimated the total output variance considerably lower than the sampling-based methods. Furthermore, SRC explained less than SCC, which suggested the presence of outliers. STE also showed a value much larger than 100%, which also suggested the presence of outliers. RBD explained less of the output variance than other methods. Note that the mean value for CO2 is larger when the cv is larger, although the mean value of the input parameters is the same. This is an effect of the asymmetric distribution used. KIA neglects the shape of the distribution and therefore misses this effect.

Table 5 shows the mean and the output variance (II) for case study 2 in the case of a parameter of dispersion of cv = 5% and cv = 30%. Case study 2 contains 115 parameters; the variance explained (III) is shown for the 5 most contributing parameters and for all 115 parameters, because all other parameters contribute <<1%.

Table 5 Mean, output variance and variance explained by 5 or 115 parameters of LCA model output for different sensitivity methods for case study 2 (N = 4096; 50 repetitions), using squared standardized regression coefficients (SRC), squared Spearman correlation coefficients (SCC), key issue analysis (KIA), Sobol’ main effect (SME), Sobol’ total effect (STE), and random balance design (RBD)

For cv = 5%, all methods produced approximately the same mean and output variance. Most methods explained approximately 100% of the variance, suggesting a linear behaviour. For cv = 30%, most methods produced approximately the same mean and output variance, except for KIA. KIA estimated the total output variance considerably lower than the sampling-based method. In the case of RBD, the output explained by 5 or by 115 parameters differed considerably, suggesting that RBD overestimated the sensitivity indices of low contributing parameters.

3.3 Sensitivity index

Figure 6 shows the sensitivity index (IV) of each parameter of case study 1 for a parameter of dispersion of cv = 5% and cv = 30%, scaled to the benchmark output variance computed with \( N={10}^6 \). We scaled the graphs to this benchmark to compensate for methods that predicted a lower output variance. For example, for a cv of 30%, KIA arrived at an output variance of 0.606, which is approximately 20% lower than the output variance with a sample size of \( N={10}^6 \). In the case of applying SRC to case study 1 (cv = 5%), parameter 1 was responsible for 57% of the output variance. For each method, parameters 1 and 5 contributed most to the output variance; the exact contribution, however, differed per method. Parameters 2, 4 and 6 each have a contribution of approximately 1–6%; parameter 3 contributed less than 0.7% to the output variance. There are some differences in the ranking of parameters 2, 4 and 6 between the various methods.

Fig. 6 Contribution to output variance for the sensitivity methods applied to case study 1 (N = 4096; 50 repetitions; cv = 5% or cv = 30%). The sensitivity index (SRC, SCC, KIA, SME, RBD) or total sensitivity index (STE) is shown for each parameter (1–6). SRC squared standardized regression coefficient; SCC squared Spearman correlation coefficient; KIA key issue analysis; SME Sobol’ main effect index; STE Sobol’ total effect index; RBD random balance design; 1 electricity production; 2 fuel use electricity production; 3 electricity use fuel production; 4 fuel production; 5 CO2 emission fuel production; 6 CO2 emission electricity production

Figure 7 shows the sensitivity index (IV) of the five most dominant parameters of case study 2 in the case of a parameter of dispersion of cv = 5% and cv = 30%. The sensitivity indices (IV) are shown only for the five most contributing parameters, because all other parameters contribute <<1%. Although for each method, parameters α and β contributed most to the output variance, the exact contribution differed per method. All methods agreed on the much smaller contribution (around 1%) of parameters γ, δ and ε, although there are differences in the precise value, as well as in the ranking.

Fig. 7 Contribution to output variance for the sensitivity methods applied to case study 2 (N = 4096; 50 repetitions; cv = 5% or cv = 30%). The sensitivity index (SRC, SCC, KIA, SME, RBD) or total sensitivity index (STE) is shown for each parameter (α–ε). SRC squared standardized regression coefficient; SCC squared Spearman correlation coefficient; KIA key issue analysis; SME Sobol’ main effect index; STE Sobol’ total effect index; RBD random balance design; α catch of whitefish; β emission factor fuel combustion; γ production of fuel; δ emission factor fuel production; ε fuel use

4 Discussion

Table 6 gives an indication of performance of the sensitivity methods, averaged over the two case studies, under conditions of small (cv = 5%) and large input uncertainties (cv = 30%). The evaluation of the sampling design (I) relates to the computational effort of a sensitivity method: KIA does not make use of sampling and was fastest. RBD was faster than SRC and SCC due to the implementation of the fast Fourier transformation algorithm. SME and STE have to generate two sampling matrices and, therefore, were slowest. The evaluation of the sampling design (I) was as follows: (++) independent of the sample size N; (+) using an optimized sample algorithm; (−) dependent on N and (−−) dependent on 2N.

Table 6 Performance of the sensitivity methods averaged over the two case studies on a scale from poor (−−), insufficient (−), sufficient (+) to good (++)

The total output variance (II) calculated with each method resulted in approximately the same output variance, except for KIA, which underestimated the output variance especially in the case of high input uncertainties. The evaluation of the output variance was as follows: (++) equalled on average 99–101% of the output variance (compared to the output variance calculated with a sample size of \( N={10}^6 \), which was assumed to be the best estimate); (+) equalled ≥95%; (−) equalled ≥90% and (−−) equalled <90% of the best estimate of the output variance.

The variance that is explained by each method (III) is equal to the sum of the sensitivity coefficients (IV). The evaluation of the main sensitivity indices (all indices except STE) was as follows: (++) explained on average 99–101% of the output variance (compared to the output variance calculated for each method); (+) explained ≥95%; (−) explained ≥90% and (−−) explained <90% of the output variance. The evaluation of the total sensitivity index (STE) is given by the difference with SME. If STE is equal to SME, there is no use in calculating STE; the bigger the difference between the two indices, the more relevant it becomes to calculate STE. The evaluation was as follows: (++) STE − SME ≥10%; (+) STE − SME ≥5%; (−) STE − SME <5% and (−−) STE ≈ SME.
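The comparison between STE and SME can be sketched as follows. This is a minimal illustration that assumes a hypothetical toy model with standard normal inputs and one interaction term, and a commonly used pair of sampling estimators built on two independent sample matrices; it is not necessarily the implementation used for the case studies. For this toy model the analytical values are SME = (0.5, 0, 0) and STE = (0.5, 0.5, 0.5).

```python
import numpy as np

def toy_model(x):
    # Hypothetical model: the second and third input only act through an interaction.
    return x[:, 0] + x[:, 1] * x[:, 2]

def sobol_indices(model, n_inputs, n_samples=20_000, seed=1):
    """Estimate Sobol' main (SME) and total (STE) effect indices."""
    rng = np.random.default_rng(seed)
    # Two independent sample matrices (standard normal inputs are an assumption).
    a = rng.standard_normal((n_samples, n_inputs))
    b = rng.standard_normal((n_samples, n_inputs))
    y_a, y_b = model(a), model(b)
    var_y = np.var(np.concatenate([y_a, y_b]), ddof=1)
    sme = np.empty(n_inputs)
    ste = np.empty(n_inputs)
    for i in range(n_inputs):
        # Matrix A with only column i taken from matrix B.
        ab = a.copy()
        ab[:, i] = b[:, i]
        y_ab = model(ab)
        sme[i] = np.mean(y_b * (y_ab - y_a)) / var_y
        ste[i] = 0.5 * np.mean((y_a - y_ab) ** 2) / var_y
    return sme, ste

sme, ste = sobol_indices(toy_model, n_inputs=3)
# A large STE - SME difference signals interaction effects for that input.
print(np.round(sme, 2), np.round(ste, 2), np.round(ste - sme, 2))
```

In this sketch the main effect indices explain only about half of the output variance, and the difference STE − SME is large for the two interacting inputs, which is the situation in which calculating STE is worthwhile.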

There were some limitations to the case studies that were used to evaluate the sensitivity methods. First, all parameters were assumed to be uncorrelated, which is a simplification made because of a lack of data. When correlations are present, including correlated inputs will increase the accuracy of the outcome of the global sensitivity analysis (Jacques et al. 2006; Xu and Gertner 2008b). Global sensitivity indices for models with correlated inputs can be found in Xu and Gertner (2008b) for SRC, in Jacques et al. (2006) for the Sobol’ indices and in Xu and Gertner (2008a) for RBD and its corresponding indices.

Second, the performance indicators in Table 6 are based on only two case studies with two different sets of input parameters. Other types of distribution functions or case studies with more interacting input parameters, for example, were not considered. However, we assumed that the set of recommended methods depends less on the type of distribution function than on the dispersion (second moment) of the input uncertainty: a larger dispersion increases the chance of outliers and the effect of interactions leading to non-linear behaviour (Saltelli et al. 2008).

Third, we only considered the inventory stage in this paper. Usually, an LCA includes a midpoint or even an endpoint assessment. In general, the inventory to midpoint calculation step is assumed to be linear, but the inventory to midpoint or midpoint to endpoint relations could be non-linear; in these cases, the Sobol’ method might be preferred because it is better able to include non-linear effects (Iooss and Lemaître 2014; Sobol’ 2001). An illustrative example is given in Cucurachi (2014), where sensitivity indices were quantified for the impact assessment of noise on human health, resulting in high values for STE compared to SME and illustrating the benefit of using Sobol’ indices as a measure of global sensitivity.

Fourth, this article started from moment-based approaches, in particular the contribution to variance. An extension of our analysis to moment-independent approaches seems to be a logical next step.

Figure 8 gives an overview of the best performing methods in the case of large (cv = 30%) or small (cv = 5%) input uncertainty and for a situation that is “sensitive” (case study 1) or “non-sensitive” (case study 2) to small changes of the input parameters, where “sensitive” means that a small relative change of an input parameter may produce a large relative change of an output variable. Although our LCA model contains a matrix inverse, is therefore non-linear in its parameters and could potentially be sensitive in this sense (Heijungs 2002), only the first case study displayed sensitivity to small changes. Figure 8 is based on evaluating Tables 4 and 5 in the same way as for Table 6, but keeping the results of case studies 1 and 2 separate to distinguish between the sensitive and non-sensitive case studies. The results of the evaluation per case study can be found in the Electronic Supplementary Material (VI). The computational effort was not taken into account in this evaluation, since this indicator seemed less relevant: even the slowest performing methods took less than an hour. The two performance indicators (II: ability to quantify the output variance; III: ability to explain the output variance) were weighted equally.

Fig. 8

Overview of the best performing methods in the case of large (cv = 30%) or small (cv = 5%) input uncertainty and sensitive (case study 1) or not (case study 2) to small changes in the input parameters

Under the assumption that LCAs behave linearly up to the midpoint assessment, the sensitivity indices using SRC or KIA (when input uncertainties are relatively small) or the sensitivity indices derived by SCC or the Sobol’ indices (when input uncertainties are large) could be used for global sensitivity analysis (Fig. 8). However, the choice of a global sensitivity method also depends on (1) the data available to the LCA practitioner and (2) the aim of the study. Regarding data availability, if only a parameter of dispersion can be defined and not a probability distribution function, applying KIA is a feasible option, especially when input uncertainties are small, since a distribution function is not required. If probability distribution functions are provided, KIA or SRC (small input uncertainties) and SCC or SME/STE (large input uncertainties) are most feasible. When Monte Carlo simulation is used to propagate uncertainty, both SRC and SCC can be calculated from the same sample. Likewise, SME and STE are calculated from the same dataset, which could be another reason for selecting the preferred sensitivity analysis method.
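The point that SRC and SCC can share one Monte Carlo sample can be sketched as follows. This is a minimal illustration using a hypothetical linear two-parameter model with independent normal inputs; the helper names, the model and the parameter values are assumptions made for the example and are not taken from the case studies.

```python
import numpy as np

def squared_src(x, y):
    """Squared standardized regression coefficients from a linear least-squares fit."""
    design = np.column_stack([np.ones(len(y)), x])  # intercept plus inputs
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    beta = coef[1:] * x.std(axis=0, ddof=1) / y.std(ddof=1)
    return beta ** 2

def squared_scc(x, y):
    """Squared Spearman rank correlation between each input and the output."""
    def ranks(v):
        # Double argsort gives ranks; ties are not expected for continuous samples.
        return np.argsort(np.argsort(v, axis=0), axis=0).astype(float)
    rx, ry = ranks(x), ranks(y)
    rx -= rx.mean(axis=0)
    ry -= ry.mean()
    rho = (rx * ry[:, None]).sum(axis=0) / np.sqrt(
        (rx ** 2).sum(axis=0) * (ry ** 2).sum()
    )
    return rho ** 2

# One Monte Carlo sample (N = 4096), reused for both indices.
rng = np.random.default_rng(42)
x = rng.normal(loc=[10.0, 5.0], scale=[1.0, 1.0], size=(4096, 2))
y = 2.0 * x[:, 0] + 1.0 * x[:, 1]  # hypothetical linear model

src = squared_src(x, y)
scc = squared_scc(x, y)
print(np.round(src, 3), np.round(scc, 3))
```

For this linear model the squared SRC values are close to the analytical variance shares of 0.8 and 0.2, while the rank-based SCC values give the same ranking of the two parameters from the same sample.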

Regarding the aim of the study, we will give three examples. First, a goal of an LCA can be to determine whether there is a significant difference between two (or more) scenarios. In that case, sampling-based global sensitivity methods such as SRC or SME can determine which input parameters contribute most to the output variance and therefore should be known most accurately before propagating their uncertainty through the LCA model.

Second, if the goal of an LCA is to assess the performance of a single production system, then, depending on the nature of the input uncertainties, methods such as KIA but also sampling-based methods such as SCC can help to indicate parameters that could contain opportunities for improving environmental performance (Heijungs 1994). Third, when repeating an LCA of similar goal and scope, the input parameters that contribute only marginally to the output variance according to any of the methods mentioned in Fig. 8 can be set to a fixed value to simplify (future) data collection.

5 Conclusions

The aim of this study was twofold: (1) to study the applicability of a number of previously suggested methods for global sensitivity analysis to LCA and (2) to compare the methods based on their ability to explain the output variance, using a number of case studies. Five methods that quantify the contribution to output variance were evaluated: squared standardized regression coefficient (SRC), squared Spearman correlation coefficient (SCC), key issue analysis (KIA), Sobol’ indices (STE and SME) and random balance design index (RBD). Most methods performed approximately equally well for quantifying output variance and contribution to variance of the input parameters, especially for relatively small input uncertainties. In the case of large input uncertainties, methods robust to outliers such as squared Spearman correlation coefficient or the Sobol’ indices performed better than the other methods.

Under the assumption that the quantification of environmental impact in LCAs behaves linearly, squared standardized regression coefficients, squared Spearman correlation coefficients, the Sobol’ indices or key issue analysis can be used for global sensitivity analysis. The choice of method depends on the available data, the magnitude of the uncertainties in the data and the aim of the study.