Calculating Error Statistics

If the user has configured the model-data linkages, the error statistics are displayed when the user right-mouse-clicks (RMC) on a sub-item in the Model Analysis menu and selects Calculate Error Statistics. The overall statistics, the statistics for each parameter and station, and the parameter composite statistics are displayed as shown in Figure 1. In the Error Statistics Report form shown in Figure 1, the user can select which statistical parameters are displayed. The definitions of these statistics are provided below. The user can also edit the report, save the report to a text file, print the report, or copy it to the clipboard, as shown in box 2 of Figure 1.


Figure 1  Error statistics report options.

Extracting the data from the EFDC output can be time-consuming for a large model run. To avoid extracting the results each time the statistics are generated, the user is advised to uncheck the Reload Model Output checkbox in the Define Calibration Series window; EE will then reuse the previously extracted data already held in memory.

The pre-defined statistics available for the model-data statistical reports are described below. In the equations that follow, “O” denotes observations, “P” denotes model predictions at the corresponding locations and times, and “N” is the number of model/data pairs at a single observation station or across multiple stations.

Note that EE uses the same approach for calibration statistics that it does for the time series statistics: it generates a set of model/data pairs focused on the measured data points. EE uses linear interpolation to generate the model value corresponding to each measured data point in time. So, if there are 20,000 data points (from 15-minute data, for example) corresponding to 5,000 model snapshots, EE will generate 20,000 model/data pairs.
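The pairing step described above can be sketched with NumPy's `np.interp`; the variable names and sample values here are illustrative, not EE's internal API:

```python
import numpy as np

# Model output is linearly interpolated to each observation time,
# yielding one model/data pair per measurement (illustrative data).
model_times = np.linspace(0.0, 10.0, 21)      # model snapshot times (days)
model_values = np.sin(model_times)            # model output at the snapshots
obs_times = np.array([0.25, 1.3, 4.7, 9.9])   # measurement times
obs_values = np.sin(obs_times) + 0.05         # measurements (with a small bias)

# Interpolate the model onto the observation times, then pair up
model_at_obs = np.interp(obs_times, model_times, model_values)
pairs = np.column_stack([obs_values, model_at_obs])   # N model/data pairs
```

Each row of `pairs` is one observation matched with the interpolated model value at the same time, which is the input to all of the statistics below.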



Mean Observation:

Mean Observation is the mean value of the observed data.

\bar{O} = \frac{1}{N}\sum_{i=1}^{N} O_i \qquad (1)

where N is the number of data pairs between the model predictions and observations.

Mean Prediction:

Mean Prediction is the mean value of the model predicted values.

\bar{P} = \frac{1}{N}\sum_{i=1}^{N} P_i \qquad (2)

Standard Deviation Observed:

SDO = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(O_i - \bar{O}\right)^2} \qquad (3)

Standard Deviation Predicted:

SDP = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(P_i - \bar{P}\right)^2} \qquad (4)
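Equations (1)-(4) can be sketched as follows. The population form (division by N) is assumed here, since the text does not state EE's convention, and the O/P values are illustrative:

```python
import numpy as np

# Illustrative model/data pairs
O = np.array([1.0, 2.0, 3.0, 4.0])    # observations
P = np.array([1.1, 1.9, 3.2, 4.1])    # model predictions

O_bar = O.mean()                            # (1) mean observation
P_bar = P.mean()                            # (2) mean prediction
SDO = np.sqrt(np.mean((O - O_bar) ** 2))    # (3) standard deviation, observed
SDP = np.sqrt(np.mean((P - P_bar) ** 2))    # (4) standard deviation, predicted
```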

Mean Error (ME):

The Mean Error (ME), or Mean Bias Error (MBE), is the difference between the mean values of the predictions and the observations. ME lies in (–∞, +∞) and the closer it is to 0, the better the model performance.

ME = \bar{P} - \bar{O} = \frac{1}{N}\sum_{i=1}^{N}\left(P_i - O_i\right) \qquad (5)

Relative Mean Error (RME):

The Relative Mean Error (RME), or Percent Bias, is calculated as the Mean Error (ME) divided by the mean of the observed values. Its range is (–∞, +∞) and 0 indicates the best model performance. This indicator is not appropriate for quantities whose mean values are close to zero, for example, tidal water level.

RME = \frac{\bar{P} - \bar{O}}{\bar{O}} \times 100\% \qquad (6)

Mean Absolute Error (MAE):

The Mean Absolute Error (MAE) is calculated as the mean of the absolute errors between the model predictions and the observations. MAE values lie in [0, +∞) and the lower the value, the better the model performance.

MAE = \frac{1}{N}\sum_{i=1}^{N}\left|P_i - O_i\right| \qquad (7)
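The three bias-type measures in equations (5)-(7) can be sketched on the same illustrative sample (RME is computed as a fraction here; multiply by 100 for percent):

```python
import numpy as np

# Illustrative model/data pairs
O = np.array([1.0, 2.0, 3.0, 4.0])    # observations
P = np.array([1.1, 1.9, 3.2, 4.1])    # model predictions

ME = np.mean(P - O)             # (5) mean error (bias)
RME = ME / O.mean()             # (6) relative mean error, as a fraction
MAE = np.mean(np.abs(P - O))    # (7) mean absolute error
```

Note that ME can hide compensating errors (positive and negative residuals cancel), which is why MAE is reported alongside it.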

Root Mean Square Error (RMSE):

The Root Mean Squared Error (RMSE) reflects the standard deviation of differences (the dispersion) between simulated and observed values. RMSE values are in [0, +∞) and the smaller the RMSE, the better the model performance.

RMSE = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(P_i - O_i\right)^2} \qquad (8)

Relative Root Mean Square Error (RRMSE):

The Relative Root Mean Square Error (RRMSE) is the RMSE expressed as a percentage of the mean of the observations. Like RMSE, the RRMSE range is [0, +∞) and lower values indicate better model performance. This indicator is not appropriate for near-zero-mean constituents.

RRMSE = \frac{RMSE}{\bar{O}} \times 100\% \qquad (9)

Scaled Root Mean Square Error (SRMSE):

The Scaled Root Mean Square Error (SRMSE) is the RMSE normalized by the maximum range of the observed values. Similar to RRMSE, its values lie in [0, +∞), but it is applicable to near-zero-mean variables, such as sea surface elevation.

SRMSE = \frac{RMSE}{O_{\max} - O_{\min}} \qquad (10)
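Equations (8)-(10) differ only in the normalization applied to the same RMSE; a sketch (RRMSE as a fraction of the mean observation):

```python
import numpy as np

# Illustrative model/data pairs
O = np.array([1.0, 2.0, 3.0, 4.0])    # observations
P = np.array([1.1, 1.9, 3.2, 4.1])    # model predictions

RMSE = np.sqrt(np.mean((P - O) ** 2))   # (8) root mean square error
RRMSE = RMSE / O.mean()                 # (9) fraction of the mean observation
SRMSE = RMSE / (O.max() - O.min())      # (10) scaled by the observed range
```

Because the observed range stays well away from zero even for variables that oscillate about zero, SRMSE remains well defined where RRMSE does not.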

Centered Root Mean Square Error (CRMSE):

The Centered Root Mean Square Error (CRMSE) (Taylor, 2001) relates three statistics: the correlation coefficient between the model predictions and the observations (R), the standard deviation of the model predictions (SDP), and the standard deviation of the observations (SDO). This relationship allows the performance of different models to be compared visually using a Taylor diagram. The CRMSE varies in [0, +∞) and the smaller the CRMSE, the better the model performance.

CRMSE = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left[\left(P_i - \bar{P}\right) - \left(O_i - \bar{O}\right)\right]^2} = \sqrt{SDO^2 + SDP^2 - 2\,SDO\,SDP\,R} \qquad (11)
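A sketch of equation (11), including a check of the Taylor (2001) identity that links CRMSE to SDO, SDP, and R (population statistics assumed, illustrative data):

```python
import numpy as np

# Illustrative model/data pairs
O = np.array([1.0, 2.0, 3.0, 4.0])    # observations
P = np.array([1.1, 1.9, 3.2, 4.1])    # model predictions

dO = O - O.mean()                      # centered observations
dP = P - P.mean()                      # centered predictions
CRMSE = np.sqrt(np.mean((dP - dO) ** 2))   # (11) centered RMSE

# Taylor identity: CRMSE^2 = SDO^2 + SDP^2 - 2*SDO*SDP*R
SDO = np.sqrt(np.mean(dO ** 2))
SDP = np.sqrt(np.mean(dP ** 2))
R = np.mean(dO * dP) / (SDO * SDP)
assert np.isclose(CRMSE ** 2, SDO ** 2 + SDP ** 2 - 2 * SDO * SDP * R)
```

Subtracting the means removes any overall bias, so CRMSE measures only the mismatch in pattern and amplitude, which is exactly what the Taylor diagram displays.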

Correlation Coefficient (R):

The Correlation Coefficient (R) is a statistical measure of the linear relationship between two variables. Its values lie in [–1, +1], and an absolute value closer to +1 indicates a stronger correlation. However, good model performance requires not only a correlation coefficient close to +1 but also a regression-line slope close to +1 and an intercept close to 0.

R = \frac{\sum_{i=1}^{N}\left(O_i - \bar{O}\right)\left(P_i - \bar{P}\right)}{\sqrt{\sum_{i=1}^{N}\left(O_i - \bar{O}\right)^2}\,\sqrt{\sum_{i=1}^{N}\left(P_i - \bar{P}\right)^2}} \qquad (12)

Coefficient of Determination (R²):

The Coefficient of Determination (R²) is the square of the Correlation Coefficient and its interpretation is similar to that of the Correlation Coefficient.

R^2 = \left[\frac{\sum_{i=1}^{N}\left(O_i - \bar{O}\right)\left(P_i - \bar{P}\right)}{\sqrt{\sum_{i=1}^{N}\left(O_i - \bar{O}\right)^2}\,\sqrt{\sum_{i=1}^{N}\left(P_i - \bar{P}\right)^2}}\right]^2 \qquad (13)
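Equations (12)-(13) can be sketched directly from the centered sums, and cross-checked against NumPy's built-in `np.corrcoef` (illustrative data):

```python
import numpy as np

# Illustrative model/data pairs
O = np.array([1.0, 2.0, 3.0, 4.0])    # observations
P = np.array([1.1, 1.9, 3.2, 4.1])    # model predictions

dO = O - O.mean()
dP = P - P.mean()
R = np.sum(dO * dP) / np.sqrt(np.sum(dO ** 2) * np.sum(dP ** 2))  # (12)
R2 = R ** 2                                                       # (13)
```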

Nash-Sutcliffe Index of Efficiency (NSE):

The Nash-Sutcliffe coefficient of efficiency (NSE) (Nash & Sutcliffe, 1970) varies in (–∞, +1], with higher values indicating better agreement; NSE = 1 is the optimum value, while NSE ≤ 0 indicates unsatisfactory performance.

NSE = 1 - \frac{\sum_{i=1}^{N}\left(P_i - O_i\right)^2}{\sum_{i=1}^{N}\left(O_i - \bar{O}\right)^2} \qquad (14)
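Equation (14) compares the squared model error against the variance of the observations; NSE = 0 means the model is no better than predicting the observed mean. A sketch on illustrative data:

```python
import numpy as np

# Illustrative model/data pairs
O = np.array([1.0, 2.0, 3.0, 4.0])    # observations
P = np.array([1.1, 1.9, 3.2, 4.1])    # model predictions

# (14) Nash-Sutcliffe efficiency: 1 minus (error sum of squares /
# observed sum of squares about the observed mean)
NSE = 1.0 - np.sum((P - O) ** 2) / np.sum((O - O.mean()) ** 2)
```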

Coefficient of Efficiency (COE):

The Coefficient of Efficiency (COE) (Legates and McCabe, 1999) varies in (–∞, +1] and the higher the COE, the better the model performance.

COE = 1 - \frac{\sum_{i=1}^{N}\left|P_i - O_i\right|}{\sum_{i=1}^{N}\left|O_i - \bar{O}\right|} \qquad (15)

Index of Agreement (IOA):

The Index of Agreement (IOA) (Willmott et al., 1985) is similar to the COE, with values ranging from 0 (no agreement) to 1 (a perfect match).

IOA = 1 - \frac{\sum_{i=1}^{N}\left(P_i - O_i\right)^2}{\sum_{i=1}^{N}\left(\left|P_i - \bar{O}\right| + \left|O_i - \bar{O}\right|\right)^2} \qquad (16)
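Equations (15)-(16) can be sketched together; COE uses absolute deviations, while IOA normalizes the squared error by potential deviations about the observed mean (illustrative data):

```python
import numpy as np

# Illustrative model/data pairs
O = np.array([1.0, 2.0, 3.0, 4.0])    # observations
P = np.array([1.1, 1.9, 3.2, 4.1])    # model predictions

O_bar = O.mean()
# (15) coefficient of efficiency (absolute-deviation form)
COE = 1.0 - np.sum(np.abs(P - O)) / np.sum(np.abs(O - O_bar))
# (16) index of agreement
IOA = 1.0 - np.sum((P - O) ** 2) / np.sum(
    (np.abs(P - O_bar) + np.abs(O - O_bar)) ** 2)
```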

Kling-Gupta Efficiency (KGE):

The Kling-Gupta Efficiency (KGE) (Gupta, et al., 2009) is a model evaluation criterion that can be decomposed into the contribution of mean, variance, and correlation to model performance. KGE ranges in (–∞, +1]. Essentially, the closer to +1, the more accurate the model is.

KGE = 1 - \sqrt{\left(R - 1\right)^2 + \left(\alpha - 1\right)^2 + \left(\beta - 1\right)^2}, \quad \alpha = \frac{SDP}{SDO}, \quad \beta = \frac{\bar{P}}{\bar{O}} \qquad (17)
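Equation (17) makes the decomposition explicit: correlation (R), variability ratio (alpha = SDP/SDO), and bias ratio (beta = mean prediction / mean observation) each contribute separately to the score. A sketch on illustrative data:

```python
import numpy as np

# Illustrative model/data pairs
O = np.array([1.0, 2.0, 3.0, 4.0])    # observations
P = np.array([1.1, 1.9, 3.2, 4.1])    # model predictions

R = np.corrcoef(O, P)[0, 1]          # correlation component
alpha = P.std() / O.std()            # variability ratio (population std)
beta = P.mean() / O.mean()           # bias ratio
# (17) Kling-Gupta efficiency
KGE = 1.0 - np.sqrt((R - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)
```

A perfect model has R = alpha = beta = 1, giving KGE = 1; any departure in correlation, variability, or bias pulls the score down.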


References

Gupta, H. V., Kling, H., Yilmaz, K. K., & Martinez, G. F. (2009). Decomposition of the mean squared error and NSE performance criteria: Implications for improving hydrological modelling. Journal of Hydrology, 377(1), 80–91. https://doi.org/10.1016/j.jhydrol.2009.08.003

Legates, D. R., & McCabe, G. J. (1999). Evaluating the use of "goodness-of-fit" measures in hydrologic and hydroclimatic model validation. Water Resources Research, 35(1), 233–241. https://doi.org/10.1029/1998WR900018

Legates, D. R., & McCabe, G. J. (2013). A refined index of model performance: A rejoinder. International Journal of Climatology, 33(4), 1053–1056. https://doi.org/10.1002/joc.3487

Moriasi, D. N., Arnold, J. G., Van Liew, M. W., Bingner, R. L., Harmel, R. D., & Veith, T. L. (2007). Model Evaluation Guidelines for Systematic Quantification of Accuracy in Watershed Simulations. Transactions of the ASABE, 50(3), 885–900. https://doi.org/10.13031/2013.23153

Nash, J. E., & Sutcliffe, J. V. (1970). River flow forecasting through conceptual models part I: A discussion of principles. Journal of Hydrology, 10(3), 282–290. https://doi.org/10.1016/0022-1694(70)90255-6

Taylor, K. E. (2001). Summarizing multiple aspects of model performance in a single diagram. Journal of Geophysical Research, 106(D7), 7183–7192. https://doi.org/10.1029/2000JD900719

Willmott, C. J. (1982). Some Comments on the Evaluation of Model Performance. Bulletin of the American Meteorological Society, 63(11), 1309–1313. https://doi.org/10.1175/1520-0477(1982)063<1309:SCOTEO>2.0.CO;2

Willmott, C. J., Ackleson, S. G., Davis, R. E., Feddema, J. J., Klink, K. M., Legates, D. R., O'Donnell, J., & Rowe, C. M. (1985). Statistics for the evaluation and comparison of models. Journal of Geophysical Research, 90(C5), 8995–9005. https://doi.org/10.1029/JC090iC05p08995