Model verification or validation accompanies model calibration. The purpose of validation is to provide information by which a decision maker or model user can assess the probability that the model will predict a future event. The uncertainty in a prediction can be assessed only in a qualitative sense because, for any complex natural system, absolute predictive capability is impossible. We cannot be assured that a model will predict future events. Oreskes et al. (1994) claimed that when we validate a model we can only determine whether it is logically consistent, and that such consistency says nothing about the model's predictive abilities. In an absolute sense they are correct, but humans make judgements every day about the probability of future events occurring. Thus, the limited sense of validation taken by Oreskes et al. needs to be considered in model validation, along with the way such judgements are made.
Decision making has a strong psychological basis. The dynamics of such judgements have been studied in the context of risk assessment (see Kahneman, Slovic and Tversky, 1982). This work identified heuristics by which judgements are made under uncertainty. An important assessment technique is designated representativeness. Simply stated, the level of uncertainty in a prediction is assessed in terms of how representative a model system is of the real system. Thus, validation needs to consider both the model's representativeness in the sense defined by Kahneman et al. and the model's logical self-consistency as defined by Oreskes et al. (1994). Each of these aspects has several parts. Self-consistency can be judged in terms of the mathematics and the calibration of the model. Representativeness can be judged in terms of intuitive, observational and mechanistic factors (Fig. 66).
Self-consistency
A self-consistent model contains no errors in the mathematical expressions of the assumptions or in the constructs relating the assumptions to the output or response. A self-consistent model is one that follows logically from its assumptions. A test for self-consistency does not address whether the assumptions themselves are valid; that is addressed in terms of the model's representativeness. In principle, evaluating self-consistency is straightforward: the model must be mathematically correct. In practice, validating the mathematical consistency of a model can be difficult.
The first measure of self-consistency involves identifying (and eliminating) errors in expressing the model assumptions mathematically and in solving the model equations. Mathematical consistency may also be measured in terms of the asymptotic characteristics of the model. That is, a model that generates physically impossible results as the parameters asymptotically approach limits is less likely to be valid than a model that does not generate unrealistic results. For example, a smolt passage model that predicts survivals greater than 100% at short travel times is less valid than a model that, through the nature of its equations, generates survival predictions between zero and 100%. Although asymptotically unrealistic models may fit observations within some restricted parameter range, such models are of dubious utility if there is no underlying physical reason for restricting the range.
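The asymptotic check described above can be illustrated with a small sketch. It is not taken from CRiSP: the two survival functions, their coefficients and the list of travel times are hypothetical values chosen only to show how a form that is bounded by construction differs from one that is not.

```python
import math

def exponential_survival(r, t):
    """Survival fraction under a constant mortality rate r over time t."""
    return math.exp(-r * t)

def linear_survival(intercept, slope, t):
    """Hypothetical straight-line fit to survival; not bounded by its form."""
    return intercept - slope * t

def stays_in_unit_interval(model, times):
    """True if the model returns a value in [0, 1] at every time checked."""
    return all(0.0 <= model(t) <= 1.0 for t in times)

# Travel times (days) reaching well beyond any plausible calibration range.
times = [0.0, 0.5, 1.0, 5.0, 20.0, 100.0, 1000.0]

# Illustrative coefficients only; they are not CRiSP parameter values.
exp_ok = stays_in_unit_interval(lambda t: exponential_survival(0.05, t), times)
lin_ok = stays_in_unit_interval(lambda t: linear_survival(1.08, 0.02, t), times)

print("exponential form stays within 0-100%:", exp_ok)
print("linear form stays within 0-100%:", lin_ok)
```

The linear form may fit data well over a restricted range of travel times, yet it returns more than 100% survival at very short times and negative survival at long ones, which is the kind of asymptotic failure the text describes.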
The second measure of self-consistency involves quantifying how well the model fits the calibration data. In the calibration process, model parameters are selected through a goodness-of-fit criterion using a variety of statistical algorithms (see the Goodness-of-fit section, II.2.1). These statistics are quantitative measures of a model's self-consistency.
Model representativeness
The second step in model validation is to determine the representativeness of a model for the real system. Here we must consider the psychological process by which people evaluate uncertainty. Although there is no fixed set of measures there are three general categories to assess the uncertainty in judgements: intuition, ecological theory, and ability to fit observations. A validity assessment will consider all three.
Intuitive validation
A person will make an intuitive evaluation of the validity of a model's predictions. There is no quantitative measure for intuitive validations, and they are often influenced by simplified, qualitative and anecdotal statements of assumed "truth". Such statements are often made for public consumption and may be distorted or expressed as absolutes. In a public forum, models are often used to back up these simple "truth" statements. Addressing these statements is difficult at best. One approach is to give people hands-on experience with the models so that they realize the simple public statements are inaccurate.
Mechanism validation
The second way to validate a model's representativeness is to consider how representative the model mechanisms or assumptions are of what is known about the system. That is, we can evaluate how closely a model describes the biological and physical elements of the system. At one extreme lie models based on empirical equations in which the parameters have little biological or physical interpretation. At the other extreme are models rich in complexity and biological realism. Models fall along a continuum between empirical and mechanistic formulations. In essence, a model comprises a number of submodels that have mechanistic foundations, although the submodels themselves may be empirical.
For example, a number of survival models may be described along this continuum. At the empirical end, a model may express fish survival, S, in terms of the rate of mortality, r, and exposure time, t. The equation would be
$$ S = e^{-r t} \qquad (168) $$
It is mechanistic in expressing mortality as a rate per unit time. It is empirical in that the mortality rate, r, must be derived from observations of survival over time. No other factors are required or relevant.
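As a minimal sketch of how r might be derived from observations of survival over time, assume equation (168) takes the exponential form S = exp(-r t). Under that assumption, -ln(S) is proportional to t, and r can be estimated as the least-squares slope through the origin. The function name and the synthetic observations below are hypothetical and are not CRiSP data.

```python
import math

def estimate_mortality_rate(times, survivals):
    """Least-squares slope through the origin of -ln(S) against t.

    Assumes S = exp(-r * t), so that -ln(S) = r * t.
    """
    numerator = sum(t * -math.log(s) for t, s in zip(times, survivals))
    denominator = sum(t * t for t in times)
    return numerator / denominator

# Synthetic observations generated from a known rate, for illustration only.
true_r = 0.03                      # mortality rate per day (assumed)
times = [2.0, 5.0, 10.0, 20.0]     # exposure times in days (assumed)
survivals = [math.exp(-true_r * t) for t in times]

r_hat = estimate_mortality_rate(times, survivals)
print(f"estimated mortality rate: {r_hat:.4f} per day")
```

Because the synthetic data contain no noise, the estimate recovers the assumed rate; with field data the estimate would carry error that depends on the scatter of the observations.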
A more mechanistic model might express the rate of mortality in terms of another environmental variable such as temperature, T. This new relationship might be empirical and derived strictly through observations of how predator feeding rate changes with temperature. This leads to a submodel describing how the mortality rate changes with temperature. For example, we might have an exponentially increasing predation rate
$$ r = a\, e^{b T} \qquad (169) $$
where T is temperature and the coefficients a and b are empirical and have no biological meaning.
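The nesting of a temperature submodel inside the survival model can be sketched as follows. The sketch assumes the forms r = a exp(b T) for the predation rate and S = exp(-r t) for survival; the symbol T, the function names and the coefficient values are assumptions made for illustration and are not CRiSP calibrations.

```python
import math

def mortality_rate(temp, a, b):
    """Empirical submodel: mortality rate rising exponentially with temperature."""
    return a * math.exp(b * temp)

def survival(temp, t, a, b):
    """Survival over exposure time t at a given temperature."""
    return math.exp(-mortality_rate(temp, a, b) * t)

# Illustrative coefficients only; a has units of 1/day, b of 1/degree C.
A, B = 0.005, 0.1

cold = survival(temp=10.0, t=10.0, a=A, b=B)
warm = survival(temp=20.0, t=10.0, a=A, b=B)

print(f"survival after 10 days at 10 C: {cold:.3f}")
print(f"survival after 10 days at 20 C: {warm:.3f}")
```

The added mechanism replaces one calibrated parameter (r) with two (a and b), which is the trade-off between realism and the number of parameters discussed in the surrounding text.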
Additional mechanisms could be added by expressing exposure time in terms of factors that control fish travel time. Again the travel time submodel may be constructed by equations that are in part based on first principles and in part based on empirical fits to observations.
In moving from empirical to mechanistic models we gain validity by including more of the underlying processes that control a system. There is, however, a trade-off: as the model becomes more complex, the number of model parameters that must be calibrated grows and the potential range of model responses becomes greater.
At what level of complexity is model validity greatest? No simple answer exists; it depends on the data available to test the model and on the types of questions the model must address. In general, a model must contain mechanisms down to the level at which a system is managed and to the level at which data exist.
For example, a smolt passage model that does not explicitly formulate the impact of spill on fish survival is not mechanistically valid for addressing the impact of spill. Conversely, to validate the model's representativeness to observations it must be compared with data on the impacts of spill.
Observation validation
The third and most rigorous way to validate a model is to compare model or submodel predictions to data. Data for model validation are split into two parts: a calibration part, to which the model parameters are fit, and a validation part, which is compared to predictions. To compare a model to data we must choose a merit function that measures the agreement between the data and the predictions. Merit functions involving a single measure may have a classical statistical basis, such as a Student's t-test comparing predicted and observed means. Merit functions may also involve a number of measures or dimensions. In this case a model predicts a number of different but related measures of the system. For these multiple-dimension tests there may be no clear method for weighting the fit of each model dimension to the corresponding observations. This is particularly true when we have only limited observations for each measure.
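The split between calibration and validation data can be sketched as below. This is an illustration only: the observations are synthetic, the survival form S = exp(-r t) is assumed, and root-mean-square error is used as one possible single-measure merit function in place of the t-test mentioned in the text.

```python
import math

def fit_rate(times, survivals):
    """Calibrate r by least squares through the origin on -ln(S) = r * t."""
    numerator = sum(t * -math.log(s) for t, s in zip(times, survivals))
    denominator = sum(t * t for t in times)
    return numerator / denominator

def rmse(predicted, observed):
    """Merit function: root-mean-square error between prediction and data."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))

# Synthetic (time in days, survival fraction) pairs; not field observations.
data = [(2.0, 0.95), (4.0, 0.88), (6.0, 0.84), (8.0, 0.79),
        (10.0, 0.73), (12.0, 0.70), (14.0, 0.66), (16.0, 0.61)]

# Alternate points go to calibration and validation so both span the range.
calibration = data[0::2]
validation = data[1::2]

r_hat = fit_rate([t for t, _ in calibration], [s for _, s in calibration])
predictions = [math.exp(-r_hat * t) for t, _ in validation]
merit = rmse(predictions, [s for _, s in validation])

print(f"calibrated rate: {r_hat:.4f} per day")
print(f"validation RMSE: {merit:.4f}")
```

Only the validation half contributes to the merit function, so the reported error reflects predictions of data the calibration did not see.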
Comparing validity assessment
Answering the question "how valid is a model?" is difficult and depends on the context in which the model is used. Generally two approaches are possible. In a cardinal measure of validity, we express the validity of a model in terms of some number, typically a goodness-of-fit merit function. Such measures can be based on self-consistency and on representativeness of observations. They are of most use when model predictions explore the system within the parameter space of the calibrated model. Models can also make predictions beyond the observed states of the system, however, and in these cases validity established from observations is insufficient. Validity must then be established in terms of how representative the model mechanisms are of the real system. Such mechanism-based measures of validity are expressed on an ordinal scale, which is confined to ranking the validity of a number of models.
Approach in CRiSP
The approach to validating CRiSP1 has applied the principles outlined in the section above. The following sections detail the validation.
The mathematical validity of CRiSP was addressed through a team approach. All mathematical formulations were developed and checked by at least three people. Theoretical aspects were usually developed by one researcher and checked by another. Submodels were coded by the programmers, and the code was checked by the scientist who developed the theory. Submodel operation in CRiSP was checked by a third person.
An important aspect of model validity is the response of the model outside the range in which it is calibrated. Most equations generate realistic results as model parameters are adjusted to high or low values. For example, survival is confined between 0 and 100% under all ranges of model parameters. The temperature effect on the predation rate is the only parameter that is not confined by the form of the equation. In this case the input range of the temperature coefficient on predation is confined instead.
Model validity can be assessed in terms of how well the model fits the data used in the calibration. Calibrated submodels are referenced in Table 60.
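One way to confine an input range is to clamp the coefficient before it is used, as in the sketch below. This is not the CRiSP implementation: the bounds, the function names and the exponential rate form are assumptions chosen for illustration, and a real model might reject out-of-range input rather than clamp it.

```python
import math

# Hypothetical bounds on the temperature coefficient; not CRiSP values.
B_MIN, B_MAX = 0.0, 0.2

def confine(value, low, high):
    """Clamp value to the closed interval [low, high]."""
    return max(low, min(high, value))

def predation_rate(temp, a, b):
    """Rate a * exp(b * temp), with b confined to its allowed input range."""
    b = confine(b, B_MIN, B_MAX)
    return a * math.exp(b * temp)

def survival(temp, t, a, b):
    """Survival over exposure time t; bounded in [0, 1] for any rate >= 0."""
    return math.exp(-predation_rate(temp, a, b) * t)

# An out-of-range coefficient is treated as the nearest allowed value.
rate_unconfined_input = predation_rate(temp=20.0, a=0.005, b=5.0)
rate_at_upper_bound = predation_rate(temp=20.0, a=0.005, b=B_MAX)

print(rate_unconfined_input == rate_at_upper_bound)
```

Survival stays between 0 and 100% through the form of the equation, while the rate itself is kept realistic only by the limits placed on the coefficient.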
One measure of validity is how accurate a model is in terms of its mathematical description of the underlying ecological processes. The CRiSP model defines mechanisms for fish migration and dam passage, which are detailed in the Theory manual. The important mechanisms involve the dam passage routes and associated mortalities, gas bubble disease mortality, mortality from predation, effects of temperature on mortality and gas generation, and effects of fish age, flow and date of release on fish travel time. In CRiSP particular effort was given to developing a self-consistent smolt travel time model that applies a number of goodness-of-fit measures. The details of the model selection and fitting procedure are described in Zabel (1994). The work developed a series of nested models of increasing complexity and used three goodness-of-fit criteria to determine the balance between increasing model complexity and reliability of the predictions.
Potentially important mechanisms that are currently missing from smolt passage models include the effects of fish vitality on mortality, the effect of density dependence on fish growth, and the effect of estuarine conditions on fish survival.
The validation of the CRiSP model to observations involves comparing the individual submodels and the total model to independent data. Submodel validations are contained in the specific calibrations of the submodels developed previously in this chapter. Most submodel validations were assessed in terms of the goodness-of-fit used in the calibration. Total model validations are available by comparing model survivals to field studies in which survival was estimated through several reaches of the river and for several species.
Columbia River Salmon Passage Model CRiSP.1.5 Theory, Calibration & Validation Manual
Copyright © 1996, Columbia Basin Research. All rights reserved.
web@cbr.washington.edu