Input Data File

The input data file is a text file that contains the following items in the given order.

  1. The SURPH data file identifier
  2. A one-line data description
  3. The number of populations
  4. The number of periods, or intervals
  5. Group covariate definitions (if any)
  6. Individual covariate definitions (if any)
  7. Population names (optional)
  8. The number tagged in each population
  9. The Tag ID flag
  10. A capture history line for each individual

SURPH 1 and SURPH 2.0 Users: Use the stand-alone data conversion routine to convert an input data file from SURPH 1 format to SURPH 2.0 format.

SURPH Data File Identifier

The first non-blank line of the input data file must read "Surph2".

Data Description

The data description goes on the second line (after the "Surph2" keyword). This description will appear in all reports based on the data.
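
As a minimal sketch, the two header rules above could be checked in Python as follows. The function name read_description is hypothetical, and the sketch assumes that the identifier match is exact and case-sensitive and that the description is the line immediately after the identifier; the format description does not state either point explicitly.

```python
def read_description(text):
    """Validate the identifier line and return the data description."""
    lines = text.splitlines()
    # The identifier is the first non-blank line, so skip leading blank lines.
    start = next((i for i, line in enumerate(lines) if line.strip()), None)
    if start is None or lines[start].strip() != "Surph2":
        raise ValueError('first non-blank line must read "Surph2"')
    if start + 1 >= len(lines):
        raise ValueError("missing data description line")
    # Assumption: the description is the line directly after the identifier.
    return lines[start + 1].strip()
```

A file that starts with anything other than "Surph2" raises ValueError in this sketch, which is one way to catch an unconverted SURPH 1 file early.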

Number of Populations

The number of populations is indicated by the keyword npop or num_populations followed by an integer for the number of populations.

Number of Periods

The number of periods is indicated by the keyword nper or num_periods followed by an integer for the number of periods.
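
The two count definitions share the same shape, so a reader could handle both with one routine. The sketch below is illustrative only: the name parse_count_line is hypothetical, and it assumes the keyword and its integer sit on the same line separated by whitespace, which the description above implies but does not state.

```python
# Map each accepted keyword to the quantity it defines.
COUNT_KEYWORDS = {
    "npop": "populations",
    "num_populations": "populations",
    "nper": "periods",
    "num_periods": "periods",
}


def parse_count_line(line):
    """Return (quantity, value) for a population or period count line."""
    tokens = line.split()
    if len(tokens) != 2 or tokens[0] not in COUNT_KEYWORDS:
        raise ValueError(f"not a count definition: {line!r}")
    value = int(tokens[1])  # raises ValueError if the token is not an integer
    if value < 1:
        raise ValueError("count must be a positive integer")
    return COUNT_KEYWORDS[tokens[0]], value
```

For example, parse_count_line("npop 4") yields ("populations", 4), and the long form "num_periods 3" yields ("periods", 3).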

Group Covariate Definitions

If the data contain group covariates, they are defined after the number of periods definition. There is one definition for each group covariate, as follows:

  1. The keyword gcov or group_covariate.
  2. The name, or label, for the group covariate.
  3. Optionally, one of the following keywords. If neither keyword is specified, the covariate is assumed to be time invariant.
  4. The values of the covariate. If time variant, there must be one value for each population and each period; if time invariant, there must be one value for each population.
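
The value-count rule in item 4 can be written out as a small calculation. This is a sketch with a hypothetical function name; it takes the time-variant choice as a boolean rather than parsing the optional keyword from item 3.

```python
def expected_gcov_value_count(num_populations, num_periods, time_variant=False):
    """Number of values a group covariate definition must supply."""
    if num_populations < 1 or num_periods < 1:
        raise ValueError("populations and periods must be positive")
    if time_variant:
        # One value for each population and each period.
        return num_populations * num_periods
    # Time invariant (the default): one value for each population.
    return num_populations


# For a study with 4 populations and 3 periods:
print(expected_gcov_value_count(4, 3, time_variant=True))  # 12
print(expected_gcov_value_count(4, 3))                     # 4
```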

Individual Covariate Definitions

If the data contain individual covariates, there must be one definition for each individual covariate, as follows:

  1. The keyword icov or individual_covariate.
  2. The name, or label, for the individual covariate.

The value of each individual covariate for each individual is specified in the capture history line.

Population Names

Provide one label per population for use in reports. If omitted, the names default to the numbers 1 through n, where n is the number of populations.
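
The defaulting rule could be sketched as follows. The function name population_names is hypothetical, and representing the default names as strings is an assumption made for illustration.

```python
def population_names(num_populations, labels=None):
    """Return the report labels, defaulting to "1" through "n"."""
    if not labels:
        # No names given: number the populations 1 through n.
        return [str(i) for i in range(1, num_populations + 1)]
    if len(labels) != num_populations:
        raise ValueError("expected one label per population")
    return list(labels)
```

With four populations and no labels, this returns ["1", "2", "3", "4"].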

Number Tagged

The number tagged is specified by the keyword ntag or num_tagged, followed by a number for each population.

Tag ID Flag

The keyword tagID followed by the keyword absent indicates that there are no tag IDs in the data and that each capture history line begins with the capture history. If the tagID keyword is followed by the keyword present, the first field of the capture history line is the tag ID. If the tagID keyword is not used, tag IDs are assumed to be present.
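
The three cases (absent, present, keyword not used) reduce to a single boolean. The sketch below uses a hypothetical function name and assumes the tagID keyword and its value appear on one line separated by whitespace.

```python
def tag_ids_present(header_lines):
    """Return True if capture history lines start with a tag ID."""
    for line in header_lines:
        tokens = line.split()
        if tokens and tokens[0] == "tagID":
            if len(tokens) == 2 and tokens[1] in ("present", "absent"):
                return tokens[1] == "present"
            raise ValueError(f"malformed tagID line: {line!r}")
    # Keyword not used: tag IDs are assumed to be present.
    return True
```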

Capture History Line

The capture history lines are introduced by the keyword capthist or captureHistories, followed by one capture history line for each individual in each population. The individuals must be grouped by population. For example, if there are two populations specified as

num_tagged
250 300

then there must be 550 capture history lines, with the first 250 belonging to the first population, and the remaining 300 belonging to the second population.
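
The grouping rule can be illustrated with a short sketch that checks the total line count and then assigns lines to populations in order. The function name split_by_population is hypothetical.

```python
from itertools import accumulate


def split_by_population(history_lines, num_tagged):
    """Group capture history lines by population, in file order."""
    if len(history_lines) != sum(num_tagged):
        raise ValueError(
            f"expected {sum(num_tagged)} capture history lines, "
            f"found {len(history_lines)}"
        )
    # Running totals give the end index of each population's block.
    ends = list(accumulate(num_tagged))
    starts = [0] + ends[:-1]
    return [history_lines[s:e] for s, e in zip(starts, ends)]
```

Called with 550 lines and the counts [250, 300], this returns two lists holding lines 1 to 250 and lines 251 to 550 respectively.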

Each capture history line is structured as follows:

  1. The tag ID if the tagID flag is set to present. Otherwise, the capture history line begins with the capture history, below.
  2. The capture history.
  3. The individual covariate values. There must be one value for each individual covariate, and they must be specified in the order they are defined.
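
The field order above could be parsed along the following lines. This is a sketch under several assumptions that the format description does not confirm: fields are separated by whitespace, the capture history is a single token with no embedded spaces, and covariate values are numeric. The function name and the sample values in the usage note are hypothetical.

```python
def parse_history_line(line, covariate_names, tag_ids_present=True):
    """Split one capture history line into (tag_id, history, covariates)."""
    fields = line.split()
    expected = len(covariate_names) + (2 if tag_ids_present else 1)
    if len(fields) != expected:
        raise ValueError(f"expected {expected} fields, found {len(fields)}")
    # The tag ID comes first only when the tagID flag is set to present.
    tag_id = fields.pop(0) if tag_ids_present else None
    history = fields.pop(0)
    # Remaining fields are covariate values, in the order they were defined.
    covariates = dict(zip(covariate_names, (float(v) for v in fields)))
    return tag_id, history, covariates
```

For instance, with covariates named length and weight, the made-up line "A001 101 52.5 1.2" would parse to the tag ID "A001", the history "101", and the two covariate values keyed by name.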

Example SURPH input data file

Below is an example of a SURPH input data file for a study with 4 populations, 3 periods, 3 group covariates, and 2 individual covariates.


[Figure: InputDataFile.gif, example SURPH input data file]
