CBM Standard Import tool format

See Chapter 3 of the Operational-Scale CBM-CFS3-Manual for a detailed description of this format.

Note that the SIT format does not assign names to column headers. Instead, it interprets the meaning of each column from its position, so any column labels on DataFrames passed to the SIT parse functions are ignored.

Simulating SIT input

class libcbm.input.sit.sit_cbm_factory.EventSort(value)

Controls the order in which SIT events are evaluated within a given timestep.

default_disturbance_type_id = 2

Evaluate SIT events by timestep, and then by the default disturbance type id defined in the cbm_defaults database (CBM3 default)

disturbance_type = 1

Evaluate SIT events by timestep, and then by the disturbance types in sit_disturbance_types (default)

natural_order = 3

Evaluate SIT events sorted first by timestep, and then by order of appearance of events within the sit_events table
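The three orderings can be illustrated with a small standalone sketch. It does not use libcbm: the event dictionaries, their keys, and the two lookup tables are hypothetical stand-ins for the parsed sit_events table, the disturbance types in sit_disturbance_types, and the default disturbance type ids from cbm_defaults.

```python
from enum import Enum


class EventSort(Enum):
    # Values mirror the enumeration documented above.
    disturbance_type = 1
    default_disturbance_type_id = 2
    natural_order = 3


def sort_events(events, event_sort, sit_type_order, default_type_ids):
    """Return events in the order implied by event_sort (illustrative only)."""
    if event_sort is EventSort.natural_order:
        # sorted() is stable, so ties keep their order of appearance.
        return sorted(events, key=lambda e: e["time_step"])
    if event_sort is EventSort.disturbance_type:
        lookup = sit_type_order
    else:
        lookup = default_type_ids
    return sorted(
        events, key=lambda e: (e["time_step"], lookup[e["disturbance_type"]])
    )


# Hypothetical lookups: SIT disturbance type order and default type ids.
sit_type_order = {"clearcut": 1, "fire": 2}
default_type_ids = {"fire": 1, "clearcut": 4}
events = [
    {"time_step": 2, "disturbance_type": "fire"},
    {"time_step": 1, "disturbance_type": "fire"},
    {"time_step": 1, "disturbance_type": "clearcut"},
]

by_sit = sort_events(
    events, EventSort.disturbance_type, sit_type_order, default_type_ids
)
by_default = sort_events(
    events, EventSort.default_disturbance_type_id, sit_type_order, default_type_ids
)
by_natural = sort_events(
    events, EventSort.natural_order, sit_type_order, default_type_ids
)
```

With this data, by_sit places the timestep 1 clearcut before the timestep 1 fire, while by_default and by_natural keep the fire first.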

libcbm.input.sit.sit_cbm_factory.create_sit_rule_based_processor(sit: SIT, cbm: CBM, random_func: Callable[[int], Series] | None = None, reset_parameters: bool = True, sit_events: DataFrame | None = None, sit_eligibilities: DataFrame | None = None, sit_transition_rules: DataFrame | None = None, event_sort: EventSort = EventSort.disturbance_type) SITRuleBasedProcessor

Initializes a class for processing SIT rule-based disturbances.

Parameters:
  • sit (SIT) – sit instance

  • cbm (CBM) – initialized instance of the CBM model

  • random_func (Callable optional) – A function of a single integer that returns a numeric 1d array whose length is the integer argument. Defaults to a numpy implementation.

  • reset_parameters (bool) – if set to true, cbm_vars.parameters.disturbance_type and cbm_vars.parameters.reset_age will be reset prior to computing new disturbances and transition rules.

  • sit_events (pandas.DataFrame, optional) – if specified, the returned rule-based processor is based on the specified sit_events input. The value will be parsed and validated (sit_classifiers, sit_disturbance_type etc.) based on the values in the specified sit object.

  • sit_eligibilities (pandas.DataFrame, optional) – SIT formatted disturbance eligibilities. Cannot be specified without also specifying sit_events using the disturbance-eligibility formatting. Defaults to None.

  • sit_transition_rules (pandas.DataFrame, optional) – if specified, the returned rule-based processor is based on the specified sit_transition_rules input. The value will be parsed and validated (sit_classifiers, sit_disturbance_type etc.) based on the values in the specified sit object. Note that if the sit_events parameter is set but this parameter is not, the transition rules (if any) attached to the specified sit object will be used by default. If null transition rules are required with non-null sit_events, set this parameter to a DataFrame with zero rows, for example pandas.DataFrame(). Defaults to None.

  • event_sort (EventSort) – one of the EventSort values, which determines the order in which the supplied sit_events are applied within a given timestep.

Raises:

ValueError – cannot specify sit_eligibilities with no specified sit_events

Returns:

an object for processing SIT rule-based disturbances

Return type:

SITRuleBasedProcessor

libcbm.input.sit.sit_cbm_factory.initialize_cbm(sit: SIT, dll_path=None, parameters_factory: Callable[[], dict] = None) Iterator[CBM]
Create an initialized instance of libcbm.model.cbm.cbm_model.CBM based on SIT input.

Parameters:
  • sit (SIT) – instance of SIT object

  • dll_path (str, optional) – path to the libcbm compiled library, if not specified a default value is used.

  • parameters_factory (func, optional) – a parameterless function that returns parameters for the cbm model. If unspecified the sit default is used. Defaults to None.

Returns:

an initialized CBM instance

Return type:

libcbm.model.cbm.cbm_model.CBM

libcbm.input.sit.sit_cbm_factory.load_sit(config_path: str, db_path: str | None = None, db_locale_code: str = 'en-CA') SIT

Loads data and objects required to run from the SIT format.

Parameters:
  • config_path (str) – path to SIT configuration

  • db_path (str, optional) – path to a cbm_defaults database. If None, the default database is used. Defaults to None.

  • db_locale_code (str, optional) – locale code for the specified db. Defaults to “en-CA”

Returns:

instance of standard import tool object

Return type:

SIT

Reading SIT input

libcbm.input.sit.sit_reader.load_table(config: dict, config_dir: str) DataFrame

Load a table based on the specified configuration. The config_dir is used to compute absolute paths for file based tables.

Supports the following formats:

Excel or CSV:

Uses pandas.read_csv or pandas.read_excel to load and return a pandas.DataFrame. With the exception of the “path” parameter, all parameters are passed as literal keyword args to the pandas.read_csv, or pandas.read_excel function.

Examples:

{"type": "csv",
 "params": {"path": "my_file.csv", "sep": "\t"}}

{"type": "excel",
 "params": {"path": "my_file.xls", "header": null}}
Parameters:
  • config (dict) – configuration specifying a source of data

  • config_dir (str) – directory containing the configuration

Raises:

NotImplementedError – the name specified for “type” was not a supported data source.

Returns:

the loaded data

Return type:

pandas.DataFrame
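The way a table configuration is split into a reader, a path and pass-through keyword arguments can be sketched as follows. This is an illustration under assumptions rather than libcbm's implementation: resolve_table_config is a hypothetical helper, and it only returns the name of the pandas function that would be called so that the sketch runs without pandas installed.

```python
import os


def resolve_table_config(config, config_dir):
    """Split a table config into (reader name, absolute path, reader kwargs)."""
    readers = {"csv": "read_csv", "excel": "read_excel"}
    table_type = config["type"]
    if table_type not in readers:
        raise NotImplementedError(f"unsupported table type: {table_type!r}")
    params = dict(config["params"])  # copy so the caller's config is untouched
    # "path" is resolved against config_dir; everything else is passed through.
    path = os.path.abspath(os.path.join(config_dir, params.pop("path")))
    return readers[table_type], path, params


reader, path, kwargs = resolve_table_config(
    {"type": "csv", "params": {"path": "my_file.csv", "sep": "\t"}},
    config_dir="project",
)
```

Here kwargs holds only the "sep" entry, which is what would be forwarded as a keyword argument to the reader.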

libcbm.input.sit.sit_reader.parse(sit_classifiers: DataFrame, sit_disturbance_types: DataFrame, sit_age_classes: DataFrame, sit_inventory: DataFrame | Iterable[DataFrame], sit_yield: DataFrame, sit_events: DataFrame | None = None, sit_transitions: DataFrame | None = None, sit_eligibilities: DataFrame | None = None, sit_parse_options: SITParseOptions | None = None) SITData

Parses and validates CBM Standard import tool formatted data including the complicated interdependencies in the SIT format. Returns an object containing the validated result.

The returned object has the following properties:

  • classifiers: a pandas.DataFrame of classifiers in the sit_classifiers input

  • classifier_values: a pandas.DataFrame of the classifier values in the sit_classifiers input

  • classifier_aggregates: a dictionary of the classifier aggregates in the sit_classifiers input

  • disturbance_types: a pandas.DataFrame based on the disturbance types in the sit_disturbance_types input

  • age_classes: a pandas.DataFrame of the age classes based on sit_age_classes

  • inventory: a pandas.DataFrame of the inventory based on sit_inventory

  • yield_table: a pandas.DataFrame of the merchantable volume yield curves in the sit_yield input

  • disturbance_events: a pandas.DataFrame of the disturbance events based on sit_events. If the sit_events parameter is None this field is None.

  • transition_rules: a pandas.DataFrame of the transition rules based on sit_transitions. If the sit_transitions parameter is None this field is None.

  • eligibilities: a pandas.DataFrame of the event or transition eligibilities based on sit_eligibilities. If the sit_events parameter is None this field is None.

Parameters:
  • sit_classifiers (pandas.DataFrame) – SIT formatted classifiers

  • sit_disturbance_types (pandas.DataFrame) – SIT formatted disturbance types

  • sit_age_classes (pandas.DataFrame) – SIT formatted age classes

  • sit_inventory (pandas.DataFrame) – SIT formatted inventory

  • sit_yield (pandas.DataFrame) – SIT formatted yield curves

  • sit_events (pandas.DataFrame, optional) – SIT formatted disturbance events

  • sit_transitions (pandas.DataFrame, optional) – SIT formatted transition rules. Defaults to None.

  • sit_eligibilities (pandas.DataFrame, optional) – SIT formatted disturbance eligibilities. Defaults to None.

Returns:

an object containing parsed and validated SIT dataset

Return type:

SITData

SIT Mapping

class libcbm.input.sit.sit_mapping.SITMapping(config: dict, sit_cbm_defaults: SITCBMDefaults)
get_default_disturbance_type_id(disturbance_type: Series) Series

Returns a series of default disturbance type ids based on the specified series of SIT disturbance type names and SIT mapping

Parameters:

disturbance_type (pandas.Series) – A series of disturbance types

Raises:
  • KeyError – disturbance type mapped more than one time in SIT mapping

  • KeyError – mapped default disturbance type not found in default data

  • KeyError – sit disturbance type code not mapped to default type

Returns:

a series of disturbance type ids as defined in CBM default data

Return type:

pandas.Series

get_land_class_id(land_class: Series) Series

Produces a validated series of land class ids.

Parameters:

land_class (pandas.Series) – a series of string land class codes or integer land class ids. If strings are specified the id associated with the name is used, and if ids are specified they are validated and a copy of the input is returned.

Raises:
  • ValueError – at least one of the specified land class codes is not associated with a defined land class id

  • ValueError – at least one of the specified land class ids is not a defined land class id

Returns:

The series of land class ids

Return type:

pandas.Series
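The two accepted input forms (string codes or integer ids) can be sketched in plain Python. This is a simplified illustration, not the SITMapping implementation: it works on lists rather than pandas.Series, and the land_classes dictionary of id to code is a hypothetical stand-in for the land class definitions in the default data.

```python
def get_land_class_id(land_class, land_classes):
    """Map land class codes to ids, or validate ids (illustrative sketch).

    land_classes is a hypothetical {id: code} dictionary.
    """
    ids_by_code = {code: id_ for id_, code in land_classes.items()}
    if all(isinstance(v, str) for v in land_class):
        undefined = sorted(set(land_class) - set(ids_by_code))
        if undefined:
            raise ValueError(f"undefined land class codes: {undefined}")
        return [ids_by_code[code] for code in land_class]
    undefined = sorted(set(land_class) - set(land_classes), key=str)
    if undefined:
        raise ValueError(f"undefined land class ids: {undefined}")
    return list(land_class)  # a copy of the validated input


land_classes = {0: "lc_1", 1: "lc_2"}
from_codes = get_land_class_id(["lc_2", "lc_1", "lc_2"], land_classes)
from_ids = get_land_class_id([0, 1], land_classes)
```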

get_sit_disturbance_type_id(disturbance_type: Series) Series

Gets disturbance type ids based on the specified series of disturbance types, and the SIT mapping. Used to encode any of:

  • historical disturbance type column in sit inventory

  • last pass disturbance type column in sit inventory

  • disturbance type column in sit disturbance events

  • disturbance type column in sit transition rules

Parameters:

disturbance_type (pandas.Series) – A series of disturbance types

Returns:

a series of disturbance type ids

Return type:

pandas.Series

get_spatial_unit(inventory: DataFrame, classifiers: DataFrame, classifier_values: DataFrame) Series

Get a pandas.Series containing spatial unit ids based on SIT Inventory, SIT classifiers and the mapping configuration.

Parameters:
Raises:
  • KeyError – the configuration resulted in an undefined spatial unit id.

  • ValueError – the mapping mode in SIT configuration is not valid

Returns:

the spatial unit ids for the inventory. The number of values in the series is the same as the number of rows in the specified inventory.

Return type:

pandas.Series

get_species(species: Series, classifiers: DataFrame, classifier_values: DataFrame) Series

Get a series of CBM species ids based on the specified species classifier values series and the SIT import tool classifier configuration and mapping.

Parameters:
Raises:
  • ValueError – a species classifier is mapped more than one time in mapping configuration

  • KeyError – species mapped to an undefined default species name

  • KeyError – a classifier value is not mapped to a default value

  • ValueError – a classifier value was not defined in the classifier/classifier value metadata

  • ValueError – the mapped species is not present in the defined classifiers

Returns:

a series of integer species ids. The length of the series matches the length of the species parameter.

Return type:

pandas.Series

The SIT Format

libcbm.input.sit.sit_parser.get_parse_bool_func(table_name: str, colname: str) Callable[[Any], bool]

Gets a function that parses boolean-like values to boolean according to the SIT specification. The parameters are used to create a friendly error message when a parse failure occurs.

Parameters:
  • table_name (str) – Table name to be used in failure error message

  • colname (str) – Column name to be used in failure error message

Returns:

a function that parses a boolean-like value to bool

Return type:

func

libcbm.input.sit.sit_parser.substitute_using_age_class_rows(rows: DataFrame, parse_bool_func: Callable[[Any], bool], age_classes: DataFrame) DataFrame

Substitute age class criteria values that appear in SIT transition rules or disturbance events data into age values.

Checks that min softwood age equals min hardwood age and max softwood age equals max hardwood age since CBM does not carry separate HW/SW ages.

Parameters:
  • rows (pandas.DataFrame) –

    sit data containing columns that describe age eligibility:

    • using_age_class

    • min_softwood_age

    • min_hardwood_age

    • max_softwood_age

    • max_hardwood_age

  • parse_bool_func (func) – a function that maps boolean-like values to boolean. Passed to the pandas.Series.map function for the using_age_class column.

  • age_classes (pandas.DataFrame) – the age classes table that defines the age class identifiers referenced in the age eligibility columns of rows

Raises:
  • ValueError – values found in the age eligibility columns are not defined identifiers in the specified age classes table.

  • ValueError – hardwood and softwood age criteria were not identical.

Returns:

the input table with age value criteria substituted for age class criteria.

Return type:

pandas.DataFrame
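A minimal sketch of the substitution, using lists of dictionaries in place of DataFrames. It assumes that a minimum age criterion maps to the start year of the referenced age class and a maximum age criterion maps to its end year; that mapping, the function name and the shape of the age_classes dictionary are assumptions made for illustration.

```python
def substitute_age_classes(rows, parse_bool, age_classes):
    """Replace age class ids with ages where using_age_class is true."""
    result = []
    for row in rows:
        row = dict(row)  # work on a copy so the input is not modified
        if row["min_softwood_age"] != row["min_hardwood_age"] or (
            row["max_softwood_age"] != row["max_hardwood_age"]
        ):
            raise ValueError("hardwood and softwood age criteria differ")
        if parse_bool(row["using_age_class"]):
            for bound, year in (("min", "start_year"), ("max", "end_year")):
                age_class_id = row[f"{bound}_softwood_age"]
                if age_class_id not in age_classes:
                    raise ValueError(f"undefined age class: {age_class_id!r}")
                age = age_classes[age_class_id][year]
                row[f"{bound}_softwood_age"] = age
                row[f"{bound}_hardwood_age"] = age
        result.append(row)
    return result


age_classes = {
    "age_1": {"start_year": 1, "end_year": 10},
    "age_2": {"start_year": 11, "end_year": 20},
}
rows = [
    {"using_age_class": True,
     "min_softwood_age": "age_1", "min_hardwood_age": "age_1",
     "max_softwood_age": "age_2", "max_hardwood_age": "age_2"},
    {"using_age_class": False,
     "min_softwood_age": 5, "min_hardwood_age": 5,
     "max_softwood_age": 50, "max_hardwood_age": 50},
]
substituted = substitute_age_classes(rows, bool, age_classes)
```

The first row ends up with a minimum age of 1 and a maximum age of 20, while the second row, which does not use age classes, is returned unchanged.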

libcbm.input.sit.sit_parser.unpack_column(table: DataFrame, column_description: dict, table_name: str) DataFrame

Validates a column in a pandas DataFrame

Parameters:
  • table (pandas.DataFrame) – A table containing the column at the index specified by column_description

  • column_description (dict) –

    A dictionary with the following supported keys:

    • name: the name of the column, which is assigned to the column label in the result table returned by this function

    • index: the zero-based index of the column in the table’s ordered columns

    • type: (optional) the column will be converted (if necessary) to this type. If the conversion is not possible for a value at any row, an error is raised.

    • min_value: (optional) inclusive minimum value constraint.

    • max_value: (optional) inclusive maximum value constraint.

  • table_name (str) – the name of the table being processed, purely for error feedback when an error occurs.

Raises:
  • ValueError – the values in the column were not convertible to the specified column description type

  • ValueError – a min_value or max_value was specified without specifying type in column description

  • ValueError – the min_value or max_value constraint was violated by the value in the column.

Returns:

the resulting table

Return type:

pandas.DataFrame
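The column description keys can be illustrated with a simplified, pandas-free sketch that works on a list of row tuples. It is not libcbm's implementation; it only mirrors the documented behaviour of the name, index, type, min_value and max_value keys.

```python
def unpack_column(rows, column_description, table_name):
    """Extract and validate one positional column from a list of row tuples."""
    name = column_description["name"]
    values = [row[column_description["index"]] for row in rows]
    has_bounds = (
        "min_value" in column_description or "max_value" in column_description
    )
    if "type" not in column_description:
        if has_bounds:
            raise ValueError(f"{table_name}.{name}: min/max requires a type")
        return name, values
    try:
        values = [column_description["type"](v) for v in values]
    except (TypeError, ValueError) as err:
        raise ValueError(f"{table_name}.{name}: cannot convert values") from err
    low = column_description.get("min_value")
    high = column_description.get("max_value")
    for v in values:
        # Both constraints are inclusive, as documented above.
        if (low is not None and v < low) or (high is not None and v > high):
            raise ValueError(f"{table_name}.{name}: value {v} out of range")
    return name, values


rows = [("a", "10", 1.5), ("b", "20", 2.5)]
name, ages = unpack_column(
    rows, {"name": "age", "index": 1, "type": int, "min_value": 0}, "inventory"
)
```

The column is selected by position only; whatever label the input carried plays no part in the result.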

libcbm.input.sit.sit_parser.unpack_table(table: DataFrame, column_descriptions: list[dict], table_name: str) DataFrame

Validates and assigns column names to a column-ordered table using the specified list of column descriptions. Any existing column labels on the specified table are ignored.

Parameters:
  • table (pandas.DataFrame) – a column ordered table to validate

  • column_descriptions (list) – a list of dictionaries describing the columns. See unpack_column() for how this is used

  • table_name (str) – the name of the table being processed, purely for error feedback when an error occurs.

Raises:

ValueError – a duplicate column name was detected

Returns:

a type-validated table with columns replaced with the contents of column_descriptions.

Return type:

pandas.DataFrame

libcbm.input.sit.sit_format.adjust_classifier_names(classifier_names: Series) Series

Make each of the classifier names in the specified series a valid Python identifier

Parameters:

classifier_names (pd.Series) – the unadjusted classifier names from the SIT format.

Returns:

adjusted series of classifier names

Return type:

pd.Series
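One possible way to turn arbitrary labels into valid Python identifiers is sketched below. The exact rules libcbm applies are not stated here, so the replacement rules in adjust_name (a hypothetical helper) are assumptions chosen for illustration.

```python
import re


def adjust_name(name):
    """Turn an arbitrary label into a valid Python identifier (one rule set)."""
    adjusted = re.sub(r"\W", "_", name.strip())  # non-word characters -> "_"
    if not adjusted or adjusted[0].isdigit():
        adjusted = "_" + adjusted  # identifiers cannot start with a digit
    return adjusted


adjusted = [adjust_name(n) for n in ["Leading Species", "site-index", "2nd tier"]]
```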

libcbm.input.sit.sit_format.get_age_class_format() list[dict]

Gets a list of dictionaries describing the CBM SIT age class columns

Returns:

a list of dictionaries that describe the CBM SIT age class columns

Return type:

list

libcbm.input.sit.sit_format.get_age_eligibility_columns(base_index: int) list[dict]

Gets the columns for age eligibility which appear in SIT events and SIT transition rules. The index of the columns is offset using the specified base.

Parameters:

base_index (int) – the index of the first age eligibility column

Returns:

a list of dictionaries describing the SIT age eligibility columns

Return type:

list

libcbm.input.sit.sit_format.get_classifier_format(n_columns: int) list[dict]

Gets a list of dictionaries describing the CBM SIT classifier columns

Parameters:

n_columns (int) – the number of columns in an sit classifiers formatted table

Raises:

ValueError – raised if the number of columns is less than the minimum required.

Returns:

a list of dictionaries describing the CBM SIT classifier columns

Return type:

list

libcbm.input.sit.sit_format.get_disturbance_eligibility_columns(index: int) list[dict]

Gets the columns for disturbance eligibility which appear in SIT events. The index of the columns is offset using the specified index.

Parameters:

index (int) – the index of the first eligibility column

Returns:

a list of dictionaries describing the SIT disturbance eligibility columns

Return type:

list

libcbm.input.sit.sit_format.get_disturbance_event_format(classifier_names: list[str], n_columns: int, include_eligibility_columns: bool = True, has_disturbance_event_ids: bool = False) list[dict]

Gets a list of column description dictionaries describing the SIT disturbance event format

Parameters:
  • classifier_names (list) – a list of the names of classifiers

  • n_columns (int) – the number of columns in disturbance data. This is required because the format has a varying number of optional columns.

  • include_eligibility_columns (bool, optional) – if set to false the standard age eligibility and carbon eligibility columns are excluded from the result, and an eligibility_id column instead is included.

Raises:

ValueError – specified number of columns is invalid

Returns:

a list of dictionaries describing the SIT disturbance events table columns

Return type:

list

libcbm.input.sit.sit_format.get_disturbance_type_format(n_columns: int) list[dict]

Gets a list of dictionaries describing the CBM SIT disturbance type columns

Parameters:

n_columns (int) – The number of columns in a SIT disturbance types formatted table.

Raises:
  • ValueError – n_columns is less than the minimum required number of columns for the SIT disturbance type format.

  • ValueError – n_columns is more than the required number of columns for the SIT disturbance type format.

Returns:

a list of dictionaries that describe the CBM SIT disturbance type columns

Return type:

list

libcbm.input.sit.sit_format.get_inventory_format(classifier_names: list[str], n_columns: int, has_inventory_ids: bool) list[dict]

Gets a description of the SIT inventory columns as a list of dictionaries

Parameters:
  • classifier_names (list) – a list of the names of classifiers

  • n_columns (int) – the number of columns in inventory data. This is required because the format has a varying number of optional columns.

  • has_inventory_ids (bool) – if true, the table is expected to have a leading column defining the inventory id for each row for simulation area tracking purposes

Raises:

ValueError – The number of columns was incorrect

Returns:

a list of dictionaries describing the SIT inventory columns

Return type:

list

libcbm.input.sit.sit_format.get_tr_classifier_set_postfix() str

Since transition rules contain 2 classifier sets (2 sets of columns), duplicate names are a problem if the classifier names are used for both. This function returns a postfix to append to the second set of classifier names to solve that issue.
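The effect of the postfix can be sketched as follows. The helper name and the "_tr" value are placeholders chosen for illustration; the actual postfix is whatever get_tr_classifier_set_postfix() returns.

```python
def transition_rule_classifier_columns(classifier_names, postfix):
    """Build unique labels for the two classifier sets of a transition rule."""
    source_set = list(classifier_names)
    target_set = [f"{name}{postfix}" for name in classifier_names]
    return source_set + target_set


# "_tr" is a placeholder value, not necessarily the postfix libcbm uses.
columns = transition_rule_classifier_columns(["classifier1", "classifier2"], "_tr")
```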

libcbm.input.sit.sit_format.get_transition_rules_format(classifier_names: list[str], n_columns: int, separate_eligibilities: bool = False) list[dict]

Generate a list of dictionaries describing each column in the SIT format transition rules. The format is dynamic and changes based on the number of classifiers and whether or not a spatial identifier is specified.

Parameters:
  • classifier_names (list) – a list of the names of classifiers

  • n_columns (int) – the number of columns in transition rules data. This is used to detect whether or not a spatial identifier is included in the data.

  • separate_eligibilities (bool, optional) – if set to true, the transition rule format contains an eligibility id column rather than the CBM-CFS3 age and disturbance type eligibility columns.

Raises:

ValueError – n_columns was not valid for the sit transitions format

Returns:

a list of dictionaries describing the SIT transition rule columns

Return type:

list

libcbm.input.sit.sit_format.get_yield_format(classifier_names: list[str], n_columns: int) list[dict]

Gets a list of dictionaries describing the CBM SIT yield table columns

Parameters:
  • classifier_names (list) – a list of strings which are the names of the classifiers

  • n_columns (int) – The number of columns in a SIT yield formatted table.

Raises:

ValueError – the specified number of columns is less than the minimum number of columns for a valid SIT yield formatted table

Returns:

a list of dictionaries that describe the CBM SIT yield table columns

Return type:

list

SIT Configuration

libcbm.input.sit.sit_cbm_config.get_classifiers(classifiers: DataFrame, classifier_values: DataFrame) dict[str, list]

Create classifier input for initializing the CBM class based on CBM Standard import tool formatted data.

Parameters:
Returns:

configuration dictionary for CBM. See: libcbm.model.cbm.cbm_config.classifier_config()

Return type:

dict

libcbm.input.sit.sit_cbm_config.get_merch_volumes(yield_table: DataFrame, classifiers: DataFrame, classifier_values: DataFrame, age_classes: DataFrame, sit_mapping: SITMapping) list

Create merchantable volume input for initializing the CBM class based on CBM Standard import tool formatted data.

Parameters:
Returns:

configuration for CBM. See: libcbm.model.cbm.cbm_config

Return type:

list

class libcbm.input.sit.sit_cbm_defaults.SITCBMDefaults(sit_data: SITData, db_path: str, locale_code: str = 'en-CA')

Classifiers

libcbm.input.sit.sit_classifier_parser.get_classifier_keyword() str

Gets the _CLASSIFIER keyword used in the SIT_Classifiers format.

Returns:

_CLASSIFIER

Return type:

str

libcbm.input.sit.sit_classifier_parser.get_wildcard_keyword() str

Gets the classifier value wildcard keyword of the SIT format

libcbm.input.sit.sit_classifier_parser.parse(classifiers_table: DataFrame) Tuple[DataFrame, list[str], DataFrame, DataFrame]

Parse SIT_Classifiers formatted data.

Parameters:

classifiers_table (pandas.DataFrame) – a DataFrame in SIT classifiers format.

Raises:

ValueError – duplicated names detected, or other validation error occurred

Example Input:

0  1            2            3    4
1  _CLASSIFIER  classifier1  NaN  NaN
1  a            a            NaN  NaN
1  b            b            NaN  NaN
1  agg1         agg1         a    b
1  agg2         agg2         a    b
2  _CLASSIFIER  classifier2  NaN  NaN
2  a            a            NaN  NaN
2  agg1         agg1         a    NaN

Output based on Example input:

Classifiers:

id  name
1   classifier1
2   classifier2

Classifier Values:

classifier_id  name  description
1              a     a
1              b     b
2              a     a

Classifier Aggregates:

[{'classifier_id': 1,
  'name': 'agg1',
  'description': 'agg1',
  'classifier_values': ['a', 'b']},
 {'classifier_id': 1,
  'name': 'agg2',
  'description': 'agg2',
  'classifier_values': ['a', 'b']},
 {'classifier_id': 2,
  'name': 'agg1',
  'description': 'agg1',
  'classifier_values': ['a']}]
Returns:

  • classifiers - a validated table of classifiers. Classifier names may be adjusted so they are valid Python identifiers. This entails actions such as replacing spaces with underscore “_”. For the list of original, unadjusted classifier names, see the 2nd item in the returned tuple.

  • original_classifier_labels - the labels as they appear in the SIT input data.

  • classifier_values - a validated table of classifier values

  • aggregate_values - a dictionary describing aggregate values

Return type:

tuple
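The row layout in the example input can be read as follows: a row whose second column holds the _CLASSIFIER keyword names a classifier, a row with no entries after the third column defines a classifier value, and a row with further entries defines an aggregate of the listed values. The sketch below applies that reading to the example rows using plain tuples, with None standing in for NaN. It is an interpretation of the example, not libcbm's parser, and it omits name adjustment and validation.

```python
CLASSIFIER_KEYWORD = "_CLASSIFIER"


def parse_classifier_rows(rows):
    """Split SIT classifier rows into classifiers, values, and aggregates."""
    classifiers, values, aggregates = [], [], []
    for classifier_id, name, description, *members in rows:
        members = [m for m in members if m is not None]  # drop empty cells
        if name == CLASSIFIER_KEYWORD:
            classifiers.append({"id": classifier_id, "name": description})
        elif members:
            aggregates.append({
                "classifier_id": classifier_id,
                "name": name,
                "description": description,
                "classifier_values": members,
            })
        else:
            values.append({
                "classifier_id": classifier_id,
                "name": name,
                "description": description,
            })
    return classifiers, values, aggregates


rows = [
    (1, "_CLASSIFIER", "classifier1", None, None),
    (1, "a", "a", None, None),
    (1, "b", "b", None, None),
    (1, "agg1", "agg1", "a", "b"),
    (1, "agg2", "agg2", "a", "b"),
    (2, "_CLASSIFIER", "classifier2", None, None),
    (2, "a", "a", None, None),
    (2, "agg1", "agg1", "a", None),
]
classifiers, values, aggregates = parse_classifier_rows(rows)
```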

Age Classes

libcbm.input.sit.sit_age_class_parser.generate_sit_age_classes(age_interval: int, max_age: int) DataFrame

Generate a valid SIT_AgeClasses input table. This is a helper method to create input for parse_age_classes()

Parameters:
  • age_interval (int) – the number of years between age classes

  • max_age (int) – the maximum age

Returns:

a table of valid SIT_AgeClasses based on the parameters

Return type:

pandas.DataFrame

Examples

>>> generate_sit_age_classes(2, 10)
   0  1
0  0  0
1  1  2
2  2  2
3  3  2
4  4  2
5  5  2
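The shape of that output (a zero-size first class followed by equally sized classes) can be reproduced with a short plain-Python sketch. It assumes max_age is a multiple of age_interval; how libcbm treats other cases is not covered here, and the function name is hypothetical.

```python
def generate_age_class_rows(age_interval, max_age):
    """Return (id, class size) rows: a zero-size first class, then equal ones."""
    # Assumption: max_age is an exact multiple of age_interval.
    n_classes = max_age // age_interval
    return [(0, 0)] + [(i, age_interval) for i in range(1, n_classes + 1)]


age_class_rows = generate_age_class_rows(2, 10)
```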
libcbm.input.sit.sit_age_class_parser.parse(age_class_table: DataFrame) DataFrame

Parse the sit age class table format into a table of age classes with fields:

  • name

  • class_size

  • start_year

  • end_year

Parameters:

age_class_table (pandas.DataFrame) – a dataframe

Raises:
  • ValueError – the first, and only the first row must have a 0 value

  • ValueError – duplicate values in the first column of the specified table were detected

Example

Input:

0      1
age_0  0
age_1  10
age_2  10
age_3  10
age_4  10
age_5  10
age_6  10
age_7  10
age_8  10
age_9  10

Output:

name   class_size  start_year  end_year
age_0  0           0           0
age_1  10          1           10
age_2  10          11          20
age_3  10          21          30
age_4  10          31          40
age_5  10          41          50
age_6  10          51          60
age_7  10          61          70
age_8  10          71          80
age_9  10          81          90

Returns:

a dataframe describing the age classes.

Return type:

pandas.DataFrame
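The start_year and end_year values in the example follow from a running total of the class sizes. A minimal sketch of that arithmetic, using tuples instead of a DataFrame, is shown here; the function name is hypothetical and the code is inferred from the example rather than taken from libcbm.

```python
def parse_age_class_rows(rows):
    """Compute start and end years for (name, class_size) age class rows."""
    names = [name for name, _ in rows]
    if len(set(names)) != len(names):
        raise ValueError("duplicate age class names detected")
    sizes = [size for _, size in rows]
    if not sizes or sizes[0] != 0 or any(size == 0 for size in sizes[1:]):
        raise ValueError("the first, and only the first, row must have a 0 size")
    parsed, previous_end = [], 0
    for index, (name, size) in enumerate(rows):
        # The zero-size first class covers age 0 only; later classes start
        # one year after the previous class ends.
        start = 0 if index == 0 else previous_end + 1
        end = previous_end + size
        parsed.append({"name": name, "class_size": size,
                       "start_year": start, "end_year": end})
        previous_end = end
    return parsed


age_classes = parse_age_class_rows(
    [("age_0", 0)] + [(f"age_{i}", 10) for i in range(1, 10)]
)
```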

Disturbance Types

libcbm.input.sit.sit_disturbance_type_parser.parse(disturbance_types_table: DataFrame) DataFrame

Parse and validate a SIT formatted disturbance type table

Parameters:

disturbance_types_table (pandas.DataFrame) – a table in SIT disturbance type format

Example

Input:

0        1
distid1  fire
distid2  clearcut
distid3  clearcut

Output:

id       name
distid1  fire
distid2  clearcut
distid3  clearcut

Raises:

ValueError – duplicate ids detected in disturbance data.

Returns:

a validated copy of the input table with standardized column names

Return type:

pandas.DataFrame

Inventory

libcbm.input.sit.sit_inventory_parser.expand_age_class_inventory(inventory: DataFrame, age_classes: DataFrame) DataFrame

Support for the SIT age class inventory feature. For rows with inventory.using_age_class = True, the inventory.age column represents an identifier defined in the passed age_classes table. The inventory record is divided into one record per year in the associated age class with the full range of ages.

Parameters:
Raises:
  • ValueError – Undefined age class ids found in inventory

  • ValueError – Age class inventory mixed with spatial identifier

Returns:

the age class expanded inventory

Return type:

pandas.DataFrame
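The expansion of a single age class record can be sketched in plain Python. The sketch assumes the record's area is divided equally among the expanded records, which matches an area of 1 becoming ten records of 0.1 for a ten-year class; the function name and the dictionary-based age_classes lookup are hypothetical.

```python
def expand_age_class_record(record, age_classes):
    """Split one age-class inventory record into one record per year of age."""
    if record["age"] not in age_classes:
        raise ValueError(f"undefined age class id: {record['age']!r}")
    age_class = age_classes[record["age"]]
    ages = range(age_class["start_year"], age_class["end_year"] + 1)
    # Assumption: the area is divided equally among the expanded records.
    area = record["area"] / len(ages)
    return [
        {**record, "using_age_class": False, "age": age, "area": area}
        for age in ages
    ]


age_classes = {"age_2": {"start_year": 11, "end_year": 20}}
record = {"classifier1": "b", "classifier2": "a", "using_age_class": True,
          "age": "age_2", "area": 1.0}
expanded = expand_age_class_record(record, age_classes)
```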

libcbm.input.sit.sit_inventory_parser.parse(inventory_table: DataFrame, classifiers: DataFrame, classifier_values: DataFrame, disturbance_types: DataFrame, age_classes: DataFrame, has_inventory_ids: bool = False) DataFrame

Parses and validates SIT formatted inventory data. The inventory_table parameter is the primary data, and the other args act as validation metadata.

Parameters:
  • inventory_table (pandas.DataFrame) – SIT formatted inventory

  • classifiers (pandas.DataFrame) – table of classifiers as returned by the function: libcbm.input.sit.sit_classifier_parser.parse()

  • classifier_values (pandas.DataFrame) – table of classifier values as returned by the function: libcbm.input.sit.sit_classifier_parser.parse()

  • disturbance_types (pandas.DataFrame) – table of disturbance types as returned by the function: libcbm.input.sit.sit_disturbance_type_parser.parse()

  • age_classes (pandas.DataFrame) – table of age classes as returned by the function: libcbm.input.sit.sit_age_class_parser.parse()

  • has_inventory_ids (bool, optional) – if set to true, in addition to the usually-formatted sit inventory input, a single column representing “inventory_id” is expected to be in the first column position. Note this option is not compatible when the sit inventory “using_age_class” option is activated, and if these are combined a ValueError is raised.

Raises:
  • ValueError – Undefined classifier values detected in inventory table

  • ValueError – Undefined disturbance types detected in inventory table

  • ValueError – has_inventory_ids combined with using_age_class option

Example

Input:

SIT_Inventory:

0  1  2      3      4  5  6  7      8      9
b  a  True   age_2  1  1  1  dist1  dist2  -1
a  a  False  100    1  0  0  dist2  dist1  0
a  a  -1     4      1  0  0  dist1  dist1  -1

classifiers parameter:

id  name
1   classifier1
2   classifier2

classifier_values parameter:

classifier_id  name  description
1              a     a
1              b     b
2              a     a

disturbance_types parameter:

id     name
dist1  fire
dist2  clearcut
dist3  clearcut

age_classes parameter:

name   class_size  start_year  end_year
age_0  0           0           0
age_1  10          1           10
age_2  10          11          20
age_3  10          21          30
age_4  10          31          40
age_5  10          41          50
age_6  10          51          60
age_7  10          61          70
age_8  10          71          80
age_9  10          81          90

land_classes parameter:

land_classes = {0: "lc_1", 1: "lc_2"}

Output: (abbreviated column names)

c1  c2  age  area  delay  lc    hist_dist  last_dist  s_ref
a   a   100  1.0   0      lc_1  fire       fire       0
a   a   4    1.0   0      lc_1  clearcut   clearcut   -1
b   a   11   0.1   1      lc_2  fire       fire       -1
b   a   12   0.1   1      lc_2  fire       fire       -1
b   a   13   0.1   1      lc_2  fire       fire       -1
b   a   14   0.1   1      lc_2  fire       fire       -1
b   a   15   0.1   1      lc_2  fire       fire       -1
b   a   16   0.1   1      lc_2  fire       fire       -1
b   a   17   0.1   1      lc_2  fire       fire       -1
b   a   18   0.1   1      lc_2  fire       fire       -1
b   a   19   0.1   1      lc_2  fire       fire       -1
b   a   20   0.1   1      lc_2  fire       fire       -1

The actual output column names for this example are:

  • classifier1

  • classifier2

  • age

  • area

  • delay

  • land_class

  • historical_disturbance_type

  • last_pass_disturbance_type

  • spatial_reference

Returns:

validated inventory

Return type:

pandas.DataFrame

Growth and Yield

libcbm.input.sit.sit_yield_parser.parse(yield_table: DataFrame, classifiers: DataFrame, classifier_values: DataFrame, age_classes: DataFrame) DataFrame

Parses and validates the CBM SIT growth and yield format.

Parameters:
Raises:
  • ValueError – the specified data did not have the correct number of columns according to the defined classifiers and age classes

  • ValueError – the leading_species column contained a value that was not defined in the specified species map.

  • ValueError – Classifier sets were not valid according to the specified classifiers and classifier_values.

Returns:

Validated SIT input with standardized column names and substituted species

Return type:

pandas.DataFrame

Disturbance Events

libcbm.input.sit.sit_disturbance_event_parser.get_sort_types() dict[int, str]

Gets the CBM standard import tool sorting id/name pairs as a dictionary

libcbm.input.sit.sit_disturbance_event_parser.get_target_types() dict[str, str]

Gets the CBM standard import tool target type id/name pairs as a dictionary

libcbm.input.sit.sit_disturbance_event_parser.parse(disturbance_events: DataFrame, classifiers: DataFrame, classifier_values: DataFrame, classifier_aggregates: DataFrame, disturbance_types: DataFrame, age_classes: DataFrame | None = None, separate_eligibilities: bool = False, has_disturbance_event_ids: bool = False) DataFrame

Parses and validates the CBM SIT disturbance event format, or optionally an extended SIT disturbance event format in which disturbance eligibilities are separate from sit_events and joined by foreign key.

Parameters:
  • disturbance_events (pandas.DataFrame) – CBM SIT disturbance events formatted data.

  • classifiers (pandas.DataFrame) – used to validate the classifier set columns of the disturbance event data. Use the return value of: libcbm.input.sit.sit_classifier_parser.parse()

  • classifier_values (pandas.DataFrame) – used to validate the classifier set columns of the disturbance event data. Use the return value of: libcbm.input.sit.sit_classifier_parser.parse()

  • classifier_aggregates (pandas.DataFrame) – used to validate the classifier set columns of the disturbance event data. Use the return value of: libcbm.input.sit.sit_classifier_parser.parse()

  • disturbance_types (pandas.DataFrame) – used to validate the disturbance_type column of the disturbance event data. Use the return value of: libcbm.input.sit.sit_disturbance_type_parser.parse()

  • age_classes (pandas.DataFrame, optional) – used to validate and compute age eligibility criteria in disturbance_events. Use the return value of: libcbm.input.sit.sit_age_class_parser.parse().

  • separate_eligibilities (bool, optional) – indicates, when true, that disturbance event eligibilities are stored in a separate table, and the sit_event format is simplified. When false, the sit_event eligibility columns are as documented in CBM-CFS3.

  • has_disturbance_event_ids (bool, optional) – if set to true, the first column is expected to represent a disturbance event id. This value is used to track last disturbance event id in the model state, and can be used to filter subsequent disturbance events or transition rules to chain specific events.

Raises:
  • ValueError – undefined classifier values were found in the disturbance event classifier sets

  • ValueError – undefined disturbance types were found in the disturbance event disturbance_type column

  • ValueError – undefined sort types were found in the disturbance event sort_type column. See get_sort_types()

  • ValueError – undefined target types were found in the disturbance event target_type column. See get_target_types()

Returns:

the validated disturbance events

Return type:

pandas.DataFrame

Transition Rules

libcbm.input.sit.sit_transition_rule_parser.parse(transition_rules: DataFrame, classifiers: DataFrame, classifier_values: DataFrame, classifier_aggregates: DataFrame, disturbance_types: DataFrame, age_classes: DataFrame, separate_eligibilities: bool = False) DataFrame

Parses and validates the CBM SIT transition rule format.

Parameters:
  • transition_rules (pandas.DataFrame) – CBM SIT transition rule formatted data.

  • classifiers (pandas.DataFrame) – used to validate the classifier set columns of the transition rule data. Use the return value of: libcbm.input.sit.sit_classifier_parser.parse()

  • classifier_values (pandas.DataFrame) – used to validate the classifier set columns of the transition rule data. Use the return value of: libcbm.input.sit.sit_classifier_parser.parse()

  • classifier_aggregates (pandas.DataFrame) – used to validate the classifier set columns of the transition rule data. Use the return value of: libcbm.input.sit.sit_classifier_parser.parse()

  • disturbance_types (pandas.DataFrame) – used to validate the disturbance_type column of the transition rule data. Use the return value of: libcbm.input.sit.sit_disturbance_type_parser.parse()

  • age_classes (pandas.DataFrame) – used to validate the number of volume columns. Use the return value of: libcbm.input.sit.sit_age_class_parser.parse()

  • separate_eligibilities (bool, optional) – if set to true, the transition rule format contains an eligibility id column rather than the CBM-CFS3 age and disturbance type eligibility columns.

Raises:
  • ValueError – undefined classifier values were found in the transition rule classifier sets

  • ValueError – a grouped set of transition rules has a percent greater than 100%.

  • ValueError – undefined disturbance types were found in the transition rule disturbance_type column

Returns:

validated transition rules

Return type:

pandas.DataFrame