Module trase.tools.sei_pcs.definition
The "definition" of an SEI-PCS model. It gives a quick overview of the datasets the model requires, the "topology" of the model (the auxiliary datasets, nodes, constraints, etc.), how their columns relate to one another, and how the model results are exported to CSV.
Functions
def determine_load_order(datasets: Dict[str, Dataset])
def e(header: str, flow_attribute: Optional[str] = None)
Construct an export definition for a column.
This object holds the information used by the functions that construct the results file and the ingest metadata.
Args
header- the name of the resulting header in the CSV file. This should follow the Trase standard conventions, for example COUNTRY_OF_ORIGIN.
flow_attribute- the name of the column in the flow. This must correspond to a column in the flow definition. You can refer to linked columns using dot syntax, for example "country.trase_id".
def load_definition_from_module(module: module) ‑> Definition
def reload_definition_at_path(path_to_definition_py: str) ‑> Definition
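determine_load_order is listed here without a description. Since columns can link to other datasets, one plausible reading is that a dataset has to be loaded after every dataset it links to. The sketch below illustrates that idea only; it is not the library's implementation. Column and Dataset are simplified stand-ins, load_order_sketch is a hypothetical name, and splitting the link on "." is an assumption about the delimiter.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import Dict, List, Optional


@dataclass
class Column:  # simplified stand-in, not the real class
    name: str
    link: Optional[str] = None


@dataclass
class Dataset:  # simplified stand-in, not the real class
    columns: List[Column]


def load_order_sketch(datasets: Dict[str, Dataset]) -> List[str]:
    """Return dataset names so that link targets come before their dependants."""
    # Map each dataset to the set of datasets its columns link to.
    graph = {
        name: {c.link.split(".", 1)[0] for c in dataset.columns if c.link}
        for name, dataset in datasets.items()
    }
    # static_order() yields every node after all of its predecessors.
    return list(TopologicalSorter(graph).static_order())


datasets = {
    "asset": Dataset([Column("state", link="state.code")]),
    "state": Dataset([Column("name"), Column("code")]),
}
order = load_order_sketch(datasets)
print(order)
```

With these two datasets, "state" is placed before "asset" because "asset" links to it. A circular chain of links would make TopologicalSorter raise graphlib.CycleError.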
Classes
class Column (name: str, type: Type = builtins.str, key: bool = False, link: str = None, value: Optional[Any] = None, conserve: bool = False, validate: Optional[Validation] = None, only_validate_link: bool = False, non_negative: bool = None)
Define a column of a dataset.
Args
name- the name of the column as it appears in the file.
type- one of int, float, str, bool, List[int], etc.
key- indicates that the column should be considered to be part of the "primary key" of the dataset; in particular, that the values (among all key columns) should be unique.
link- of the form "target_dataset.target_column", indicating that this column should be left-joined onto the "target_column" column in "target_dataset".
value- a default value that the column should be populated with.
conserve- whether this column should conserve its total sum throughout the model.
validate- a class from the trase.tools.sei_pcs.validation module which performs column-level validation. For example, validate=Code(6) will check that every value in the column is a six-digit code.
only_validate_link- by default, if you link a target dataset, all columns of that dataset are added as part of the merge. For large datasets this can significantly increase memory usage. By setting only_validate_link=True, only the target column will be added.
For example, suppose that we have this definition:
    datasets = {
        "state": Dataset([
            Column("name"),
            Column("code"),
        ]),
        "asset": Dataset([
            Column("state", link="state.code"),
        ]),
    }
Then, the "asset" dataset will have the following columns:
    state.name
    state.code
If, however, we pass only_validate_link=True to the link:
    datasets = {
        # ...
        "asset": Dataset([
            Column("state", link="state.code", only_validate_link=True),
        ]),
    }
then the "asset" dataset will have only one column, the target of the link:
    state.code
However, the usual link validation will still occur.
non_negative- add a validation that values are not negative. This defaults to true for numeric types and false otherwise.
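The effect of link and only_validate_link on a dataset's columns can be sketched in plain Python. This is a minimal illustration, not the library's implementation: apply_link is a hypothetical helper, rows are plain dictionaries, the state names and codes are made up, and raising ValueError for an unmatched value is an assumption about what "link validation" means.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def apply_link(
    rows: List[Row],
    column: str,
    target_name: str,
    target_rows: List[Row],
    target_column: str,
    only_validate_link: bool = False,
) -> List[Row]:
    """Hypothetical helper: join `rows` onto `target_rows` through `column`."""
    # Index the target dataset by the linked column.
    lookup = {target[target_column]: target for target in target_rows}
    merged = []
    for row in rows:
        value = row[column]
        # Assumed validation rule: every value must exist in the target.
        if value not in lookup:
            raise ValueError(f"{value!r} not found in {target_name}.{target_column}")
        target = lookup[value]
        # Bring over every target column, or only the linked one.
        kept = [target_column] if only_validate_link else list(target)
        new_row = {k: v for k, v in row.items() if k != column}
        new_row.update({f"{target_name}.{k}": target[k] for k in kept})
        merged.append(new_row)
    return merged


states = [{"name": "PARA", "code": "15"}, {"name": "BAHIA", "code": "29"}]
assets = [{"state": "15"}]

full = apply_link(assets, "state", "state", states, "code")
slim = apply_link(assets, "state", "state", states, "code", only_validate_link=True)
print(full)
print(slim)
```

In this sketch the validation step runs in both modes, which mirrors the statement that the usual link validation still occurs when only_validate_link=True.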
Class variables
var LINK_DELIMITER
var conserve : bool
var key : bool
var link : str
var name : str
var non_negative : bool
var only_validate_link : bool
var type : Type
var validate : Optional[Validation]
var value : Optional[Any]
Instance variables
var is_present_in_file
Methods
def all_validators(self) ‑> List[Validation]
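all_validators is listed without a description. A plausible reading is that it gathers the validator passed as validate together with the implicit non-negativity check. The sketch below shows one way that could work, under stated assumptions: Code is a stand-in for the validator of the same name, validators are modelled as plain callables returning a bool, and collect_validators and non_negative_check are hypothetical names. Only the default for non_negative (true for numeric types, false otherwise) comes from the documentation above.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Optional

Validator = Callable[[Any], bool]


@dataclass
class Code:
    """Stand-in validator: accepts strings of exactly `length` ASCII digits."""

    length: int

    def __call__(self, value: Any) -> bool:
        return (
            isinstance(value, str)
            and len(value) == self.length
            and value.isascii()
            and value.isdigit()
        )


def non_negative_check(value: Any) -> bool:
    """Hypothetical implicit validator for numeric columns."""
    return value >= 0


def collect_validators(
    type_: type,
    validate: Optional[Validator] = None,
    non_negative: Optional[bool] = None,
) -> List[Validator]:
    # Documented default: on for numeric types, off otherwise.
    if non_negative is None:
        non_negative = type_ in (int, float)
    validators: List[Validator] = []
    if validate is not None:
        validators.append(validate)
    if non_negative:
        validators.append(non_negative_check)
    return validators


code_validators = collect_validators(str, validate=Code(6))
volume_validators = collect_validators(float)
print(len(code_validators), len(volume_validators))
```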
class c (name: str, type: Type = builtins.str, key: bool = False, link: str = None, value: Optional[Any] = None, conserve: bool = False, validate: Optional[Validation] = None, only_validate_link: bool = False, non_negative: bool = None)
Define a column of a dataset. The signature and arguments are the same as those of Column.
class Dataset (columns: List[Column])
Class variables
var columns : List[Column]
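The key argument of Column says that the values across all key columns of a dataset should be unique. A minimal sketch of such a check follows. It illustrates the rule only and is not how the library performs it: find_duplicate_keys is a hypothetical helper, rows are plain dictionaries, and the sample data is made up.

```python
from collections import Counter
from typing import Any, Dict, List, Tuple


def find_duplicate_keys(
    rows: List[Dict[str, Any]], key_columns: List[str]
) -> List[Tuple[Any, ...]]:
    """Return each combination of key values that occurs more than once."""
    # The primary key of a row is the tuple of its values in the key columns.
    counts = Counter(tuple(row[c] for c in key_columns) for row in rows)
    return [key for key, n in counts.items() if n > 1]


rows = [
    {"year": 2020, "state": "15", "volume": 10.0},
    {"year": 2020, "state": "29", "volume": 4.0},
    {"year": 2021, "state": "15", "volume": 7.5},
    {"year": 2020, "state": "15", "volume": 1.0},  # repeats the first key
]
duplicates = find_duplicate_keys(rows, ["year", "state"])
print(duplicates)
```

Note that uniqueness is judged on the combination of key columns: the year 2020 and the state "15" each appear several times, but only the pair (2020, "15") is reported.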
class Definition (description: str, commodity_equivalence_group_name: str = '', years: List[int] = <factory>, version: str = '1', country: str = 'unknown_country', commodity: str = 'unknown_commodity', datasets: Dict[str, Dataset] = <factory>, constraints: Dict[str, Dataset] = <factory>, flows: List[Column] = <factory>, flows_export: List[Export] = <factory>)
Class variables
var commodity : str
var commodity_equivalence_group_name : str
var constraints : Dict[str, Dataset]
var country : str
var datasets : Dict[str, Dataset]
var description : str
var flows : List[Column]
var flows_export : List[Export]
var version : str
var years : List[int]
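To show how these pieces fit together, here is a sketch of what a small definition might look like. The classes are simplified stand-ins that copy a subset of the documented signatures so that the example runs without the trase package. The empty list and dict factories are assumptions (the documentation only shows <factory>), and every value, such as the country, commodity and headers, is made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Type


@dataclass
class Column:  # stand-in with a subset of the documented arguments
    name: str
    type: Type = str
    key: bool = False
    link: Optional[str] = None
    conserve: bool = False


@dataclass
class Dataset:  # stand-in
    columns: List[Column]


@dataclass
class Export:  # stand-in
    header: str
    flow_attribute: str


@dataclass
class Definition:  # stand-in; the default factories are assumed
    description: str
    years: List[int] = field(default_factory=list)
    country: str = "unknown_country"
    commodity: str = "unknown_commodity"
    datasets: Dict[str, Dataset] = field(default_factory=dict)
    flows: List[Column] = field(default_factory=list)
    flows_export: List[Export] = field(default_factory=list)


definition = Definition(
    description="Illustrative model",
    years=[2020],
    country="BRAZIL",
    commodity="SOY",
    datasets={
        "state": Dataset([Column("name"), Column("code", key=True)]),
        "asset": Dataset([Column("state", link="state.code")]),
    },
    flows=[
        Column("state", link="state.code"),
        Column("volume", type=float, conserve=True),
    ],
    flows_export=[
        Export("STATE", "state.name"),
        Export("VOLUME", "volume"),
    ],
)
print([column.name for column in definition.flows if column.conserve])
```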
class Export (header: str, flow_attribute: str)
Construct an export definition for a column.
This object holds the information used by the functions that construct the results file and the ingest metadata.
Args
header- the name of the resulting header in the CSV file. This should follow the Trase standard conventions, for example COUNTRY_OF_ORIGIN.
flow_attribute- the name of the column in the flow. This must correspond to a column in the flow definition. You can refer to linked columns using dot syntax, for example "country.trase_id".
Class variables
var flow_attribute : str
var header : str
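The mapping from flow attributes to CSV headers can be sketched as follows. This is an illustration under assumptions rather than the library's export code: Export is a stand-in dataclass with the documented fields, export_row is a hypothetical helper, and a flow is modelled as a flat dictionary whose linked columns are stored under dotted names such as "country.trase_id".

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class Export:  # stand-in with the documented fields
    header: str
    flow_attribute: str


def export_row(exports: List[Export], flow: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: build one CSV row, keyed by header, from a flow."""
    # A flow_attribute missing from the flow raises KeyError, which reflects
    # the requirement that it corresponds to a column of the flow definition.
    return {export.header: flow[export.flow_attribute] for export in exports}


flows_export = [
    Export("COUNTRY_OF_ORIGIN", "country.trase_id"),
    Export("VOLUME", "volume"),
]
flow = {"country.trase_id": "BR", "country.name": "BRAZIL", "volume": 12.5}
row = export_row(flows_export, flow)
print(row)
```

Columns of the flow that no Export refers to, such as "country.name" here, are simply left out of the row.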