Module trase.tools.etl.utilities

Functions

def consolidate(df: pandas.core.frame.DataFrame, numerical_columns: Iterable[str], categorical_columns: Optional[Iterable[str]] = None, reset_index=True)

Group a dataframe by one or more categorical columns and sum one or more numerical columns.

Example

import pandas as pd

from trase.tools.etl.utilities import consolidate

df = pd.DataFrame({
    "product": ["eggs", "eggs", "bacon"],
    "value":   [  1.99,  1.99,    5.99],
})
consolidate(df, ["value"], ["product"])
# product   value
#   bacon    5.99
#    eggs    3.98
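
The behaviour described above can be approximated with a plain pandas group-by. The following is a minimal sketch, not the library's implementation: `consolidate_sketch` is a hypothetical name, and how it treats `categorical_columns=None` (group by every column that is not summed) and `reset_index` (return the group keys as ordinary columns) are assumptions inferred from the signature.

```python
from typing import Iterable, Optional

import pandas as pd


def consolidate_sketch(
    df: pd.DataFrame,
    numerical_columns: Iterable[str],
    categorical_columns: Optional[Iterable[str]] = None,
    reset_index: bool = True,
) -> pd.DataFrame:
    """Hypothetical stand-in for consolidate(); behaviour is assumed."""
    numerical_columns = list(numerical_columns)

    # Assumption: with no categorical columns given, group by every
    # column that is not being summed.
    if categorical_columns is None:
        categorical_columns = [
            column for column in df.columns if column not in numerical_columns
        ]
    else:
        categorical_columns = list(categorical_columns)

    # With nothing to group by, collapse the frame to a single row of totals.
    if not categorical_columns:
        return df[numerical_columns].sum().to_frame().T

    # Group on the categorical columns and sum only the numerical ones.
    grouped = df.groupby(categorical_columns, sort=True)[numerical_columns].sum()

    # Assumption: reset_index=True turns the group keys back into columns.
    return grouped.reset_index() if reset_index else grouped


df = pd.DataFrame({
    "product": ["eggs", "eggs", "bacon"],
    "value": [1.99, 1.99, 5.99],
})
result = consolidate_sketch(df, ["value"], ["product"])
print(result)
```

In this sketch the two "eggs" rows collapse into one row with a value of 3.98, and "bacon" keeps its single value of 5.99. Note that pandas drops rows whose group key is missing by default, which may or may not match what `consolidate` does.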

def dataset_path(data_directory, name) -> str
def drop_rows_missing_values(df, *columns)
def hash_iterable(iterable) -> str
def hash_string(string) -> str
def timing(msg='', threshold=5, indent=0)
def validate(series, validation_function)

Classes

class Loggable

Instance variables

var logger

Methods

def debug(self, *args, **kwargs)
def info(self, *args, **kwargs)
def warning(self, *args, **kwargs)