Module trase.tools.aws.aws_helpers_cached

Functions

def get_pandas_df_once(key, bucket='trase-storage', version_id=None, client=None, track=True, print_version_id=False, **kwargs) ‑> pandas.core.frame.DataFrame

Load a CSV file on S3 into a Pandas dataframe.

The file is downloaded only once for a given version of its content: thereafter it is read from a local cache managed by the joblib library. The cache key includes the ETag of the object, so if the remote object's content changes, the stale cache entry is bypassed and the new content is downloaded.

All other arguments are passed to get_pandas_df().