Module trase.tools.etl_internal.processors

Classes

class Preprocessor (*args, client=None, fallback_bucket='trase-storage', **kwargs)

Ancestors

Methods

def should_rerun(self, args)

Inherited members

class S3Mixin (*args, client=None, fallback_bucket='trase-storage', **kwargs)

Ancestors

Subclasses

Class variables

var bucket
var version_id

Instance variables

var full_s3_path
var original_extension

Methods

def extract(self, path)
def s3_key(self)
def should_reextract(self, path) -> bool

Checks whether re-downloading the object from S3 can be skipped, by comparing the SHA256 of the local file with the SHA256 recorded in the object's S3 metadata.
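
The comparison described above can be sketched roughly as follows. This is a minimal illustration, not the module's actual implementation: the helper names `local_sha256` and `needs_redownload`, the `"sha256"` metadata key, and the plain dict standing in for the S3 object metadata are all assumptions made for the example.

```python
import hashlib
from pathlib import Path


def local_sha256(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a local file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files are never loaded whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_redownload(path, s3_metadata, key="sha256"):
    """Hypothetical helper: True when the local copy cannot be trusted.

    `s3_metadata` is assumed to be a dict of the object's user metadata,
    and `key` is an assumed name for the stored checksum entry.
    """
    remote = s3_metadata.get(key)
    # With no remote checksum or no local file, the download cannot be skipped.
    if remote is None or not Path(path).is_file():
        return True
    return local_sha256(path) != remote.lower()
```

In this sketch a missing checksum or a missing local file always triggers a download, so the only case in which the download is skipped is an exact digest match.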

class TextPreprocessor (*args, client=None, fallback_bucket='trase-storage', **kwargs)

Ancestors

Methods

def should_rerun(self, args)

Inherited members