Experiment

class mergernet.core.experiment.Experiment

Bases: object

This class stores all relevant information for an experiment, which can be accessed from anywhere in the application as it’s an implementation of the Singleton pattern.

Experiments are tracked using two values: exp_id and run_id. The first is a human-readable integer set by the user and corresponds to a task specified in the entrypoint file. The second is an auto-generated hex token used to distinguish re-runs of the same experiment.
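The following is a minimal sketch of how class-level state can act as a singleton carrying both identifiers. The class name `ExperimentState`, the `setup` method, and the use of `secrets.token_hex` are assumptions made for illustration; they are not taken from mergernet's source.

```python
import secrets


class ExperimentState:
    """Hypothetical stand-in for a class-level singleton (not mergernet code)."""

    exp_id: int = None
    run_id: str = None

    @classmethod
    def setup(cls, exp_id: int) -> None:
        # exp_id is chosen by the user; run_id is regenerated on every setup,
        # so re-runs of the same experiment get distinct tokens.
        cls.exp_id = exp_id
        cls.run_id = secrets.token_hex(8)


ExperimentState.setup(exp_id=3)
first_run = ExperimentState.run_id

ExperimentState.setup(exp_id=3)
second_run = ExperimentState.run_id
```

Because the values live on the class rather than on an instance, any module that imports the class reads the same state without passing an object around.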

Attributes

exp_id

Identifier of the experiment set by user.

exp_name

The experiment name: 'exp_' followed by exp_id.

gd_exp_path

The path in Google Drive where the artifacts of an experiment run will be uploaded to.

gd_shared_path

The path in Google Drive where the shared artifacts for all experiments are stored, e.g. external datasets.

local_exp_path

The path in the local environment where artifacts (outputs of an experiment, e.g. model predictions) are stored.

local_shared_path

The path in the local environment where shared files (e.g. dataset) are stored.

notes

Notes for the current run of this experiment.

class Tracker

Bases: object

Context manager that creates a wandb project.

Parameters:
  • config (dict) – The configuration dict

  • job_type (str) – The job type

  • name (str) – The run name

  • tags (list) – The run tags

See also

mergernet.core.experiment.init_wandb
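As a rough illustration of the context-manager idea, the sketch below pairs a start call with a guaranteed stop call. The functions `start_tracking` and `stop_tracking` and the `events` list are hypothetical stand-ins that only record what happened; they are not wandb or mergernet calls.

```python
from contextlib import contextmanager

events = []


def start_tracking(config=None, job_type=None, name=None, tags=None):
    # Hypothetical stand-in for the call that opens a tracking run.
    events.append(("start", name, job_type, tuple(tags or ())))


def stop_tracking():
    # Hypothetical stand-in for the call that closes the run.
    events.append(("stop",))


@contextmanager
def tracker(config=None, job_type=None, name=None, tags=None):
    start_tracking(config=config, job_type=job_type, name=name, tags=tags)
    try:
        yield
    finally:
        # Executes even if the body raises, so the run is always closed.
        stop_tracking()


with tracker(config={"lr": 1e-3}, job_type="train", name="baseline", tags=["demo"]):
    events.append(("body",))
```

The appeal of this shape is that the closing step cannot be forgotten, which is presumably why a context manager is offered alongside the separate init and finish methods.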

_post_run()

Post-run tasks automatically performed when run is called, including:

  • upload registered artifacts

See also

mergernet.core.experiment.upload_registered_artifacts()

_pre_run()

Pre-run tasks automatically performed when run is called, including:

  • clear log file

  • setup experiment global attributes

See also

mergernet.core.experiment._setup_experiment_attributes()

_setup_experiment_attributes()

Configures the experiment identifiers and the file system used to store the files the experiment needs (e.g. dataset) and the files it generates (e.g. models, dataset, logs, predictions).
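A minimal sketch of this kind of setup, using only `pathlib`, might look like the following. The function name `setup_paths` and the folder names `experiments` and `shared` are assumptions for illustration; the actual directory layout used by mergernet is not documented here.

```python
import tempfile
from pathlib import Path


def setup_paths(base_dir: Path, exp_id: int) -> dict:
    """Create a per-experiment folder and a shared folder under base_dir."""
    exp_name = f"exp_{exp_id}"
    paths = {
        # Hypothetical layout: one folder per experiment, one shared folder.
        "local_exp_path": base_dir / "experiments" / exp_name,
        "local_shared_path": base_dir / "shared",
    }
    for path in paths.values():
        # exist_ok=True keeps the call safe when an experiment is re-run.
        path.mkdir(parents=True, exist_ok=True)
    return paths


base = Path(tempfile.mkdtemp())
paths = setup_paths(base, exp_id=7)
```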

classmethod autoclean()

classmethod delete(path: str | Path | List[str] | List[Path])

classmethod download_file_gd(fname: str, exp_id: int | None = None, shared: bool = False)

Downloads a file from Google Drive inside gd_artifact_path.

Parameters:
  • fname (str) – The file name.

  • exp_id (int, optional) – The experiment identifier, defaults to the current experiment id. Only relevant if shared is False.

  • shared (bool, optional) – Whether to download from the shared folder.
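The interaction between exp_id and shared can be sketched as a small path-resolution helper. Everything here is hypothetical: the function `resolve_remote_path`, the `current_exp_id` and `gd_root` parameters, and the folder names are illustrative assumptions, not mergernet's implementation.

```python
from pathlib import PurePosixPath
from typing import Optional


def resolve_remote_path(
    fname: str,
    current_exp_id: int,
    gd_root: PurePosixPath,
    exp_id: Optional[int] = None,
    shared: bool = False,
) -> PurePosixPath:
    if shared:
        # exp_id is ignored when the file lives in the shared folder.
        return gd_root / "shared" / fname
    # Fall back to the current experiment when no exp_id is given.
    effective_id = current_exp_id if exp_id is None else exp_id
    return gd_root / f"exp_{effective_id}" / fname


root = PurePosixPath("drive_root")
own_file = resolve_remote_path("model.h5", current_exp_id=3, gd_root=root)
other_file = resolve_remote_path("model.h5", current_exp_id=3, gd_root=root, exp_id=1)
shared_file = resolve_remote_path("data.csv", current_exp_id=3, gd_root=root, shared=True)
```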

classmethod finish_wandb()

Closes the connection with the current wandb project.

classmethod init_wandb(config: dict = {}, job_type: str | None = None, name: str | None = None, tags: list = [])

Creates a wandb project

Parameters:
  • config (dict) – The configuration dict

  • job_type (str) – The job type

  • name (str) – The run name

  • tags (list) – The run tags

See also

mergernet.core.experiment.finish_wandb(), mergernet.core.experiment.Tracker()

classmethod register_artifact(fname: str, service: str)

Registers an artifact that will be automatically uploaded to the corresponding service at the end of the experiment.

Parameters:
  • fname (str) – The artifact filename

  • service (str) – The service name, one of: gdrive, github or both
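The register-now, upload-later idea can be sketched with a plain list acting as the registry. The validation step, the `targets` helper, and the dictionary layout are assumptions added for illustration; only the three service names come from the description above.

```python
VALID_SERVICES = {"gdrive", "github", "both"}
registered_artifacts = []


def register_artifact(fname: str, service: str) -> None:
    # Rejecting unknown services early is an assumption of this sketch.
    if service not in VALID_SERVICES:
        raise ValueError(f"unknown service: {service!r}")
    registered_artifacts.append({"fname": fname, "service": service})


def targets(service: str) -> list:
    # "both" expands to the two concrete services.
    return ["gdrive", "github"] if service == "both" else [service]


register_artifact("model.h5", "gdrive")
register_artifact("metrics.json", "both")

# What a later upload step would iterate over: (file name, service) pairs.
upload_plan = [
    (item["fname"], target)
    for item in registered_artifacts
    for target in targets(item["service"])
]
```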

run()

Runs the experiment defined in the abstract call method, executing the pre-run and post-run tasks.

See also

mergernet.core.experiment._pre_run(), mergernet.core.experiment._post_run()
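This is the template-method pattern: the base class fixes the order of the steps and subclasses supply only the experiment body. The sketch below uses a hypothetical `BaseExperiment` class with a `log` list to make the order visible; whether mergernet still executes the post-run tasks when the body raises is not stated, so this sketch does not assume it.

```python
class BaseExperiment:
    """Hypothetical base class illustrating the pre-run / call / post-run order."""

    def __init__(self):
        self.log = []

    def _pre_run(self):
        self.log.append("pre")

    def _post_run(self):
        self.log.append("post")

    def call(self):
        # Subclasses override this with the experiment body.
        raise NotImplementedError

    def run(self):
        self._pre_run()
        self.call()
        self._post_run()


class Demo(BaseExperiment):
    def call(self):
        self.log.append("call")


demo = Demo()
demo.run()
```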

classmethod upload_file_gd(fname: str, data: Any | None = None)

Uploads a file to Google Drive inside gd_artifact_path.

Parameters:
  • fname (str) – The file name

  • data (Any, optional) – The content of the file. If not specified, this method uses the contents of the file named fname inside the local_artifact_path folder. If specified, it can be a JSON-serializable Python object or the bytes of the file.
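The three cases for data described above can be sketched as a helper that always produces bytes ready for upload. The function `resolve_payload` and its `local_dir` parameter are hypothetical names for illustration, and the UTF-8 JSON encoding is an assumption.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Optional


def resolve_payload(fname: str, local_dir: Path, data: Optional[Any] = None) -> bytes:
    if data is None:
        # No data given: read the local file that has the same name.
        return (local_dir / fname).read_bytes()
    if isinstance(data, (bytes, bytearray)):
        # Raw bytes are passed through unchanged.
        return bytes(data)
    # Anything else is assumed to be JSON-serializable.
    return json.dumps(data).encode("utf-8")


local_dir = Path(tempfile.mkdtemp())
(local_dir / "notes.txt").write_bytes(b"hello")

from_disk = resolve_payload("notes.txt", local_dir)
from_bytes = resolve_payload("raw.bin", local_dir, data=b"\x00\x01")
from_object = resolve_payload("metrics.json", local_dir, data={"acc": 0.9})
```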

classmethod upload_registered_artifacts()

Uploads all registered artifacts

_downloaded_artifacts: list = []

Stores all artifacts downloaded using the Experiment.download_* methods.

_exp_created: bool = False

Flags whether the experiment was started by the Experiment._pre_run method.

_registered_artifacts: list = []

Stores all registered artifacts

exp_id: int = None

Identifier of the experiment set by user.

exp_name: str = None

The experiment name: 'exp_' followed by exp_id.

gd_exp_path: Path = None

The path in Google Drive where the artifacts of an experiment run will be uploaded to.

gd_shared_path: Path = None

The path in Google Drive where the shared artifacts for all experiments are stored, e.g. external datasets.

local_exp_path: Path = None

The path in the local environment where artifacts (outputs of an experiment, e.g. model predictions) are stored.

local_shared_path: Path = None

The path in the local environment where shared files (e.g. dataset) are stored. This path is shared by every run of every experiment.

notes: str = None

Notes for the current run of this experiment.