ladder.scripts.workflows.BaseWorkflow#
- class ladder.scripts.workflows.BaseWorkflow(anndata, config='cross-condition', verbose=False, random_seed=None)#
Base class for all workflows.
Offers a high-level API that removes the need to run many separate blocks of code in succession, since the process is largely the same for every dataset. This class must not be instantiated or used directly. Every parameter passed to a workflow method can later be accessed through the attribute of the same name.
- Parameters:
anndata (AnnData) – The dataset object to be used throughout the analyses.
config (Literal["cross-condition", "interpretable"], default: “cross-condition”) – Defines the workflow to be used. Affects model structure.
verbose (bool, default: False) – If True, prints progress messages for various methods within the module.
random_seed (int, optional) – If given, seeds the internal modules with the value.
- batch_mapping#
Mapping of batch literals to encodings. Only present if a batch key is provided in the workflow.
- Type:
- cell_type_label_key#
Optional cell type labels in obs, required if cell-type specific evaluation is desired.
- Type:
str, optional
- config#
The config string provided during construction.
- Type:
Literal["cross-condition", "interpretable"], default: “cross-condition”
- converter#
Low-level converter class for the attached Dataset. See ladder.data.real_data.distrib_dataset for details.
- Type:
- dataset#
Low-level Dataset object passed to the model. See ladder.data.real_data.distrib_dataset for details.
- Type:
- verbose#
If True, prints progress messages for various methods within the module.
- Type:
bool, optional
- l_mean#
If batch_key is provided in the workflow, the empirical library size log-mean for each batch (1-D array-like of float). A single value otherwise.
- Type:
float or array_like
- l_scale#
If batch_key is provided in the workflow, the empirical library size log-variance for each batch (1-D array-like of float). A single value otherwise.
- Type:
float or array_like
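As a rough illustration of what l_mean and l_scale describe, the sketch below computes the log-mean and log-variance of per-cell library sizes, either globally or per batch, using only the standard library. The function name library_log_stats, the input layout, and the use of the population variance are assumptions made for illustration; the library's own computation may differ.

```python
import math
from collections import defaultdict
from statistics import fmean, pvariance


def library_log_stats(counts, batches=None):
    """Return (log-mean, log-variance) of library sizes, per batch if given.

    counts  : list of per-cell count vectors (hypothetical input layout)
    batches : optional list of batch labels, one per cell
    """
    # Library size = total counts in a cell; statistics are taken on its log.
    # Assumes every cell has at least one count (log(0) is undefined).
    log_sizes = [math.log(sum(cell)) for cell in counts]

    if batches is None:
        # No batch key: a single value for each statistic.
        return fmean(log_sizes), pvariance(log_sizes)

    grouped = defaultdict(list)
    for label, value in zip(batches, log_sizes):
        grouped[label].append(value)

    # One value per batch, ordered by sorted batch label.
    labels = sorted(grouped)
    means = [fmean(grouped[b]) for b in labels]
    variances = [pvariance(grouped[b]) for b in labels]
    return means, variances
```

Called without batches, the function returns two floats; called with batch labels, it returns two lists with one entry per batch, which mirrors the "float or array_like" type given above.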
- predictive#
Low-level generator to be used for tasks after training.
- Type:
- train_set#
Low-level training Dataset passed to the model. See ladder.data.real_data.distrib_dataset for details.
- Type:
- test_set#
Low-level test Dataset passed to the model. See ladder.data.real_data.distrib_dataset for details.
- Type:
- prep_model(factors, batch_key=None, cell_type_label_key=None, minibatch_size=128, model_type='Patches', model_args=None, optim_args=None)#
Prepares the model to be run.
- run_model(max_epochs=1500, convergence_threshold=0.0001, convergence_window=100, classifier_warmup=0, classifier_aggression=0, params_save_path=None)#
Runs the model on the attached data object.
- save_model(params_save_path)#
Saves the attached model.
- load_model(params_load_path)#
Loads parameters for the attached model. Needs prep_model() to be run first.
- plot_loss(save_loss_path=None)#
Simple plotter for loss functions.
- write_embeddings()#
Places the calculated cell embeddings from the trained model under the corresponding obsm field.
- evaluate_reconstruction(subset=None, cell_type=None, n_iter=5)#
Evaluates the quality of reconstructions with generative metrics.
- evaluate_separability(factor=None)#
Evaluates the separability of the latent encodings with respect to conditional effects.
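The methods listed above are typically called in sequence. The sketch below shows one plausible ordering, written against any object that exposes these methods, so it does not depend on a particular subclass. The helper name train_and_evaluate and the factor names in the usage note are hypothetical; since BaseWorkflow itself must not be instantiated, the workflow argument is assumed to be an instance of a concrete subclass.

```python
def train_and_evaluate(workflow, factors, params_path=None):
    """Drive an already-constructed workflow through the documented steps.

    workflow    : instance of a concrete BaseWorkflow subclass (assumed)
    factors     : passed through to prep_model
    params_path : optional prefix under which run_model saves parameters
    """
    # 1. Attach a model; "Patches" is the documented default model_type.
    workflow.prep_model(factors, model_type="Patches")

    # 2. Train; parameters are saved only if params_path is provided.
    workflow.run_model(max_epochs=1500, params_save_path=params_path)

    # 3. Store the learned cell embeddings in the dataset's obsm field.
    workflow.write_embeddings()

    # 4. Score the trained model.
    workflow.evaluate_reconstruction(n_iter=5)
    workflow.evaluate_separability()
    return workflow
```

A call might look like train_and_evaluate(wf, ["treatment", "time"]), where wf and the factor names are placeholders for your own workflow object and obs columns.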
Attributes table#
Methods table#
| Method | Description |
|---|---|
| evaluate_reconstruction | Evaluates the quality of reconstructions with generative metrics. |
| evaluate_separability | Evaluates the separability of latent embeddings for conditions. |
| load_model | Loads parameters for the attached model. |
| plot_loss | Simple plotter for loss functions. |
| prep_model | Prepares the model to be run. |
| run_model | Runs the model on the attached data object. |
| save_model | Saves the attached model. |
| write_embeddings | Places the calculated cell embeddings from the trained model under the corresponding obsm field. |
Attributes#
- BaseWorkflow.METRICS_REG = {'chamfer': 'Chamfer Discrepancy', 'corr': 'Profile Correlation', 'rmse': 'RMSE', 'swd': '2-Sliced Wasserstein'}#
- BaseWorkflow.OPT_CLASS1 = ['SCVI', 'SCANVI']#
- BaseWorkflow.OPT_CLASS2 = ['Patches']#
- BaseWorkflow.OPT_DEFAULTS = {'betas': (0.9, 0.999), 'eps': 0.01, 'gamma': 1, 'lr': 0.001, 'milestones': [10000000000.0]}#
- BaseWorkflow.OPT_LIST = ['optimizer', 'optim_args', 'gamma', 'milestones', 'lr', 'eps', 'betas']#
- BaseWorkflow.SEP_METRICS_REG = {'calc_asw': 'Average Silhouette Width', 'kmeans_ari': 'K-Means ARI', 'kmeans_nmi': 'K-Means NMI', 'knn_error': 'kNN Classifier Accuracy'}#
Methods#
- BaseWorkflow.evaluate_reconstruction(subset=None, cell_type=None, n_iter=5)#
Evaluates the quality of reconstructions with generative metrics.
- Parameters:
subset (str, optional) – Key from levels to subset cells for a specific condition before evaluating reconstruction.
cell_type (str, optional) – Requires cell_type_label_key to be defined as attribute. Subset cells to a single type before evaluating reconstruction.
n_iter (int, default: 5) – Number of times to repeat the generative process.
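To make the role of n_iter concrete, the sketch below averages two of the metrics named in METRICS_REG (RMSE and profile correlation) over repeated generated samples. This is only an illustration: the helper names, the hypothetical generate callable, and the choice to compare mean expression profiles are assumptions, and the library's exact metric definitions are not stated here.

```python
import math


def mean_profile(cells):
    """Average expression per gene across a list of cells."""
    n = len(cells)
    return [sum(gene) / n for gene in zip(*cells)]


def rmse(a, b):
    """Root mean squared error between two equal-length profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def pearson(a, b):
    """Pearson correlation; assumes neither profile is constant."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = math.sqrt(sum((x - ma) ** 2 for x in a))
    sb = math.sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (sa * sb)


def evaluate(original, generate, n_iter=5):
    """Average RMSE and correlation over n_iter generated samples.

    generate is a hypothetical zero-argument callable that returns one
    set of generated cells per call.
    """
    target = mean_profile(original)
    scores = {"rmse": [], "corr": []}
    for _ in range(n_iter):
        profile = mean_profile(generate())
        scores["rmse"].append(rmse(target, profile))
        scores["corr"].append(pearson(target, profile))
    # Report the mean of each metric across the repeats.
    return {name: sum(vals) / len(vals) for name, vals in scores.items()}
```

Repeating the generative step and averaging reduces the influence of any single stochastic sample on the reported score.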
- BaseWorkflow.evaluate_separability(factor=None)#
Evaluates the separability of latent embeddings for conditions.
- Parameters:
factor (str, optional) – Item listed in BaseWorkflow.factors. If not provided, the metrics will be evaluated on the combinations of factors.
- BaseWorkflow.load_model(params_load_path)#
Loads parameters for the attached model. Needs prep_model() to be run first.
- Parameters:
params_load_path (str) – Path to find model parameters. Expects only the shared prefix, and not the trailing “_torch.pth” or “_pyro.pth”.
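Because params_load_path is a shared prefix rather than a full file name, it can help to check which files that prefix implies before loading. The sketch below derives the two paths from the documented suffixes; the helper names parameter_files and missing_parameter_files are hypothetical and not part of the library.

```python
from pathlib import Path


def parameter_files(prefix):
    """Return the two file paths implied by a shared parameter prefix."""
    # The documented suffixes are appended to the prefix as-is.
    return Path(f"{prefix}_torch.pth"), Path(f"{prefix}_pyro.pth")


def missing_parameter_files(prefix):
    """List the expected parameter files that do not exist on disk."""
    return [path for path in parameter_files(prefix) if not path.exists()]
```

For a prefix such as "runs/model", this yields runs/model_torch.pth and runs/model_pyro.pth, so the value passed to load_model would be "runs/model" with no suffix.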
- BaseWorkflow.plot_loss(save_loss_path=None)#
Simple plotter for loss functions.
- Parameters:
save_loss_path (str, optional) – If provided, saves the figure to the specified location. Requires the full file name including the extension (e.g. fig.png).
- BaseWorkflow.prep_model(factors, batch_key=None, cell_type_label_key=None, minibatch_size=128, model_type='Patches', model_args=None, optim_args=None)#
Prepares the model to be run.
The choice of model implicitly determines the kind of condition encodings to use, so no separate data preparation step is needed.
- Parameters:
batch_key (str, optional) – Defines the workflow to be used. Affects model structure. Can later be accessed through the attribute of the same name.
cell_type_label_key (str, optional) – Optional cell type labels in obs, required if cell-type specific evaluation is desired.
minibatch_size (int, default: 128) – Size of the minibatch to be provided during training.
model_type (Literal["SCVI", "SCANVI", "Patches"], default: “Patches”) – Specifies the model attached to the current workflow.
model_args (dict) – Model arguments passed to the low-level model constructor. See models for details.
optim_args (dict) – Optimizer arguments passed to the low-level trainer. See training for details.
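One way to think about optim_args is as a set of overrides on top of the class-level OPT_DEFAULTS, restricted to the keys in OPT_LIST. The sketch below illustrates that idea together with a step-decay reading of gamma and milestones. Both the merge behavior and the schedule are assumptions inferred from the attribute names and values, not confirmed library behavior; resolve_optim_args and learning_rate are hypothetical helpers.

```python
# Values copied from the BaseWorkflow class attributes.
OPT_DEFAULTS = {"betas": (0.9, 0.999), "eps": 0.01, "gamma": 1,
                "lr": 0.001, "milestones": [1e10]}
OPT_LIST = ["optimizer", "optim_args", "gamma", "milestones",
            "lr", "eps", "betas"]


def resolve_optim_args(user_args=None):
    """Overlay user-supplied optimizer settings on the defaults."""
    user_args = dict(user_args or {})
    unknown = sorted(set(user_args) - set(OPT_LIST))
    if unknown:
        raise ValueError(f"Unrecognized optimizer arguments: {unknown}")
    # User values take precedence over the defaults.
    return {**OPT_DEFAULTS, **user_args}


def learning_rate(epoch, args):
    """Assumed step decay: multiply lr by gamma per milestone passed."""
    passed = sum(1 for m in args["milestones"] if epoch >= m)
    return args["lr"] * args["gamma"] ** passed
```

Under this reading, the defaults (gamma of 1 and a single milestone at 1e10) amount to a constant learning rate for any realistic number of epochs.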
- BaseWorkflow.run_model(max_epochs=1500, convergence_threshold=0.0001, convergence_window=100, classifier_warmup=0, classifier_aggression=0, params_save_path=None)#
Runs the model on the attached data object.
- Parameters:
max_epochs (int, default: 1500) – Maximum number of epochs to run.
convergence_threshold (float, default: 1e-4) – Minimum improvement required to continue training.
convergence_window (int, default: 100) – Number of epochs to wait for a new minimum to be attained.
classifier_warmup (int, default: 0) – Number of epochs to run the classifier before running the entire model.
classifier_aggression (int, default: 0) – Number of epochs the classifier takes independently between jointly trained epochs. Used for Patches.
params_save_path (str, optional) – If provided, saves the model to the specified path.
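The interaction between max_epochs, convergence_threshold, and convergence_window can be pictured as a patience-style stopping rule. The sketch below is one plausible reading of the parameter descriptions, not the library's actual training loop; the function name epochs_until_stop is hypothetical, and the defaults are taken from the run_model signature shown above.

```python
def epochs_until_stop(losses, convergence_threshold=1e-4,
                      convergence_window=100, max_epochs=1500):
    """Return the number of epochs run before the stopping rule triggers.

    losses is a sequence of per-epoch loss values (illustrative input).
    """
    best = float("inf")
    epochs_since_best = 0
    for epoch, loss in enumerate(losses, start=1):
        if best - loss > convergence_threshold:
            # Improvement large enough to count as a new minimum.
            best = loss
            epochs_since_best = 0
        else:
            epochs_since_best += 1
        # Stop when the window is exhausted or the epoch budget is spent.
        if epochs_since_best >= convergence_window or epoch >= max_epochs:
            return epoch
    return len(losses)
```

With this rule, a smaller convergence_threshold or a larger convergence_window makes training run longer before it is considered converged.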