High-level description
Thesetup_utilities.py file provides utility functions for setting up and configuring the Cassiopeia preprocessing pipeline. It handles tasks such as parsing configuration files, creating output directories, and defining the sequence of preprocessing stages.
Code Structure
Thesetup_utilities.py file contains three main functions: setup, parse_config, and create_pipeline. The setup function initializes the output directory and logging. The parse_config function reads and parses the configuration file, setting default values and validating required parameters. The create_pipeline function determines the order of preprocessing stages based on the specified entry and exit points.
References
This code references the following symbols:cassiopeia.preprocess.constants.DEFAULT_PIPELINE_PARAMETERS: A dictionary containing default parameters for each stage of the preprocessing pipeline.cassiopeia.mixins.UnspecifiedConfigParameterError: An exception raised when a required configuration parameter is not specified.cassiopeia.preprocess.cassiopeia_preprocess.STAGES: A dictionary mapping stage names to their corresponding functions in the preprocessing pipeline.
Symbols
setup
Description
Sets up the environment for the preprocessing pipeline by creating the output directory and configuring logging.Inputs
| Name | Type | Description |
|---|---|---|
| output_directory_location | str | The path to the output directory. |
| verbose | bool | Whether to enable verbose logging. |
Side Effects
- Creates the output directory if it doesn’t exist.
- Adds a file handler to the logger, writing logs to
preprocess.logandpreprocess.errfiles in the output directory.
parse_config
Description
Parses the configuration file for the preprocessing pipeline, setting default values and validating required parameters.Inputs
| Name | Type | Description |
|---|---|---|
| config_string | str | The configuration file content as a string. |
Outputs
| Name | Type | Description |
|---|---|---|
| parameters | Dict[str, Dict[str, Any]] | A dictionary containing parameters for each preprocessing stage. |
Internal Logic
- Reads the default pipeline parameters from
constants.DEFAULT_PIPELINE_PARAMETERS. - Parses the provided
config_stringusing theconfigparsermodule. - Updates the default parameters with values from the parsed configuration.
- Validates that required parameters are present in the
generalsection of the configuration. - Adds necessary parameters from the
generalsection to other stages. - Returns the parsed and validated parameters.
Error Handling
Raises anUnspecifiedConfigParameterError if any required parameter is missing from the configuration.
create_pipeline
Description
Creates a list of preprocessing stages to be executed based on the specified entry and exit points.Inputs
| Name | Type | Description |
|---|---|---|
| entry | str | The name of the stage to start the pipeline from. |
| _exit | str | The name of the stage to stop the pipeline at. |
| stages | Dict[str, Any] | A dictionary mapping stage names to their corresponding functions. |
Outputs
| Name | Type | Description |
|---|---|---|
| pipeline_stages | List[str] | A list of stage names to be executed in order. |
Internal Logic
- Gets a list of stage names from the
stagesdictionary. - Finds the indices of the
entryand_exitstages in the list. - Returns a slice of the list from the
entryindex to the_exitindex (inclusive).
