rail.projects.project module
- class rail.projects.project.RailFlavor(**kwargs)[source]
Bases: Configurable
Description of a single analysis variation.
This includes:
a name for the variant, used to construct filenames
a ‘catalog_tag’, which identifies format of the data being used, and sets the expected names of columns accordingly
a list of ‘pipelines’ that can be run in this variant
a list of ‘file_aliases’ that can be used to specify the input files used in this variant
a dict of ‘pipeline_overrides’ that modify the behavior of the various pipelines
- Parameters:
kwargs (Any)
- config_options:
dict[str,StageParameter] = {'catalog_tag': Parameter(tag for catalog being used, type: <class 'str'>, default: None [optional]), 'file_aliases': Parameter(file aliases used, type: <class 'dict'>, default: {} [optional]), 'name': Parameter(Flavor name, type: <class 'str'>, default: None [required]), 'pipeline_overrides': Parameter(file aliases used, type: <class 'dict'>, default: {} [optional]), 'pipelines': Parameter(pipelines being used, type: <class 'list'>, default: ['all'] [optional])}
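As a rough illustration of the options listed above, the following sketch fills in the documented defaults for a flavor described as a plain dictionary. It does not use RailFlavor itself; `fill_flavor_defaults`, `FLAVOR_DEFAULTS`, and the example values are hypothetical names introduced here, and only the option names and defaults are taken from the listing.

```python
import copy
from typing import Any

# Defaults copied from the config_options listing above.
# 'name' is omitted because it is the only option marked [required].
FLAVOR_DEFAULTS: dict[str, Any] = {
    "catalog_tag": None,
    "file_aliases": {},
    "pipeline_overrides": {},
    "pipelines": ["all"],
}


def fill_flavor_defaults(config: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of `config` with the documented defaults applied.

    Hypothetical helper for illustration; not part of rail.projects.
    """
    if "name" not in config:
        raise KeyError("a flavor needs a 'name'")
    unknown = set(config) - set(FLAVOR_DEFAULTS) - {"name"}
    if unknown:
        raise KeyError(f"unknown flavor options: {sorted(unknown)}")
    # Deep copy so that the mutable defaults are never shared between flavors.
    filled = copy.deepcopy(FLAVOR_DEFAULTS)
    filled.update(config)
    return filled


# Placeholder values, chosen only for the example.
example = fill_flavor_defaults({"name": "my_flavor", "catalog_tag": "my_catalog_tag"})
```

Here `example` keeps the two supplied values and picks up `['all']` for 'pipelines' and empty dictionaries for 'file_aliases' and 'pipeline_overrides'.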
- class rail.projects.project.RailProject(**kwargs)[source]
Bases: Configurable
Main analysis driver class. This collects all the elements needed to run a collection of studies using RAIL.
The key concepts are:
1. analysis ‘Flavors’, which are versions of similar analyses with slightly different parameter settings and/or input files.
2. ceci ‘Pipelines’, which run blocks of analysis code.
A RailProject specifies which Pipelines to run under which Flavors, and keeps track of the outputs.
See RailProject.functionality_help() for more on class functionality.
See RailProject.configuration_help() for more on class configuration.
- Parameters:
kwargs (Any)
- add_flavor(name, **kwargs)[source]
Add a new flavor to the Project
- Return type:
- Parameters:
name (str)
kwargs (Any)
- build_pipelines(flavor='baseline', *, force=False)[source]
Build ceci pipeline configuration files for this project
- Return type:
int
- Parameters:
flavor (str) – Which analysis flavor to draw from
force (bool) – Force overwriting of existing pipeline files
- Returns:
0 if ok, error code otherwise
- Return type:
int
- config_options:
dict[str,StageParameter] = {'Baseline': Parameter(Baseline analysis configuration, type: <class 'dict'>, default: None [required]), 'Catalogs': Parameter(Catalog templates to use, type: <class 'list'>, default: ['all'] [optional]), 'Classifiers': Parameter(Tomographic classifiers to use, type: <class 'list'>, default: ['all'] [optional]), 'CommonPaths': Parameter(Paths to shared directories, type: <class 'dict'>, default: {} [required]), 'ErrorModels': Parameter(Photometric ErrorModels to use, type: <class 'list'>, default: ['all'] [optional]), 'Files': Parameter(Catalog templates to use, type: <class 'list'>, default: ['all'] [optional]), 'Flavors': Parameter(Analysis variants, type: <class 'list'>, default: [] [optional]), 'Includes': Parameter(Files to include, type: <class 'list'>, default: [] [optional]), 'IterationVars': Parameter(Iteration variables to use, type: <class 'dict'>, default: {} [optional]), 'Name': Parameter(Project name, type: <class 'str'>, default: None [required]), 'PZAlgorithms': Parameter(p(z) algorithms to use, type: <class 'list'>, default: ['all'] [optional]), 'PathTemplates': Parameter(File path templates, type: <class 'dict'>, default: {} [optional]), 'Pipelines': Parameter(Catalog templates to use, type: <class 'list'>, default: ['all'] [optional]), 'Reducers': Parameter(Data reducers to use, type: <class 'list'>, default: ['all'] [optional]), 'Selections': Parameter(Data selections to use, type: <class 'list'>, default: ['all'] [optional]), 'SpecSelections': Parameter(Spectroscopic selections to use, type: <class 'list'>, default: ['all'] [optional]), 'Subsamplers': Parameter(Data subsamplers to use, type: <class 'list'>, default: ['all'] [optional]), 'Subsamples': Parameter(Subsample defintions to use, type: <class 'list'>, default: ['all'] [optional]), 'Summarizers': Parameter(n(z) summarizers to use, type: <class 'list'>, default: ['all'] [optional])}
- classmethod configuration_help()[source]
Configuring a RailProject
Most of these elements come from the shared library of elements, which is accessible from rail.projects.library
- Return type:
None
- classmethod functionality_help()[source]
The main functions that the user will use are:
- Return type:
None
load_config:
Read a yaml file and create a RailProject
reduce_data:
Make a reduced catalog from an input catalog by applying a selection and trimming unwanted columns. This is run before the analysis pipelines.
subsample_data:
Subsample data from a catalog to make a testing or training file. This is run after catalog-level pipelines, but before pipelines run on individual training/testing samples.
build_pipelines:
Build ceci pipeline yaml files
run_pipeline_single:
Run a pipeline on a single file
run_pipeline_catalog:
Run a pipeline on a catalog of files
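The ordering described above can be sketched as a small driver function. This is only a sketch under stated assumptions: `run_workflow` is a hypothetical helper, `project` stands for an object returned by `load_config`, every string argument is a placeholder rather than a name defined by RAIL, and passing `flavor` through `**kwargs` to the run methods is an assumption. Only the method names and parameter names come from the signatures on this page.

```python
from typing import Any


def run_workflow(project: Any, flavor: str = "baseline") -> None:
    """Call the documented methods in the order described above.

    Hypothetical helper; all string arguments are placeholders.
    """
    # 1. Reduce the input catalog. This comes before the analysis pipelines.
    project.reduce_data(
        catalog_template="my_input_catalog",
        output_catalog_template="my_reduced_catalog",
        reducer_class_name="my_reducer",
        input_selection="my_input_selection",
        selection="my_selection",
        dry_run=True,  # documented flag: do not actually run
    )

    # 2. Write the ceci pipeline yaml files for this flavor.
    project.build_pipelines(flavor)

    # 3. Run a catalog-level pipeline.
    #    Assumption: 'flavor' can be supplied through **kwargs.
    project.run_pipeline_catalog("my_catalog_pipeline", flavor=flavor)

    # 4. Subsample after the catalog-level pipelines, before the
    #    pipelines that run on individual training/testing samples.
    project.subsample_data(
        catalog_template="my_reduced_catalog",
        file_template="my_training_file",
        subsampler_class_name="my_subsampler",
        subsample_name="my_subsample",
        dry_run=True,
        flavor=flavor,
    )

    # 5. Run a pipeline on a single file (same assumption about 'flavor').
    project.run_pipeline_single("my_single_file_pipeline", flavor=flavor)
```

Taking the project as an argument keeps the sketch independent of any particular yaml file; in practice the object would come from `RailProject.load_config(config_file)`.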
- static generate_ceci_command(pipeline_path, config, inputs, output_dir='.', log_dir='.', **kwargs)[source]
Generate a ceci command to run a pipeline
- Return type:
list[str]
- Parameters:
pipeline_path (str) – Path to the pipeline yaml file
config (str | None) – Path to the pipeline config yaml file
inputs (dict) – Input to the pipeline
output_dir (str, default=".") – Pipeline output directory
log_dir (str, default=".") – Pipeline log directory
**kwargs – These are appended to the command in key=value pairs
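The phrase "appended to the command in key=value pairs" can be illustrated with a minimal sketch. This is not the implementation of `generate_ceci_command`; `append_key_value_args` and the base command tokens are hypothetical, and the exact layout of the real command is not shown here.

```python
def append_key_value_args(base_command: list[str], **kwargs: object) -> list[str]:
    """Return a new token list with each keyword appended as 'key=value'.

    Hypothetical helper for illustration only.
    """
    tokens = list(base_command)  # copy, so the caller's list is untouched
    for key, value in kwargs.items():
        tokens.append(f"{key}={value}")
    return tokens


# Placeholder base command; the real one is built by generate_ceci_command.
command = append_key_value_args(
    ["my_tool", "my_pipeline.yaml"],
    output_dir=".",
    log_dir="logs",
)
```

Keeping the command as a list of tokens, rather than one string, avoids shell quoting problems when the command is later executed.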
- static generate_kwargs_iterable(**iteration_dict)[source]
Generate a list of kwargs dicts from a dict of lists
- Return type:
list[dict]
- Parameters:
iteration_dict (Any)
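One plausible reading of "a list of kwargs dicts from a dict of lists" is a Cartesian product over the lists. The sketch below shows that reading; it is an assumption, not a statement of how `generate_kwargs_iterable` is implemented, and `kwargs_product` and the example values are hypothetical.

```python
import itertools
from typing import Any


def kwargs_product(**iteration_dict: list[Any]) -> list[dict[str, Any]]:
    """Build one kwargs dict per combination of the input lists.

    Hypothetical helper; assumes Cartesian-product semantics.
    """
    keys = list(iteration_dict)
    return [
        dict(zip(keys, combination))
        for combination in itertools.product(*iteration_dict.values())
    ]


# Placeholder iteration variables.
combos = kwargs_product(selection=["sel_a", "sel_b"], flavor=["baseline"])
```

With two selections and one flavor this gives two dictionaries, one per selection, each carrying the same flavor.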
- get_algorithm(algorithm_type, algo_name)[source]
Get an algorithm of a particular type with a specific name
- Return type:
dict[str,str]
- Parameters:
algorithm_type (str)
algo_name (str)
- get_algorithms(algorithm_type)[source]
Get all the algorithms of a particular type
- Return type:
dict[str,dict[str,str]]
- Parameters:
algorithm_type (str)
- get_catalog(name, **kwargs)[source]
Resolve the path for a particular catalog file
- Return type:
str
- Parameters:
name (str)
kwargs (Any)
- get_catalog_files(name, **kwargs)[source]
Resolve the paths for a particular catalog file
- Return type:
list[str]
- Parameters:
name (str)
kwargs (Any)
- get_catalogs()[source]
Get the dictionary describing all the types of data catalogs
- Return type:
dict
- get_classifier(name)[source]
Get the information about a particular tomographic bin classification
- Return type:
dict
- Parameters:
name (str)
- get_classifiers()[source]
Get the dictionary describing all the tomographic bin classifications
- Return type:
dict
- get_common_path(path_key, **kwargs)[source]
Resolve and return a common path using the kwargs as interpolants
- Return type:
str
- Parameters:
path_key (str)
kwargs (Any)
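Resolving a path "using the kwargs as interpolants" can be pictured as filling named placeholders in a template string. The sketch below assumes Python `str.format`-style placeholders, which this page does not confirm; `resolve_template` and the template text are hypothetical.

```python
def resolve_template(template: str, **interpolants: str) -> str:
    """Fill the named placeholders of `template` from keyword arguments.

    Hypothetical helper; assumes '{name}'-style placeholders.
    """
    try:
        return template.format(**interpolants)
    except KeyError as err:
        # Report which placeholder had no matching keyword argument.
        raise KeyError(
            f"missing interpolant {err} for template {template!r}"
        ) from err


# Placeholder template and values.
path = resolve_template("{root}/{project}/data", root="/tmp", project="my_project")
```

A missing interpolant raises an error instead of silently leaving a placeholder in the returned path.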
- get_error_model(name)[source]
Get the information about a particular photometric error model algorithm
- Return type:
dict
- Parameters:
name (str)
- get_error_models()[source]
Get the dictionary describing all the photometric error model algorithms
- Return type:
dict
- get_file(name, **kwargs)[source]
Resolve and return a file using the kwargs as interpolants
- Return type:
str
- Parameters:
name (str)
kwargs (Any)
- get_file_for_flavor(flavor, label, **kwargs)[source]
Resolve the file associated to a particular flavor and label
E.g., flavor=baseline and label=train would give the baseline training file
- Return type:
str
- Parameters:
flavor (str)
label (str)
kwargs (Any)
- get_file_metadata_for_flavor(flavor, label)[source]
Resolve the metadata associated to a particular flavor and label
E.g., flavor=baseline and label=train would give the baseline training metadata
- Return type:
- Parameters:
flavor (str)
label (str)
- get_files()[source]
Return the dictionary of specific file templates
- Return type:
dict[str,RailProjectFileTemplate]
- get_flavor(name)[source]
Resolve the configuration for a particular analysis flavor variant
- Return type:
- Parameters:
name (str)
- get_flavor_args(flavors)[source]
Get the ‘flavors’ to iterate a particular command over
- Return type:
list[str]
- Parameters:
flavors (list[str])
Notes
If the flavor ‘all’ is included in the list of flavors, this will replace the list with all the flavors defined in this project
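The rule in the note above can be sketched in a few lines. `expand_all` is a hypothetical stand-in and the flavor names are placeholders; the real method reads the defined flavors from the project itself.

```python
def expand_all(requested: list[str], defined: list[str]) -> list[str]:
    """Replace the request with every defined name if 'all' is present.

    Hypothetical helper mirroring the note above.
    """
    if "all" in requested:
        return list(defined)
    return list(requested)


# Placeholder flavor names.
defined_flavors = ["baseline", "variant_a", "variant_b"]
everything = expand_all(["baseline", "all"], defined_flavors)
subset = expand_all(["variant_a"], defined_flavors)
```

Note that 'all' replaces the whole list, so any other names given alongside it have no additional effect.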
- get_flavors()[source]
Return the dictionary of analysis flavor variants
- Return type:
dict[str,RailFlavor]
- get_path(path_key, **kwargs)[source]
Resolve and return a path using the kwargs as interpolants
- Return type:
str
- Parameters:
path_key (str)
kwargs (Any)
- get_path_templates()[source]
Return the dictionary of templates used to construct paths
- Return type:
dict
- get_pipeline(name)[source]
Get the information about a particular ceci pipeline
- Return type:
- Parameters:
name (str)
- get_pipelines()[source]
Get the dictionary describing all the types of ceci pipelines
- Return type:
dict[str,RailPipelineTemplate]
- get_pzalgorithm(name)[source]
Get the information about a particular PZ estimation algorithm
- Return type:
dict
- Parameters:
name (str)
- get_pzalgorithms()[source]
Get the dictionary describing all the PZ estimation algorithms
- Return type:
dict
- get_selection_args(selections)[source]
Get the ‘selections’ to iterate a particular command over
- Return type:
list[str]
- Parameters:
selections (list[str])
Notes
If the selection ‘all’ is included in the list of selections, this will replace the list with all the selections defined in this project
- get_selections()[source]
Get the dictionary describing all the selections
- Return type:
dict[str,RailSelection]
- get_spec_selection(name)[source]
Get the information about a particular spectroscopic selection algorithm
- Return type:
dict
- Parameters:
name (str)
- get_spec_selections()[source]
Get the dictionary describing all the spectroscopic selection algorithms
- Return type:
dict
- get_subsamples()[source]
Get the dictionary describing all the subsamples
- Return type:
dict[str,RailSubsample]
- get_summarizer(name)[source]
Get the information about a particular NZ summarization algorithm
- Return type:
dict
- Parameters:
name (str)
- get_summarizers()[source]
Get the dictionary describing all the NZ summarization algorithms
- Return type:
dict
- static load_config(config_file)[source]
Create and return a RailProject from a yaml config file
- Return type:
RailProject
- Parameters:
config_file (str)
- make_pipeline_catalog_commands(pipeline_name, flavor, **kwargs)[source]
Build the commands to run pipeline on a catalog
- Return type:
list[tuple[list[list[str]],str]]
- Parameters:
pipeline_name (str) – Pipeline in question
flavor (str) – Flavor to apply
**kwargs (Any) – Other interpolants, such as selection
- Returns:
List of pairs of series of commands and potential location for slurm batch file
- Return type:
list[tuple[list[list[str]], str]]
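The nested return type is easier to read with a concrete value in hand. The sketch below builds a hand-written value of that shape and walks over it; the commands and script paths are placeholders, not output of `make_pipeline_catalog_commands`, and `render_batches` is a hypothetical helper.

```python
import shlex

# One entry per batch: (series of commands, candidate batch-script path).
# Each command is itself a list of tokens.
CommandBatch = tuple[list[list[str]], str]

# Placeholder data with the documented shape.
batches: list[CommandBatch] = [
    ([["echo", "step one"], ["echo", "step two"]], "scripts/job_0.sh"),
    ([["echo", "only step"]], "scripts/job_1.sh"),
]


def render_batches(items: list[CommandBatch]) -> dict[str, list[str]]:
    """Map each script path to its commands rendered as shell lines."""
    rendered: dict[str, list[str]] = {}
    for commands, script_path in items:
        # shlex.join quotes tokens that contain spaces.
        rendered[script_path] = [shlex.join(tokens) for tokens in commands]
    return rendered


rendered = render_batches(batches)
```

Each outer pair groups the commands that would go into one batch file, which is why the inner element is a list of token lists rather than a single command.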
- make_pipeline_single_input_command(pipeline_name, flavor, **kwargs)[source]
Build the command to run pipeline on a single file
- Return type:
list[str]
- Parameters:
pipeline_name (str) – Pipeline in question
flavor (str) – Flavor to apply
**kwargs (Any) – Other interpolants, such as selection
- Returns:
Tokens in the command line, usable by subprocess.run()
- Return type:
list[str]
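Because the result is a list of tokens, it can be handed to `subprocess.run()` without a shell. The sketch below uses a stand-in command that starts the current Python interpreter, since the real tokens depend on a project configuration that is not available here.

```python
import subprocess
import sys

# Stand-in for the token list returned by
# make_pipeline_single_input_command; the real tokens would differ.
command = [sys.executable, "-c", "print('pipeline would run here')"]

# Passing a list (not a string) avoids shell interpretation of the tokens.
result = subprocess.run(command, check=False, capture_output=True, text=True)

# A return code of 0 indicates success, matching the convention
# used by the run_pipeline_* methods on this page.
status = result.returncode
```

Using `check=False` leaves error handling to the caller, who can inspect `status` in the same way as the integer returned by `run_pipeline_single`.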
- property name: str
- projects:
dict[str,RailProject] = {}
- reduce_data(catalog_template, output_catalog_template, reducer_class_name, input_selection, selection, dry_run=False, **kwargs)[source]
Reduce some data
- Return type:
list[str]
- Parameters:
catalog_template (str) – Tag for the input catalog
output_catalog_template (str) – Which label to apply to output dataset
reducer_class_name (str) – Name of the class to use for reducing the data
input_selection (str) – Selection to use for the input
selection (str) – Selection to apply
dry_run (bool) – If true, do not actually run
**kwargs – Used to provide values for additional interpolants.
- Returns:
Paths to output files
- Return type:
list[str]
- run_pipeline_catalog(pipeline_name, run_mode=RunMode.bash, **kwargs)[source]
Run pipeline on a catalog
- Return type:
int
- Parameters:
pipeline_name (str) – Pipeline in question
run_mode (execution.RunMode) – How to run the pipeline (e.g., in bash, or in slurm)
**kwargs (Any) – Other interpolants, such as selection
- Returns:
0 for success, error code otherwise
- Return type:
int
- run_pipeline_single(pipeline_name, run_mode=RunMode.bash, **kwargs)[source]
Run pipeline on a single file
- Return type:
int
- Parameters:
pipeline_name (str) – Pipeline in question
run_mode (execution.RunMode) – How to run the pipeline (e.g., in bash, or in slurm)
**kwargs (Any) – Other interpolants, such as selection
- Returns:
0 for success, error code otherwise
- Return type:
int
- subsample_data(catalog_template, file_template, subsampler_class_name, subsample_name, dry_run=False, **kwargs)[source]
Subsample some data
- Return type:
str
- Parameters:
catalog_template (str) – Tag for the input catalog
file_template (str) – Which label to apply to output dataset
subsampler_class_name (str) – Name of the class to use for subsampling
subsample_name (str) – Name of the subsample to create
dry_run (bool) – If true, do not actually run
**kwargs – Used to provide values for additional interpolants, e.g., flavor, basename, etc…
- Returns:
Path to output file
- Return type:
str
- wrap_pz_model(path, outdir, **kwargs)[source]
Wrap a pz model file for use by Rubin DM software
- Return type:
int
- Parameters:
path (str) – Path to the model file
outdir (str) – Directory we are writing to
kwargs (Any)
- Returns:
0 for success, error code otherwise
- Return type:
int
- write_yaml(yaml_file)[source]
Write this project to a yaml file
- Return type:
None
- Parameters:
yaml_file (str)
- yaml_tag:
str = 'Project'