Factories

Factory basics

class rail.projects.factory_mixin.RailFactoryMixin

A Factory can make a specific type or types of components, assign a name to each, and keep track of what it has made.

This implements:

  1. having a single instance of each sub-class of factory,

  2. having the factory be able to handle one or more client classes,

  3. creating objects of the sub-classes from yaml,

  4. keeping track of the created objects in dictionaries keyed by name,

  5. writing the current content of the factory to a yaml file.
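
The five behaviours listed above can be sketched in plain Python. This is a minimal illustration only, not the rail implementation: `FactorySketch`, `CatalogFactorySketch`, `Catalog`, and the method names `instance`, `load`, `get`, and `dump` are all hypothetical, and the input is assumed to be yaml that has already been parsed into a list of `{tag: config}` mappings.

```python
class Catalog:
    """Hypothetical client class; `yaml_tag` is the key expected in the input."""

    yaml_tag = "Catalog"

    def __init__(self, name, **config):
        self.name = name
        self.config = config


class FactorySketch:
    """Illustrative stand-in for a factory base class (not the rail code)."""

    client_classes = []  # classes this factory can build
    _instance = None

    def __init__(self):
        # One dictionary of created objects per client class, keyed by name
        self._by_tag = {c.yaml_tag: c for c in self.client_classes}
        self._objects = {tag: {} for tag in self._by_tag}

    @classmethod
    def instance(cls):
        # Look only in this class's own namespace, so that every sub-class
        # gets its own single instance instead of sharing the parent's
        if cls.__dict__.get("_instance") is None:
            cls._instance = cls()
        return cls._instance

    def load(self, items):
        # `items` is the already-parsed yaml: a list of {tag: config} mappings
        for item in items:
            for tag, config in item.items():
                obj = self._by_tag[tag](**config)
                if obj.name in self._objects[tag]:
                    raise KeyError(f"Duplicate {tag} name: {obj.name}")
                self._objects[tag][obj.name] = obj

    def get(self, tag, name):
        return self._objects[tag][name]

    def dump(self):
        # Inverse of `load`: a structure that could be written back to yaml
        return [
            {tag: {"name": name, **obj.config}}
            for tag, objects in self._objects.items()
            for name, obj in objects.items()
        ]


class CatalogFactorySketch(FactorySketch):
    client_classes = [Catalog]


factory = CatalogFactorySketch.instance()
factory.load([{"Catalog": {"name": "truth", "path": "/data/truth.pq"}}])
print(factory.dump())
```

Storing the singleton on each sub-class, rather than on the base class, is what lets several factories coexist while each keeps its own registry of named objects.
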

Specific Factories

Factories

=========================================================  ==============  ===========================  ===========================================================
Factory Class                                              Yaml Tag        Example Yaml File            Managed Classes
=========================================================  ==============  ===========================  ===========================================================
rail.projects.project_file_factory.RailProjectFileFactory  Files           tests/ci_project_files.yaml  RailProjectFileInstance, RailProjectFileTemplate
rail.projects.catalog_factory.RailCatalogFactory           Catalogs        tests/ci_catalogs.yaml       RailProjectCatalogInstance, RailProjectCatalogTemplate
rail.projects.subsample_factory.RailSubsampleFactory       Subsamples      tests/ci_subsamples.yaml     RailSubsample
rail.projects.selection_factory.RailSelectionFactory       Selections      tests/ci_selections.yaml     RailSelection
rail.projects.algorithm_factory.RailAlgorithmFactory       PZAlgorithms    tests/ci_algorithms.yaml     RailPZAlgorithmHolder
                                                           Classifiers                                  RailClassificationAlgorithmHolder
                                                           Summarizers                                  RailSummarizerAlgorithmHolder
                                                           SpecSelections                               RailSpecSelectionAlgorithmHolder
                                                           ErrorModels                                  RailErrorModelAlgorithmHolder
                                                           Subsamplers                                  RailSubsamplerAlgorithmHolder
                                                           Reducers                                     RailReducerAlgorithmHolder
rail.projects.pipeline_factory.RailPipelineFactory         Pipelines       tests/ci_pipelines.yaml      RailPipelineTemplate, RailPipelineInstance
rail.plotting.plotter_factory.RailPlotterFactory           Plots           tests/ci_plots.yaml          RailPlotter, RailPlotterList
rail.plotting.dataset_factory.RailDatasetFactory           Data            tests/ci_datasets.yaml       RailDatasetHolder, RailDatasetListHolder, RailProjectHolder
rail.plotting.plot_group_factory.RailPlotGroupFactory      PlotGroups      tests/ci_plot_groups.yaml    RailPlotGroup
=========================================================  ==============  ===========================  ===========================================================

class rail.projects.algorithm_factory.RailAlgorithmFactory

Factory class to make holders for algorithms

The expected usage is that the user defines a yaml file listing the algorithms they wish to use, with the following example syntax:

SpecSelections:
  - SpecSelection:
      name: zCOSMOS
      Select: SpecSelection_zCOSMOS
      Module: rail.creation.degraders.spectroscopic_selections

PZAlgorithms:
  - PZAlgorithm:
      name: trainz
      Estimate: TrainZEstimator
      Inform: TrainZInformer
      Module: rail.estimation.algos.train_z
  - PZAlgorithm:
      name: simplenn
      Estimate: SklNeurNetEstimator
      Inform: SklNeurNetInformer
      Module: rail.estimation.algos.sklearn_neurnet

and so on.
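
In the entries above, `Module` looks like a dotted Python module path and the other keys (`Estimate`, `Inform`, `Select`) look like class names inside that module. Assuming that reading is right, a holder could resolve the classes with `importlib`, as in the sketch below. The helper `load_class` is hypothetical, and standard-library names stand in for the rail modules so that the example runs without rail installed.

```python
import importlib


def load_class(module_name, class_name):
    """Import `module_name` and return the attribute named `class_name`."""
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


# Hypothetical entry shaped like a PZAlgorithm block. The module and class
# names are standard-library stand-ins, not real rail algorithms.
entry = {
    "name": "demo",
    "Module": "collections",
    "Estimate": "OrderedDict",
    "Inform": "Counter",
}

# Resolve every key other than `name` and `Module` to a class object
classes = {
    key: load_class(entry["Module"], value)
    for key, value in entry.items()
    if key not in ("name", "Module")
}
print(classes)
```

Deferring the import until a class is actually requested would keep the yaml file cheap to load even when it lists algorithms whose packages are not installed.
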

class rail.projects.catalog_factory.RailCatalogFactory

Factory class to make catalogs

The expected usage is that the user defines a yaml file listing the catalogs they wish to use, with the following example syntax:

Catalogs:
  - CatalogTemplate:
      name: truth
      path_template: "{catalogs_dir}/{project}_{sim_version}/{healpix}/part-0.parquet"
      iteration_vars: ['healpix']
  - CatalogTemplate:
      name: reduced
      path_template: "{catalogs_dir}/{project}_{sim_version}_{selection}/{healpix}/part-0.pq"
      iteration_vars: ['healpix']

Or the user can specify particular catalog instances, where everything except the iteration_vars is resolved:

Catalogs:
  - CatalogInstance:
      name: truth_roman_rubin_v1.1.3_gold
      path_template: "full_path_to_catalog/{healpix}/part-0.parquet"
      iteration_vars: ['healpix']
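
The `path_template` strings use Python-style `{name}` placeholders, and `iteration_vars` names the placeholders that take many values. One way such a template could be expanded into concrete paths is sketched below. The function `resolve_paths`, the `iterables` argument, and all the values passed in are hypothetical; how rail resolves templates internally is not shown in this section.

```python
from itertools import product


def resolve_paths(path_template, iteration_vars, iterables, **interpolants):
    """Expand a path template once per combination of iteration values."""
    value_lists = [iterables[var] for var in iteration_vars]
    paths = []
    for combo in product(*value_lists):
        # Fixed interpolants plus the current value of each iteration variable
        values = dict(interpolants, **dict(zip(iteration_vars, combo)))
        paths.append(path_template.format(**values))
    return paths


paths = resolve_paths(
    "{catalogs_dir}/{project}_{sim_version}/{healpix}/part-0.parquet",
    iteration_vars=["healpix"],
    iterables={"healpix": [5060, 5061]},  # hypothetical pixel numbers
    catalogs_dir="/data/catalogs",  # hypothetical directory
    project="roman_rubin",
    sim_version="v1.1.3",
)
for path in paths:
    print(path)
```

Under this reading, a catalog instance is simply a template whose fixed placeholders have already been filled in, leaving only the iteration variables to expand.
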

class rail.projects.pipeline_factory.RailPipelineFactory

Factory class to make pipelines

The expected usage is that the user defines a yaml file listing the pipelines they wish to use, with the following example syntax:

Pipelines:
  - PipelineTemplate:
      name: pz
      pipeline_class: rail.pipelines.estimation.pz_all.PzPipeline
      input_catalog_template: degraded
      output_catalog_template: degraded
      input_file_templates:
        input_train:
          flavor: baseline
          tag: train
        input_test:
          flavor: baseline
          tag: test
      kwargs:
        algorithms: ['all']

class rail.projects.project_file_factory.RailProjectFileFactory

Factory class to make files

The expected usage is that the user defines a yaml file listing the files they wish to use, with the following example syntax:

Files:
  - FileTemplate:
      name: test_file_100k
      path_template: "{catalogs_dir}/test/{project}_{selection}_baseline_100k.hdf5"

Or the user can specify particular file instances, where everything except the iteration_vars is resolved:

Files:
  - FileInstance:
      name: test_file_100k_roman_rubin_v1.1.3_gold
      path: <full_path_to_file>

class rail.projects.selection_factory.RailSelectionFactory

Factory class to make selections

The expected usage is that the user defines a yaml file listing the selections they wish to use, with the following example syntax:

Selections:
  - Selection:
      name: maglim_25.5
      cuts:
        maglim_i: [null, 25.5]
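
The `cuts` entry pairs a key with a two-element list, which reads naturally as a `[lower, upper]` range in which `null` leaves that side open. The sketch below applies cuts under that assumption. Treating the key as a column name, treating the bounds as inclusive, and the helper name `passes_cuts` are all guesses made for illustration, not behaviour confirmed by this section.

```python
def passes_cuts(row, cuts):
    """Return True if `row` lies inside every [lower, upper] range in `cuts`."""
    for column, (lower, upper) in cuts.items():
        value = row[column]
        # None (yaml null) means that side of the range is unbounded
        if lower is not None and value < lower:
            return False
        if upper is not None and value > upper:
            return False
    return True


# The cut from the example above, with yaml null parsed as None
cuts = {"maglim_i": [None, 25.5]}

# Hypothetical rows keyed by the same name as the cut
rows = [{"maglim_i": 24.0}, {"maglim_i": 26.1}, {"maglim_i": 25.5}]

selected = [row for row in rows if passes_cuts(row, cuts)]
print(selected)
```
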

class rail.projects.subsample_factory.RailSubsampleFactory

Factory class to make subsamples

The expected usage is that the user defines a yaml file listing the subsamples they wish to use, with the following example syntax:

Subsamples:
  - Subsample:
      name: test_100k
      seed: 1234
      num_objects: 100000
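
Recording a `seed` next to `num_objects` suggests that a subsample is meant to be reproducible: the same seed should pick the same objects every time. The sketch below shows that idea with the standard `random` module. The function `subsample_indices` and the catalog size of one million rows are hypothetical, and the random number generator rail actually uses is not specified here.

```python
import random


def subsample_indices(n_total, num_objects, seed):
    """Pick `num_objects` distinct row indices out of `n_total`, reproducibly."""
    rng = random.Random(seed)  # private generator, so global state is untouched
    return sorted(rng.sample(range(n_total), num_objects))


# Values from the example above; the total of 1,000,000 rows is an assumption
first = subsample_indices(1_000_000, num_objects=100_000, seed=1234)
second = subsample_indices(1_000_000, num_objects=100_000, seed=1234)

print(len(first), first == second)
```
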

class rail.plotting.plotter_factory.RailPlotterFactory

Factory class to make plotters

Expected usage is that user will define a yaml file with the various plotters that they wish to use with the following example syntax:

Plots:
  - Plotter:
      name: zestimate_v_ztrue_hist2d
      class_name: rail.plotters.pz_plotters.PZPlotterPointEstimateVsTrueHist2D
      z_min: 0.0
      z_max: 3.0
      n_zbins: 150
  - Plotter:
      name: zestimate_v_ztrue_profile
      class_name: rail.plotters.pz_plotters.PZPlotterPointEstimateVsTrueProfile
      z_min: 0.0
      z_max: 3.0
      n_zbins: 60

The plotters can then be grouped into lists that can be run over particular types of data, using the following example syntax:

Plots:
  - PlotterList:
      name: z_estimate_v_z_true
      plotters:
        - zestimate_v_ztrue_hist2d
        - zestimate_v_ztrue_profile

class rail.plotting.dataset_factory.RailDatasetFactory

Factory class to make datasets

Expected usage is that user will define a yaml file with the various datasets that they wish to use with the following example syntax:

Data:
  - Project:
      name: some_project
      yaml_file: /path/to/rail_project_file
  - Dataset:
      name: gold_baseline_test
      class: rail.plotting.project_dataset_holder.RailProjectDatasetHolder
      extractor: rail.plotting.pz_data_extractor.PZPointEstimateDataExtractor
      project: some_project
      selection: gold
      flavor: baseline
      tag: test
      algos: ['all']
  - Dataset:
      name: blend_baseline_test
      class: rail.plotting.project_dataset_holder.RailProjectDatasetHolder
      extractor: rail.plotting.pz_data_extractor.PZPointEstimateDataExtractor
      project: some_project
      selection: blend
      flavor: baseline
      tag: test
      algos: ['all']

The datasets can then be grouped into lists that can be run over particular types of data, using the following example syntax:

Data:
  - DatasetList:
      name: baseline_test
      datasets:
        - gold_baseline_test
        - blend_baseline_test

class rail.plotting.plot_group_factory.RailPlotGroupFactory

Factory class to make plot_groups

The yaml file should look something like this:

Includes:
  - <path_to_yaml_file_defining_plotter_lists>
  - <path_to_yaml_file_defining_dataset_lists>

PlotGroups:
  - PlotGroup:
      name: some_name
      plotter_list_name: nice_plots
      dataset_dict_name: nice_data
  - PlotGroup:
      name: some_other_name
      plotter_list_name: janky_plots
      dataset_dict_name: janky_data
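
Each plot group names one plotter list and one collection of datasets. A plausible interpretation, assumed here rather than confirmed by the text, is that every plotter in the list is run over every dataset in the collection. The sketch below expands a group into those (plotter, dataset) pairs. The dictionaries stand in for the factories' registries, their contents are illustrative, and `expand_plot_group` is a hypothetical helper.

```python
from itertools import product

# Stand-ins for the registries a plotter factory and dataset factory would keep
plotter_lists = {
    "nice_plots": ["zestimate_v_ztrue_hist2d", "zestimate_v_ztrue_profile"],
}
dataset_lists = {
    "nice_data": ["gold_baseline_test", "blend_baseline_test"],
}

# One entry shaped like the PlotGroup block above
plot_group = {
    "name": "some_name",
    "plotter_list_name": "nice_plots",
    "dataset_dict_name": "nice_data",
}


def expand_plot_group(group, plotter_lists, dataset_lists):
    """Return every (plotter, dataset) pair that the group refers to."""
    plotters = plotter_lists[group["plotter_list_name"]]
    datasets = dataset_lists[group["dataset_dict_name"]]
    return list(product(plotters, datasets))


pairs = expand_plot_group(plot_group, plotter_lists, dataset_lists)
for plotter, dataset in pairs:
    print(f"{plot_group['name']}: {plotter} on {dataset}")
```

Looking names up in the registries fails with a `KeyError` when a group refers to a list that was never defined, which is one reason the files listed under `Includes` need to be loaded before the plot groups themselves.
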