Model_A API¶
Package iabm¶
Reusable package components for Model_A industrial-state identification.
- class iabm.CrossValidationResult(scores)[source]¶
Bases: object
Summarize fold-wise validation scores produced by the classifier API.
Exposing the result as a dataclass makes downstream reporting clearer and keeps aggregate statistics close to the original fold-level scores.
- Parameters:
scores (ndarray)
- scores: ndarray¶
- property mean: float¶
Return the average score across all folds.
- property std: float¶
Return the standard deviation across all folds.
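As a rough illustration of this pattern, the sketch below shows a frozen dataclass that stores fold scores and derives its summary statistics on demand. It is not the package's implementation: the name `FoldScores` is hypothetical, a plain tuple stands in for the `ndarray`, and the use of the population standard deviation is an assumption, since this page does not say which estimator `std` uses.

```python
from dataclasses import dataclass
from statistics import fmean, pstdev
from typing import Tuple


@dataclass(frozen=True)
class FoldScores:
    """Hypothetical stand-in for a fold-score container."""

    scores: Tuple[float, ...]

    @property
    def mean(self) -> float:
        # Average score across all folds.
        return fmean(self.scores)

    @property
    def std(self) -> float:
        # Population standard deviation (assumed; the real class may differ).
        return pstdev(self.scores)


result = FoldScores(scores=(0.90, 0.92, 0.88, 0.91, 0.89))
print(f"accuracy: {result.mean:.3f} +/- {result.std:.3f}")
```

Keeping the raw scores on the object means a report can show both the aggregate and the individual folds without recomputing anything.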
- class iabm.InferenceDataset(features, active_mask, source_frame)[source]¶
Bases: object
Bundle inference-ready features together with activity bookkeeping.
The source frame and activity mask allow the CLI to reconstruct outputs aligned with the original timestamps, including optional inactive rows.
- Parameters:
features (DataFrame)
active_mask (Series)
source_frame (DataFrame)
- features: DataFrame¶
- active_mask: Series¶
- source_frame: DataFrame¶
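The role of the activity mask can be sketched as follows: predictions exist only for active rows, and the mask says where each one belongs in the original window. This is a minimal sketch under stated assumptions, not the CLI's code. The helper `align_predictions` and its `inactive_label` argument are hypothetical, and plain lists stand in for the pandas objects.

```python
from typing import List, Optional, Sequence


def align_predictions(
    active_mask: Sequence[bool],
    predictions: Sequence[str],
    inactive_label: Optional[str] = None,
) -> List[Optional[str]]:
    """Expand predictions for active rows to the full source length."""
    if sum(active_mask) != len(predictions):
        raise ValueError("one prediction is required per active row")
    remaining = iter(predictions)
    # Consume one prediction per active row; fill inactive rows with a placeholder.
    return [next(remaining) if is_active else inactive_label for is_active in active_mask]


mask = [True, False, True, True, False]
aligned = align_predictions(mask, ["run", "idle", "run"], inactive_label="inactive")
print(aligned)
```

The output has one entry per source row, so it can be written next to the original timestamps without any re-indexing.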
- class iabm.IndustrialDataProcessor(analog_path, digital_path=None, *, threshold=50.0, feature_columns=None)[source]¶
Bases: object
Prepare industrial analog and digital signals for Model_A workflows.
The processor encapsulates the study-specific preprocessing rules so the rest of the package can work with clean training and inference datasets through a stable, object-oriented interface.
- Parameters:
analog_path (str)
digital_path (Optional[str])
threshold (float)
feature_columns (Optional[Sequence[str]])
- DEFAULT_FEATURE_COLUMNS = ['Vrms1', 'Vrms2', 'Vrms3', 'Irms1', 'Irms2', 'Irms3', 'PF1', 'PF2', 'PF3']¶
- POWER_COLUMNS = ['RP1', 'RP2', 'RP3']¶
- THREE_PHASE_BLOCKS = [['Vrms1', 'Vrms2', 'Vrms3'], ['RP1', 'RP2', 'RP3'], ['Irms1', 'Irms2', 'Irms3'], ['PF1', 'PF2', 'PF3']]¶
- SINGLE_PHASE_BLOCKS = [['Vrms4'], ['RP4'], ['Irms4'], ['PF4']]¶
- prepare_training_data(start, end)[source]¶
Return supervised features and labels for the requested time range.
- Parameters:
start (str) – Inclusive lower timestamp bound.
end (str) – Inclusive upper timestamp bound.
- Returns:
A TrainingDataset containing active rows only, with labels synchronized from the digital signal stream.
- Return type:
TrainingDataset
- prepare_inference_data(start, end, *, drop_inactive=True)[source]¶
Return inference-ready analog features without requiring digital labels.
- Parameters:
start (str) – Inclusive lower timestamp bound.
end (str) – Inclusive upper timestamp bound.
drop_inactive (bool) – Whether to keep only rows above the activity threshold.
- Returns:
An InferenceDataset with the feature matrix, a boolean mask identifying active rows, and the imputed source analog window.
- Return type:
InferenceDataset
- prepare_evaluation_data(start, end)[source]¶
Return aligned features and optional labels for model evaluation.
- Parameters:
start (str) – Inclusive lower timestamp bound.
end (str) – Inclusive upper timestamp bound.
- Returns:
An EvaluationDataset containing the active feature matrix, optional real labels aligned to the full analog window, the activity mask, and the imputed source analog frame.
- Return type:
EvaluationDataset
- class iabm.StateClassifier(model_type='rf', params=None, translator=None)[source]¶
Bases: object
High-level wrapper around the estimator lifecycle used by Model_A.
The class keeps scaling, label encoding, validation, persistence, and inference in one cohesive object so command-line orchestration stays thin and future model variants can share the same interface.
- Parameters:
model_type (str)
params (Optional[Dict[str, Any]])
translator (Optional[Callable[[str], str]])
- fit(X, y)[source]¶
Fit the scaler and estimator and return the training accuracy.
- Parameters:
X (DataFrame) – Training feature matrix.
y (Series | ndarray) – Original training labels.
- Returns:
In-sample accuracy measured on the fitted training data.
- Return type:
float
- cross_validate(X, y, *, splits=5, shuffle=True, random_state=42)[source]¶
Evaluate the configured estimator with stratified cross-validation.
- Parameters:
X (DataFrame) – Feature matrix.
y (Series | ndarray) – Original state labels before encoding.
splits (int) – Number of folds in the validation scheme.
shuffle (bool) – Whether to shuffle the samples before splitting them into folds.
random_state (int) – Seed used when shuffling folds.
- Returns:
A CrossValidationResult with per-fold scores and summary statistics.
- Return type:
CrossValidationResult
- predict(X)[source]¶
Predict original-state labels for new analog observations.
- Parameters:
X (DataFrame) – Inference feature matrix.
- Returns:
Predicted labels mapped back to the original state identifiers.
- Return type:
ndarray
- predict_proba(X)[source]¶
Return class probabilities aligned with the original label space.
- Parameters:
X (DataFrame) – Inference feature matrix.
- Returns:
A two-dimensional array whose columns follow self.label_encoder.classes_.
- Return type:
ndarray
- save(file_path)[source]¶
Persist the full inference artifact required for later reuse.
The saved payload contains every object needed to run predictions on unseen data without retraining: estimator, scaler, label encoder, and feature ordering metadata.
- Parameters:
file_path (str)
- Return type:
None
- classmethod load(file_path, translator=None)[source]¶
Restore a persisted classifier artifact from disk.
- Parameters:
file_path (str) – Serialized artifact path created with save().
translator (Callable[[str], str] | None) – Optional translation function for user-facing errors.
- Returns:
A ready-to-use StateClassifier instance.
- Return type:
StateClassifier
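The single-artifact idea behind save() and load() can be sketched as a dictionary that travels to disk as one file. This is an illustration only. The serialization format is not stated on this page, so `pickle` is an assumption, and the helper names `save_artifact` and `load_artifact` and the payload keys are hypothetical.

```python
import pickle
import tempfile
from pathlib import Path
from typing import Any, Dict, List


def save_artifact(path: Path, estimator: Any, scaler: Any, label_encoder: Any,
                  feature_columns: List[str]) -> None:
    """Write every inference dependency to a single file."""
    payload: Dict[str, Any] = {
        "estimator": estimator,
        "scaler": scaler,
        "label_encoder": label_encoder,
        "feature_columns": feature_columns,  # preserves the column order used at fit time
    }
    with path.open("wb") as handle:
        pickle.dump(payload, handle)


def load_artifact(path: Path) -> Dict[str, Any]:
    """Read the payload back; only unpickle files you trust."""
    with path.open("rb") as handle:
        return pickle.load(handle)


with tempfile.TemporaryDirectory() as folder:
    artifact_path = Path(folder) / "model.pkl"
    # Plain Python values stand in for the fitted objects.
    save_artifact(artifact_path, {"kind": "rf"}, {"mean": [0.0]}, ["idle", "run"], ["Vrms1", "Irms1"])
    restored = load_artifact(artifact_path)

print(sorted(restored))
```

Storing the feature ordering next to the estimator guards against silently feeding columns in a different order at prediction time.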
- class iabm.TrainingDataset(features, labels)[source]¶
Bases: object
Bundle supervised features and labels for classifier training.
The dataclass keeps the public API explicit and avoids passing loosely coupled tuples around the codebase when training workflows evolve.
- Parameters:
features (DataFrame)
labels (Series)
- features: DataFrame¶
- labels: Series¶
Data Processing¶
Data preparation utilities for Model_A industrial-state classifiers.
- class iabm.data_processor.TrainingDataset(features, labels)[source]¶
Bases: object
Bundle supervised features and labels for classifier training.
The dataclass keeps the public API explicit and avoids passing loosely coupled tuples around the codebase when training workflows evolve.
- Parameters:
features (DataFrame)
labels (Series)
- features: DataFrame¶
- labels: Series¶
- class iabm.data_processor.InferenceDataset(features, active_mask, source_frame)[source]¶
Bases: object
Bundle inference-ready features together with activity bookkeeping.
The source frame and activity mask allow the CLI to reconstruct outputs aligned with the original timestamps, including optional inactive rows.
- Parameters:
features (DataFrame)
active_mask (Series)
source_frame (DataFrame)
- features: DataFrame¶
- active_mask: Series¶
- source_frame: DataFrame¶
- class iabm.data_processor.EvaluationDataset(features, labels, active_mask, source_frame)[source]¶
Bases: object
Bundle features, labels, and alignment data for quality assessment.
- Parameters:
features (DataFrame)
labels (Series | None)
active_mask (Series)
source_frame (DataFrame)
- features: DataFrame¶
- labels: Series | None¶
- active_mask: Series¶
- source_frame: DataFrame¶
- class iabm.data_processor.IndustrialDataProcessor(analog_path, digital_path=None, *, threshold=50.0, feature_columns=None)[source]¶
Bases: object
Prepare industrial analog and digital signals for Model_A workflows.
The processor encapsulates the study-specific preprocessing rules so the rest of the package can work with clean training and inference datasets through a stable, object-oriented interface.
- Parameters:
analog_path (str)
digital_path (Optional[str])
threshold (float)
feature_columns (Optional[Sequence[str]])
- DEFAULT_FEATURE_COLUMNS = ['Vrms1', 'Vrms2', 'Vrms3', 'Irms1', 'Irms2', 'Irms3', 'PF1', 'PF2', 'PF3']¶
- POWER_COLUMNS = ['RP1', 'RP2', 'RP3']¶
- THREE_PHASE_BLOCKS = [['Vrms1', 'Vrms2', 'Vrms3'], ['RP1', 'RP2', 'RP3'], ['Irms1', 'Irms2', 'Irms3'], ['PF1', 'PF2', 'PF3']]¶
- SINGLE_PHASE_BLOCKS = [['Vrms4'], ['RP4'], ['Irms4'], ['PF4']]¶
- prepare_training_data(start, end)[source]¶
Return supervised features and labels for the requested time range.
- Parameters:
start (str) – Inclusive lower timestamp bound.
end (str) – Inclusive upper timestamp bound.
- Returns:
A TrainingDataset containing active rows only, with labels synchronized from the digital signal stream.
- Return type:
TrainingDataset
- prepare_inference_data(start, end, *, drop_inactive=True)[source]¶
Return inference-ready analog features without requiring digital labels.
- Parameters:
start (str) – Inclusive lower timestamp bound.
end (str) – Inclusive upper timestamp bound.
drop_inactive (bool) – Whether to keep only rows above the activity threshold.
- Returns:
An InferenceDataset with the feature matrix, a boolean mask identifying active rows, and the imputed source analog window.
- Return type:
InferenceDataset
- prepare_evaluation_data(start, end)[source]¶
Return aligned features and optional labels for model evaluation.
- Parameters:
start (str) – Inclusive lower timestamp bound.
end (str) – Inclusive upper timestamp bound.
- Returns:
An EvaluationDataset containing the active feature matrix, optional real labels aligned to the full analog window, the activity mask, and the imputed source analog frame.
- Return type:
EvaluationDataset
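The processor's notion of active rows can be pictured with a small threshold filter. The rule used here is an assumption made purely for illustration: a row counts as active when the sum of its POWER_COLUMNS values is strictly greater than the threshold. This page does not document the actual rule. The helper `activity_mask` is hypothetical, and plain dictionaries stand in for DataFrame rows.

```python
from typing import Dict, List, Sequence

POWER_COLUMNS = ["RP1", "RP2", "RP3"]


def activity_mask(rows: Sequence[Dict[str, float]], threshold: float = 50.0) -> List[bool]:
    """Flag rows whose summed power exceeds the threshold (assumed rule)."""
    return [sum(row[column] for column in POWER_COLUMNS) > threshold for row in rows]


rows = [
    {"RP1": 30.0, "RP2": 25.0, "RP3": 20.0},  # total 75.0: active
    {"RP1": 5.0, "RP2": 4.0, "RP3": 3.0},     # total 12.0: inactive
    {"RP1": 20.0, "RP2": 20.0, "RP3": 10.0},  # total 50.0: inactive (not strictly above)
]
mask = activity_mask(rows)
# Keeping only flagged rows mirrors what drop_inactive=True is described as doing.
active_rows = [row for row, keep in zip(rows, mask) if keep]
print(mask, len(active_rows))
```

Returning the mask alongside the filtered rows lets later steps place results back on the full time window.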
Models¶
Model abstractions for Model_A industrial-state classifiers.
- class iabm.models.CrossValidationResult(scores)[source]¶
Bases: object
Summarize fold-wise validation scores produced by the classifier API.
Exposing the result as a dataclass makes downstream reporting clearer and keeps aggregate statistics close to the original fold-level scores.
- Parameters:
scores (ndarray)
- scores: ndarray¶
- property mean: float¶
Return the average score across all folds.
- property std: float¶
Return the standard deviation across all folds.
- class iabm.models.FoldLabelEncoderClassifier(estimator)[source]¶
Bases: BaseEstimator, ClassifierMixin
Wrap an estimator so each fit uses fold-local contiguous class labels.
XGBoost expects class labels presented during fit to be contiguous integers starting at zero. During cross-validation, some training folds may not contain every class present in the global dataset, which makes a globally encoded target vector invalid for that fold. This wrapper applies a fresh label encoding on every fit and maps predictions back to the original labels expected by scikit-learn scorers.
- Parameters:
estimator (BaseEstimator)
- fit(X, y)[source]¶
Fit the wrapped estimator with a fold-local label encoding.
- Parameters:
X (DataFrame | ndarray) – Fold-local feature matrix.
y (Series | ndarray) – Fold-local label vector.
- Returns:
The fitted wrapper instance.
- Return type:
FoldLabelEncoderClassifier
- predict(X)[source]¶
Predict labels and map them back to the original fold label space.
- Parameters:
X (DataFrame | ndarray) – Fold-local feature matrix.
- Returns:
Predictions expressed in the original label space expected by the scoring function.
- Return type:
ndarray
- get_params(deep=True)[source]¶
Expose wrapped-estimator parameters for scikit-learn compatibility.
- Parameters:
deep (bool) – Whether to include nested estimator parameters.
- Returns:
A parameter dictionary compatible with scikit-learn cloning.
- Return type:
Dict[str, Any]
- set_params(**params)[source]¶
Propagate parameter updates to the wrapped estimator when requested.
- Parameters:
**params (Any) – Wrapper or nested estimator parameters.
- Returns:
The updated wrapper instance.
- Return type:
FoldLabelEncoderClassifier
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to score.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
self (FoldLabelEncoderClassifier)
sample_weight (bool | None | str) – Metadata routing for the sample_weight parameter in score. Accepts True, False, None, or a string alias; the default is sklearn.utils.metadata_routing.UNCHANGED.
- Returns:
The updated object.
- Return type:
FoldLabelEncoderClassifier
Bases: object
High-level wrapper around the estimator lifecycle used by Model_A.
The class keeps scaling, label encoding, validation, persistence, and inference in one cohesive object so command-line orchestration stays thin and future model variants can share the same interface.
- Parameters:
model_type (str)
params (Optional[Dict[str, Any]])
translator (Optional[Callable[[str], str]])
- fit(X, y)[source]¶
Fit the scaler and estimator and return the training accuracy.
- Parameters:
X (DataFrame) – Training feature matrix.
y (Series | ndarray) – Original training labels.
- Returns:
In-sample accuracy measured on the fitted training data.
- Return type:
float
- cross_validate(X, y, *, splits=5, shuffle=True, random_state=42)[source]¶
Evaluate the configured estimator with stratified cross-validation.
- Parameters:
X (DataFrame) – Feature matrix.
y (Series | ndarray) – Original state labels before encoding.
splits (int) – Number of folds in the validation scheme.
shuffle (bool) – Whether to shuffle the samples before splitting them into folds.
random_state (int) – Seed used when shuffling folds.
- Returns:
A CrossValidationResult with per-fold scores and summary statistics.
- Return type:
CrossValidationResult
- predict(X)[source]¶
Predict original-state labels for new analog observations.
- Parameters:
X (DataFrame) – Inference feature matrix.
- Returns:
Predicted labels mapped back to the original state identifiers.
- Return type:
ndarray
- predict_proba(X)[source]¶
Return class probabilities aligned with the original label space.
- Parameters:
X (DataFrame) – Inference feature matrix.
- Returns:
A two-dimensional array whose columns follow self.label_encoder.classes_.
- Return type:
ndarray
- save(file_path)[source]¶
Persist the full inference artifact required for later reuse.
The saved payload contains every object needed to run predictions on unseen data without retraining: estimator, scaler, label encoder, and feature ordering metadata.
- Parameters:
file_path (str)
- Return type:
None
- classmethod load(file_path, translator=None)[source]¶
Restore a persisted classifier artifact from disk.
- Parameters:
file_path (str) – Serialized artifact path created with save().
translator (Callable[[str], str] | None) – Optional translation function for user-facing errors.
- Returns:
A ready-to-use StateClassifier instance.
- Return type:
StateClassifier
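To make the stratification behind cross_validate concrete, the sketch below deals the sample indices of each class round-robin into folds, so every fold keeps roughly the overall class proportions. It is a simplified illustration and not the algorithm scikit-learn or this package uses. The function name `stratified_folds` is hypothetical; its parameters merely mirror `splits`, `shuffle`, and `random_state` from the signature above.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence


def stratified_folds(labels: Sequence[str], splits: int = 5, shuffle: bool = True,
                     random_state: int = 42) -> List[List[int]]:
    """Assign sample indices to folds while preserving class proportions."""
    rng = random.Random(random_state)
    by_class: Dict[str, List[int]] = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    folds: List[List[int]] = [[] for _ in range(splits)]
    for label in sorted(by_class):
        indices = by_class[label]
        if shuffle:
            # Shuffle the samples of this class, then deal them out in turn.
            rng.shuffle(indices)
        for position, index in enumerate(indices):
            folds[position % splits].append(index)
    return [sorted(fold) for fold in folds]


labels = ["run"] * 10 + ["idle"] * 5
folds = stratified_folds(labels, splits=5)
for fold in folds:
    print(fold, [labels[i] for i in fold])
```

With ten "run" samples and five "idle" samples, each of the five folds receives two "run" rows and one "idle" row, and a fixed seed makes the assignment repeatable.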
Command-Line Interface¶
Command-line entry point for training and using Model_A classifiers.
- iabm.main.parse_arguments(translator)[source]¶
Build the CLI parser with translated help messages.
- Parameters:
translator (Callable[[str], str]) – Translation function returned by setup_i18n().
- Returns:
Parsed command-line arguments ready to drive the main workflow.
- Return type:
Namespace
- iabm.main.main()[source]¶
Run the end-to-end Model_A workflow for training or prediction.
The entry point keeps orchestration concerns in one place while delegating data preparation and model lifecycle logic to their respective classes.
Training mode prepares labeled features, runs cross-validation, fits the final classifier, and persists both the model artifact and fold metrics. Prediction mode loads a previously trained artifact and applies it to a new analog time window without requiring digital labels at inference time.
- Return type:
None
Utilities¶
Internationalization helpers for the Model_A command-line interface.
- iabm.utils.setup_i18n(lang='en')[source]¶
Return a translation function for the requested interface language.
The project stores human-maintained translations in
locales/*/LC_MESSAGESas.pofiles. This helper reads those catalogs directly so the CLI can be translated even when.mofiles have not been compiled yet.- Parameters:
lang (str) – ISO language code requested by the user.
- Returns:
A callable compatible with
gettextusage that translates a message identifier into the configured language. English falls back to the original message identifiers.- Return type:
Callable[[str], str]