Model_B API

Package iabm_behavior

Reusable components for Model_B behavioral sequence analysis.

class iabm_behavior.ActiveSequence(start_time, end_time, states, total_duration_seconds, run_count)[source]

Bases: object

Represent an active behavioral sequence: a group of consecutive non-zero state runs.

Parameters:
  • start_time (Timestamp)

  • end_time (Timestamp)

  • states (tuple[int, ...])

  • total_duration_seconds (float)

  • run_count (int)

start_time

Timestamp of the first active run.

Type:

pandas.Timestamp

end_time

Timestamp of the last active run.

Type:

pandas.Timestamp

states

Ordered tuple of state identifiers composing the sequence.

Type:

tuple[int, ...]

total_duration_seconds

Total duration of the sequence in seconds.

Type:

float

run_count

Number of state runs contained in the sequence.

Type:

int

start_time: Timestamp
end_time: Timestamp
states: tuple[int, ...]
total_duration_seconds: float
run_count: int

class iabm_behavior.BehavioralSequenceAnalyzer(state_column='Predicted_State')[source]

Bases: object

Load state timelines and derive run- and sequence-level behavior features.

The analyzer transforms Model_A predictions or digital state traces into contiguous runs and active state sequences. It also includes a lightweight smoothing step, inspired by the legacy scripts, that merges very short transient runs into the following state to reduce noise.

Parameters:

state_column (str)

load_state_timeline(file_path)[source]

Load a time-indexed state timeline from CSV, Excel, or Parquet.

Parameters:

file_path (str | Path) – Path to the state timeline file.

Returns:

A DataFrame indexed by timestamps and containing the configured state column.

Return type:

DataFrame

smooth_short_runs(timeline, *, min_duration_seconds=1.0, min_samples=1)[source]

Merge short transient runs into the following run when possible.

Parameters:
  • timeline (DataFrame) – Input state timeline.

  • min_duration_seconds (float) – Duration threshold in seconds; runs no longer than this are treated as transient noise.

  • min_samples (int) – Sample-count threshold; runs with no more samples than this are treated as transient noise.

Returns:

A copy of the timeline with eligible short runs reassigned.

Return type:

DataFrame
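As a rough illustration of the smoothing idea, the sketch below reassigns short runs to the state of the run that follows them. It is not the library's implementation: it works on a plain list of state labels instead of a DataFrame, applies only a sample-count threshold (the duration threshold is omitted), and the function name smooth_states and its max_samples argument are hypothetical.

```python
from itertools import groupby


def smooth_states(states, max_samples=1):
    """Reassign runs of at most `max_samples` samples to the following run's state."""
    # Run-length encode the labels as [state, length] pairs.
    runs = [[state, len(list(group))] for state, group in groupby(states)]
    # Walk backwards so a short run inherits the already-smoothed next state.
    for i in range(len(runs) - 2, -1, -1):
        if runs[i][1] <= max_samples:
            runs[i][0] = runs[i + 1][0]
    # The final run has no following run, so it is left unchanged.
    smoothed = []
    for state, length in runs:
        smoothed.extend([state] * length)
    return smoothed


noisy = [0, 0, 0, 2, 1, 1, 1, 0, 0]
print(smooth_states(noisy))
```

In this sketch the single-sample run of state 2 is absorbed by the run of state 1 that follows it, and the output keeps the same length as the input.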

extract_runs(timeline)[source]

Convert a state timeline into contiguous state runs.

Parameters:

timeline (DataFrame) – Time-indexed DataFrame containing the configured state column.

Returns:

A list of StateRun objects ordered by time.

Return type:

list[StateRun]
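A minimal sketch of run extraction, assuming time-ordered samples given as (timestamp, state) pairs rather than a DataFrame. The Run tuple and runs_from_samples function are hypothetical stand-ins for StateRun and extract_runs; in particular, measuring duration as last sample minus first sample is an assumption.

```python
from datetime import datetime, timedelta
from itertools import groupby
from typing import NamedTuple


class Run(NamedTuple):
    """Hypothetical stand-in mirroring the documented StateRun fields."""
    state: int
    start_time: datetime
    end_time: datetime
    sample_count: int
    duration_seconds: float


def runs_from_samples(samples):
    """Group time-ordered (timestamp, state) pairs into contiguous runs."""
    runs = []
    for state, group in groupby(samples, key=lambda pair: pair[1]):
        stamps = [stamp for stamp, _ in group]
        # Assumption: duration is the last sample time minus the first.
        duration = (stamps[-1] - stamps[0]).total_seconds()
        runs.append(Run(state, stamps[0], stamps[-1], len(stamps), duration))
    return runs


t0 = datetime(2024, 1, 1)
states = [0, 0, 1, 1, 1, 2, 0]
samples = [(t0 + timedelta(seconds=i), s) for i, s in enumerate(states)]
runs = runs_from_samples(samples)
print([(r.state, r.sample_count, r.duration_seconds) for r in runs])
```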

extract_active_sequences(timeline)[source]

Group consecutive non-zero runs into active behavioral sequences.

Parameters:

timeline (DataFrame) – Time-indexed DataFrame containing state values.

Returns:

A list of active sequences. Each sequence stores the ordered state pattern and its total duration.

Return type:

list[ActiveSequence]
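The grouping of non-zero runs can be sketched as follows. This is an illustration only: runs are given as plain (state, duration_seconds) pairs, the active_sequences function is hypothetical, and summing run durations to obtain the total duration is an assumption about how the real method computes it.

```python
from itertools import groupby


def active_sequences(runs):
    """Group consecutive non-zero (state, duration_seconds) runs into sequences."""
    sequences = []
    # Runs are keyed by "is active"; each active group becomes one sequence.
    for is_active, group in groupby(runs, key=lambda run: run[0] != 0):
        if not is_active:
            continue
        group = list(group)
        sequences.append({
            "states": tuple(state for state, _ in group),
            # Assumption: total duration is the sum of the run durations.
            "total_duration_seconds": sum(duration for _, duration in group),
            "run_count": len(group),
        })
    return sequences


runs = [(0, 5.0), (1, 2.0), (3, 1.5), (0, 4.0), (2, 3.0), (0, 1.0)]
sequences = active_sequences(runs)
for sequence in sequences:
    print(sequence)
```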

summarize_sequence_words(sequences)[source]

Count repeated behavioral words extracted from active sequences.

Parameters:

sequences (Iterable[ActiveSequence]) – Iterable of active sequences.

Returns:

A DataFrame with the sequence word, occurrence count, and average duration.

Return type:

DataFrame
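Counting sequence words amounts to grouping sequences by their state tuple. The sketch below assumes each sequence is reduced to a (states, duration_seconds) pair and returns plain tuples instead of a DataFrame; summarize_words is a hypothetical name.

```python
from collections import defaultdict


def summarize_words(sequences):
    """Return (word, count, average duration) rows for each distinct state tuple."""
    durations = defaultdict(list)
    for states, duration in sequences:
        durations[tuple(states)].append(duration)
    # Rows follow first-seen order of the words.
    return [
        (word, len(values), sum(values) / len(values))
        for word, values in durations.items()
    ]


sequences = [((1, 3), 4.0), ((2,), 3.0), ((1, 3), 6.0)]
summary = summarize_words(sequences)
for row in summary:
    print(row)
```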

build_nominal_reference(sequences)[source]

Build nominal references from repeated behavioral words.

Parameters:

sequences (Iterable[ActiveSequence]) – Historical active sequences representing nominal behavior.

Returns:

A list of nominal references sorted by occurrence count and duration.

Return type:

list[NominalSequenceReference]
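One way to picture the nominal reference step is shown below. Several details are assumptions rather than documented behavior: the min_count cutoff that defines a "repeated" word, the descending sort order, and the NominalWord and nominal_words names, which stand in for NominalSequenceReference and build_nominal_reference.

```python
from collections import defaultdict
from typing import NamedTuple


class NominalWord(NamedTuple):
    """Hypothetical stand-in mirroring the NominalSequenceReference fields."""
    states: tuple
    count: int
    avg_duration_seconds: float


def nominal_words(sequences, min_count=2):
    """Keep words seen at least `min_count` times, ordered by count then duration."""
    durations = defaultdict(list)
    for states, duration in sequences:
        durations[tuple(states)].append(duration)
    words = [
        NominalWord(states, len(values), sum(values) / len(values))
        for states, values in durations.items()
        if len(values) >= min_count
    ]
    # Assumption: most frequent first, ties broken by longer average duration.
    return sorted(words, key=lambda w: (w.count, w.avg_duration_seconds), reverse=True)


history = [((1, 2), 4.0), ((1, 2), 6.0), ((3,), 2.0),
           ((3,), 2.0), ((3,), 5.0), ((4,), 1.0)]
references = nominal_words(history)
for reference in references:
    print(reference)
```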

compare_to_nominal(observed_sequences, nominal_references, *, anomaly_threshold=1.0)[source]

Compare observed sequences against the closest nominal references.

Parameters:
  • observed_sequences (Iterable[ActiveSequence]) – Sequences extracted from the timeline under study.

  • nominal_references (Iterable[NominalSequenceReference]) – Reference words representing nominal behavior.

  • anomaly_threshold (float) – Score threshold used to flag anomalous sequences.

Returns:

A DataFrame where each row quantifies the difference between an observed sequence and its closest nominal counterpart.

Return type:

DataFrame
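Finding the closest nominal word can be illustrated with a standard Levenshtein edit distance over state tuples. This is a sketch under assumptions: the helper names edit_distance and closest_nominal are hypothetical, and the real method may select the closest reference using additional criteria such as duration.

```python
def edit_distance(a, b):
    """Levenshtein distance between two state tuples."""
    previous = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        current = [i]
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def closest_nominal(observed, nominal_words):
    """Return the nominal word with the smallest edit distance to `observed`."""
    return min(nominal_words, key=lambda word: edit_distance(observed, word))


nominal = [(1, 2, 3), (4, 5)]
observed = (1, 3)
best = closest_nominal(observed, nominal)
print(best, edit_distance(observed, best))
```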

class iabm_behavior.NominalSequenceReference(states, count, avg_duration_seconds)[source]

Bases: object

Represent a nominal behavioral word learned from historical sequences.

Parameters:
  • states (tuple[int, ...])

  • count (int)

  • avg_duration_seconds (float)

states

Ordered state tuple that defines the nominal word.

Type:

tuple[int, ...]

count

Number of times the word appears in the reference dataset.

Type:

int

avg_duration_seconds

Mean duration of the word across occurrences.

Type:

float

states: tuple[int, ...]
count: int
avg_duration_seconds: float

class iabm_behavior.SequenceComparison(observed_states, nominal_states, exact_match, state_distance, dtw_distance, duration_ratio_delta, anomaly_score, is_anomalous)[source]

Bases: object

Store the comparison of an observed sequence against a nominal reference.

Parameters:
  • observed_states (tuple[int, ...])

  • nominal_states (tuple[int, ...])

  • exact_match (bool)

  • state_distance (int)

  • dtw_distance (float)

  • duration_ratio_delta (float)

  • anomaly_score (float)

  • is_anomalous (bool)

observed_states

Observed sequence word.

Type:

tuple[int, ...]

nominal_states

Closest nominal word found in the reference set.

Type:

tuple[int, ...]

exact_match

Whether both words are identical.

Type:

bool

state_distance

Edit distance between the observed and nominal words.

Type:

int

dtw_distance

Dynamic time warping (DTW) alignment distance between the observed and nominal words.

Type:

float

duration_ratio_delta

Relative duration deviation against the nominal word.

Type:

float

anomaly_score

Aggregate score combining state mismatch and duration drift.

Type:

float

is_anomalous

Whether the aggregate score exceeds the configured threshold.

Type:

bool
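To make the numeric fields more concrete, the sketch below computes a dynamic time warping distance, a relative duration deviation, and an aggregate score. None of these formulas is confirmed by this reference: the 0/1 mismatch cost, the absolute relative deviation, and the unweighted sum used for the score are all assumptions, and the function names are hypothetical. Only the threshold value 1.0 comes from the documented default of anomaly_threshold.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance with a 0/1 mismatch cost per aligned pair."""
    inf = float("inf")
    # table[i][j] holds the cheapest alignment cost of a[:i] with b[:j].
    table = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    table[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0.0 if a[i - 1] == b[j - 1] else 1.0
            table[i][j] = cost + min(table[i - 1][j],      # a advances alone
                                     table[i][j - 1],      # b advances alone
                                     table[i - 1][j - 1])  # both advance
    return table[len(a)][len(b)]


def duration_ratio_delta(observed_seconds, nominal_seconds):
    """Assumed form: absolute deviation relative to the nominal duration."""
    return abs(observed_seconds - nominal_seconds) / nominal_seconds


def anomaly_score(state_distance, ratio_delta):
    """Illustrative aggregate only: an unweighted sum of the two deviations."""
    return state_distance + ratio_delta


score = anomaly_score(state_distance=1, ratio_delta=duration_ratio_delta(6.0, 4.0))
is_anomalous = score > 1.0  # 1.0 is the documented default anomaly_threshold
print(dtw_distance((1, 1, 2), (1, 2)), score, is_anomalous)
```

Unlike edit distance, DTW lets one element align with several consecutive elements, so in this sketch a word that merely repeats a state, such as (1, 1, 2) against (1, 2), has a distance of zero.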

observed_states: tuple[int, ...]
nominal_states: tuple[int, ...]
exact_match: bool
state_distance: int
dtw_distance: float
duration_ratio_delta: float
anomaly_score: float
is_anomalous: bool

class iabm_behavior.StateRun(state, start_time, end_time, sample_count, duration_seconds)[source]

Bases: object

Represent a contiguous run of the same predicted or measured state.

Parameters:
  • state (int)

  • start_time (Timestamp)

  • end_time (Timestamp)

  • sample_count (int)

  • duration_seconds (float)

state

State identifier associated with the run.

Type:

int

start_time

Timestamp of the first sample in the run.

Type:

pandas.Timestamp

end_time

Timestamp of the last sample in the run.

Type:

pandas.Timestamp

sample_count

Number of rows belonging to the run.

Type:

int

duration_seconds

Elapsed time covered by the run.

Type:

float

state: int
start_time: Timestamp
end_time: Timestamp
sample_count: int
duration_seconds: float

Sequence Analysis

Sequence-analysis utilities for Model_B behavioral modeling.

class iabm_behavior.sequences.StateRun(state, start_time, end_time, sample_count, duration_seconds)[source]

Bases: object

Represent a contiguous run of the same predicted or measured state.

Parameters:
  • state (int)

  • start_time (Timestamp)

  • end_time (Timestamp)

  • sample_count (int)

  • duration_seconds (float)

state

State identifier associated with the run.

Type:

int

start_time

Timestamp of the first sample in the run.

Type:

pandas.Timestamp

end_time

Timestamp of the last sample in the run.

Type:

pandas.Timestamp

sample_count

Number of rows belonging to the run.

Type:

int

duration_seconds

Elapsed time covered by the run.

Type:

float

state: int
start_time: Timestamp
end_time: Timestamp
sample_count: int
duration_seconds: float

class iabm_behavior.sequences.ActiveSequence(start_time, end_time, states, total_duration_seconds, run_count)[source]

Bases: object

Represent an active behavioral sequence: a group of consecutive non-zero state runs.

Parameters:
  • start_time (Timestamp)

  • end_time (Timestamp)

  • states (tuple[int, ...])

  • total_duration_seconds (float)

  • run_count (int)

start_time

Timestamp of the first active run.

Type:

pandas.Timestamp

end_time

Timestamp of the last active run.

Type:

pandas.Timestamp

states

Ordered tuple of state identifiers composing the sequence.

Type:

tuple[int, ...]

total_duration_seconds

Total duration of the sequence in seconds.

Type:

float

run_count

Number of state runs contained in the sequence.

Type:

int

start_time: Timestamp
end_time: Timestamp
states: tuple[int, ...]
total_duration_seconds: float
run_count: int

class iabm_behavior.sequences.NominalSequenceReference(states, count, avg_duration_seconds)[source]

Bases: object

Represent a nominal behavioral word learned from historical sequences.

Parameters:
  • states (tuple[int, ...])

  • count (int)

  • avg_duration_seconds (float)

states

Ordered state tuple that defines the nominal word.

Type:

tuple[int, ...]

count

Number of times the word appears in the reference dataset.

Type:

int

avg_duration_seconds

Mean duration of the word across occurrences.

Type:

float

states: tuple[int, ...]
count: int
avg_duration_seconds: float

class iabm_behavior.sequences.SequenceComparison(observed_states, nominal_states, exact_match, state_distance, dtw_distance, duration_ratio_delta, anomaly_score, is_anomalous)[source]

Bases: object

Store the comparison of an observed sequence against a nominal reference.

Parameters:
  • observed_states (tuple[int, ...])

  • nominal_states (tuple[int, ...])

  • exact_match (bool)

  • state_distance (int)

  • dtw_distance (float)

  • duration_ratio_delta (float)

  • anomaly_score (float)

  • is_anomalous (bool)

observed_states

Observed sequence word.

Type:

tuple[int, ...]

nominal_states

Closest nominal word found in the reference set.

Type:

tuple[int, ...]

exact_match

Whether both words are identical.

Type:

bool

state_distance

Edit distance between the observed and nominal words.

Type:

int

dtw_distance

Dynamic time warping (DTW) alignment distance between the observed and nominal words.

Type:

float

duration_ratio_delta

Relative duration deviation against the nominal word.

Type:

float

anomaly_score

Aggregate score combining state mismatch and duration drift.

Type:

float

is_anomalous

Whether the aggregate score exceeds the configured threshold.

Type:

bool

observed_states: tuple[int, ...]
nominal_states: tuple[int, ...]
exact_match: bool
state_distance: int
dtw_distance: float
duration_ratio_delta: float
anomaly_score: float
is_anomalous: bool

class iabm_behavior.sequences.BehavioralSequenceAnalyzer(state_column='Predicted_State')[source]

Bases: object

Load state timelines and derive run- and sequence-level behavior features.

The analyzer transforms Model_A predictions or digital state traces into contiguous runs and active state sequences. It also includes a lightweight smoothing step, inspired by the legacy scripts, that merges very short transient runs into the following state to reduce noise.

Parameters:

state_column (str)

load_state_timeline(file_path)[source]

Load a time-indexed state timeline from CSV, Excel, or Parquet.

Parameters:

file_path (str | Path) – Path to the state timeline file.

Returns:

A DataFrame indexed by timestamps and containing the configured state column.

Return type:

DataFrame

smooth_short_runs(timeline, *, min_duration_seconds=1.0, min_samples=1)[source]

Merge short transient runs into the following run when possible.

Parameters:
  • timeline (DataFrame) – Input state timeline.

  • min_duration_seconds (float) – Duration threshold in seconds; runs no longer than this are treated as transient noise.

  • min_samples (int) – Sample-count threshold; runs with no more samples than this are treated as transient noise.

Returns:

A copy of the timeline with eligible short runs reassigned.

Return type:

DataFrame

extract_runs(timeline)[source]

Convert a state timeline into contiguous state runs.

Parameters:

timeline (DataFrame) – Time-indexed DataFrame containing the configured state column.

Returns:

A list of StateRun objects ordered by time.

Return type:

list[StateRun]

extract_active_sequences(timeline)[source]

Group consecutive non-zero runs into active behavioral sequences.

Parameters:

timeline (DataFrame) – Time-indexed DataFrame containing state values.

Returns:

A list of active sequences. Each sequence stores the ordered state pattern and its total duration.

Return type:

list[ActiveSequence]

summarize_sequence_words(sequences)[source]

Count repeated behavioral words extracted from active sequences.

Parameters:

sequences (Iterable[ActiveSequence]) – Iterable of active sequences.

Returns:

A DataFrame with the sequence word, occurrence count, and average duration.

Return type:

DataFrame

build_nominal_reference(sequences)[source]

Build nominal references from repeated behavioral words.

Parameters:

sequences (Iterable[ActiveSequence]) – Historical active sequences representing nominal behavior.

Returns:

A list of nominal references sorted by occurrence count and duration.

Return type:

list[NominalSequenceReference]

compare_to_nominal(observed_sequences, nominal_references, *, anomaly_threshold=1.0)[source]

Compare observed sequences against the closest nominal references.

Parameters:
  • observed_sequences (Iterable[ActiveSequence]) – Sequences extracted from the timeline under study.

  • nominal_references (Iterable[NominalSequenceReference]) – Reference words representing nominal behavior.

  • anomaly_threshold (float) – Score threshold used to flag anomalous sequences.

Returns:

A DataFrame where each row quantifies the difference between an observed sequence and its closest nominal counterpart.

Return type:

DataFrame

Command-Line Interface

Command-line entry point for Model_B behavioral sequence analysis.

iabm_behavior.main.parse_arguments(translator)[source]

Build the CLI parser with translated help messages.

Parameters:

translator (Callable[[str], str]) – Translation function returned by setup_i18n().

Returns:

Parsed command-line arguments driving the sequence-analysis workflow.

Return type:

Namespace

iabm_behavior.main.main()[source]

Run the Model_B sequence-analysis workflow from the command line.

The workflow operates at two levels. At minimum, it extracts runs, active sequences, and repeated sequence words from a state timeline. When a nominal timeline is also provided, it derives a nominal reference set and produces an anomaly-oriented comparison report.

Return type:

None

Utilities

Internationalization helpers for the Model_B command-line interface.

iabm_behavior.utils.setup_i18n(lang='en')[source]

Return a translation function for the requested interface language.

Parameters:

lang (str) – ISO language code requested by the caller.

Returns:

A callable that translates user-facing CLI strings into the requested language, falling back to the original message when no translation exists.

Return type:

Callable[[str], str]
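The fallback behavior can be pictured with a dictionary-backed translator. This is a sketch only: the _CATALOGS mapping, its sample entry, and the make_translator name are hypothetical, and the real setup_i18n may load its translations from a different source.

```python
from typing import Callable

# Hypothetical in-memory catalog; the package's real translation storage is not shown here.
_CATALOGS = {
    "fr": {"Output directory": "Dossier de sortie"},
}


def make_translator(lang: str = "en") -> Callable[[str], str]:
    """Return a function that translates messages, falling back to the input."""
    catalog = _CATALOGS.get(lang, {})

    def translate(message: str) -> str:
        # Unknown messages, or unknown languages, return the original text.
        return catalog.get(message, message)

    return translate


translate = make_translator("fr")
print(translate("Output directory"))
print(translate("Input file"))
```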