
data_config_schema

Module for defining the data config schema.

Classes:

Columns

Bases: BaseModel

Model for column configuration.

ColumnsEncoder

Bases: BaseModel

Model for column encoder configuration.

ConfigDict

Bases: BaseModel

Model for main YAML configuration.

GlobalParams

Bases: BaseModel

Model for global parameters in YAML configuration.

Schema

Bases: BaseModel

Model for validating YAML schema.

Split

Bases: BaseModel

Model for split configuration.

SplitConfigDict

Bases: BaseModel

Model for sub-configuration generated from main config.

SplitSchema

Bases: BaseModel

Model for validating a Split YAML schema.

SplitTransformDict

Bases: BaseModel

Model for sub-configuration generated from main config.

Transform

Bases: BaseModel

Model for transform configuration.

Methods:

validate_param_lists_across_columns classmethod

validate_param_lists_across_columns(
    columns: list[TransformColumns],
) -> list[TransformColumns]

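Validate that parameter lists across columns have consistent lengths. Every non-empty list-valued parameter must either contain a single element or share one common length with all other such lists. Scalar parameters and empty lists are ignored.

Parameters:

columns (list[TransformColumns]): List of transform columns to validate.

Returns:

list[TransformColumns]: The validated columns list.

Raises:

ValueError: If the parameter lists longer than one element do not all have the same length.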
Source code in src/stimulus/data/interface/data_config_schema.py
@field_validator("columns")
@classmethod
def validate_param_lists_across_columns(
    cls,
    columns: list[TransformColumns],
) -> list[TransformColumns]:
    """Validate that parameter lists across columns have consistent lengths.

    Args:
        columns: List of transform columns to validate

    Returns:
        The validated columns list
    """
    # Get all parameter list lengths across all columns and transformations
    all_list_lengths: set[int] = set()

    for column in columns:
        for transformation in column.transformations:
            if transformation.params and any(
                isinstance(param_value, list) and len(param_value) > 0
                for param_value in transformation.params.values()
            ):
                all_list_lengths.update(
                    len(param_value)
                    for param_value in transformation.params.values()
                    if isinstance(param_value, list) and len(param_value) > 0
                )

    # Skip validation if no lists found
    if not all_list_lengths:
        return columns

    # Check if all lists either have length 1, or all have the same length
    all_list_lengths.discard(1)  # Remove length 1 as it's always valid
    if len(all_list_lengths) > 1:  # Multiple different lengths found
        raise ValueError(
            "All parameter lists across columns must either contain one element or have the same length",
        )

    return columns
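
The length rule enforced by the validator can be tried out in isolation. The following is a minimal sketch, not part of the library: `check_param_list_lengths` is a hypothetical standalone helper that takes plain dictionaries in place of the `params` attribute of each transformation, and the parameter names (`std`, `seed`, `p`) are made up for illustration.

```python
from typing import Any


def check_param_list_lengths(param_dicts: list[dict[str, Any]]) -> set[int]:
    """Hypothetical helper mirroring the validator's length check.

    Each dict stands in for the ``params`` of one transformation.
    Returns the distinct list lengths greater than 1 (at most one value).
    """
    # Collect the length of every non-empty list-valued parameter.
    lengths = {
        len(value)
        for params in param_dicts
        for value in params.values()
        if isinstance(value, list) and len(value) > 0
    }

    # Single-element lists are always accepted, so drop length 1.
    lengths.discard(1)

    # More than one remaining length means the lists disagree.
    if len(lengths) > 1:
        raise ValueError(
            "All parameter lists across columns must either contain one element or have the same length",
        )
    return lengths


# Accepted: two lists of length 3, plus a single-element list.
ok = check_param_list_lengths(
    [{"std": [0.1, 0.2, 0.3]}, {"seed": [42]}, {"p": [0.5, 0.6, 0.7]}],
)

# Rejected: one list of length 2 and one of length 3.
try:
    check_param_list_lengths([{"std": [0.1, 0.2]}, {"p": [0.5, 0.6, 0.7]}])
    mismatch_message = ""
except ValueError as err:
    mismatch_message = str(err)
```

In the actual model the same check runs as a Pydantic field validator on `columns`, so a mismatch surfaces when a `Transform` is constructed rather than through a separate call.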

TransformColumns

Bases: BaseModel

Model for transform columns configuration.

TransformColumnsTransformation

Bases: BaseModel

Model for column transformation configuration.