Model Schema

Schema definitions for validating supported model architectures and layer blocks.

UnitFloat module-attribute

UnitFloat: TypeAlias = Annotated[
    float, Field(ge=0.0, le=1.0)
]

A float that must lie in the closed range [0, 1].
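
As a rough illustration of how this alias behaves, the sketch below re-declares `UnitFloat` locally and validates a few values with pydantic's `TypeAdapter`. It assumes pydantic v2; the `check` helper is hypothetical and exists only for this example.

```python
from typing import Annotated

from pydantic import Field, TypeAdapter, ValidationError

# Local re-declaration of the alias shown above (pydantic v2 assumed).
UnitFloat = Annotated[float, Field(ge=0.0, le=1.0)]

adapter = TypeAdapter(UnitFloat)


def check(value: float) -> bool:
    """Hypothetical helper: True if `value` passes UnitFloat validation."""
    try:
        adapter.validate_python(value)
        return True
    except ValidationError:
        return False


# Both bounds are inclusive because the constraints are `ge` and `le`.
in_range = [check(0.0), check(0.25), check(1.0)]
out_of_range = [check(-0.1), check(1.5)]
```

Fields such as `dropout_rate` and `stochastic_depth` use this alias, so out-of-range values should be rejected when the config is parsed rather than at training time.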

PaddingTypeStr module-attribute

PaddingTypeStr: TypeAlias = Literal['valid', 'same']

Padding modes supported by Keras convolution and pooling layers.

PositiveIntTuple module-attribute

PositiveIntTuple: TypeAlias = (
    PositiveInt | tuple[PositiveInt, PositiveInt]
)

A single positive integer or a tuple of two positive integers, typically used for kernel sizes, pool sizes, and strides.

NormalizationStr module-attribute

NormalizationStr: TypeAlias = Literal[
    "layer_norm", "rms_norm", "dyt"
]

Available normalization layers.

ActivationStr module-attribute

ActivationStr: TypeAlias = Literal[
    "celu",
    "elu",
    "exponential",
    "gelu",
    "glu",
    "hard_shrink",
    "hard_sigmoid",
    "hard_silu",
    "hard_tanh",
    "leaky_relu",
    "linear",
    "log_sigmoid",
    "log_softmax",
    "mish",
    "relu",
    "relu6",
    "selu",
    "sigmoid",
    "silu",
    "soft_shrink",
    "softmax",
    "softplus",
    "softsign",
    "sparse_plus",
    "sparsemax",
    "squareplus",
    "tanh",
    "tanh_shrink",
    "threshold",
]

Supported Keras activation functions.

WeightInitializationStr module-attribute

WeightInitializationStr: TypeAlias = Literal[
    "glorot_normal",
    "glorot_uniform",
    "he_normal",
    "he_uniform",
    "lecun_normal",
    "lecun_uniform",
    "ones",
    "random_normal",
    "random_uniform",
    "truncated_normal",
    "variance_scaling",
    "zeros",
]

Keras weight initialization strategies.

AnyModelConfig module-attribute

AnyModelConfig = Annotated[
    CCTModelConfig, Field(discriminator="model")
]

Supported model architectures. New model configs should be added here.
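
One way a second architecture might be registered is as a union discriminated on the `model` field. The sketch below is only an assumption about how that could look: `CCTLikeConfig`, `OtherModelConfig`, and their fields are hypothetical stand-ins, not classes from this module, and pydantic v2 is assumed.

```python
from typing import Annotated, Literal, Union

from pydantic import BaseModel, Field, TypeAdapter


class CCTLikeConfig(BaseModel):
    """Hypothetical stand-in for an existing config tagged 'cct'."""

    model: Literal["cct"] = "cct"
    projection_dim: int = 64


class OtherModelConfig(BaseModel):
    """Hypothetical second architecture tagged 'other'."""

    model: Literal["other"] = "other"
    depth: int = 4


# The `model` value selects which class validates the remaining keys.
AnyConfigSketch = Annotated[
    Union[CCTLikeConfig, OtherModelConfig], Field(discriminator="model")
]

adapter = TypeAdapter(AnyConfigSketch)
cfg = adapter.validate_python({"model": "other", "depth": 8})
```

With a real union in place, validation would likely go through a `TypeAdapter` (or a wrapping model) as above, since a `Union` cannot be instantiated directly.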

_Rescaling

Bases: BaseModel

scale class-attribute instance-attribute

scale: float = 1.0 / 255

offset class-attribute instance-attribute

offset: float = 0.0

to_keras_layer

to_keras_layer()
Source code in fast_plate_ocr/train/model/model_schema.py
def to_keras_layer(self):
    return keras.layers.Rescaling(self.scale, self.offset)
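
To make the defaults concrete: Keras's `Rescaling` layer computes `inputs * scale + offset`, so `scale = 1.0 / 255` with `offset = 0.0` maps 8-bit pixel values into roughly [0, 1]. The plain-Python sketch below mirrors that arithmetic without Keras; `rescale` is a hypothetical helper.

```python
# Defaults taken from the `_Rescaling` fields above.
scale = 1.0 / 255
offset = 0.0


def rescale(pixel: float) -> float:
    """Hypothetical helper applying the same formula as keras.layers.Rescaling."""
    return pixel * scale + offset


# Black, mid-gray, and white for an 8-bit image.
rescaled = [rescale(p) for p in (0, 128, 255)]
```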

_Activation

Bases: BaseModel

layer instance-attribute

layer: Literal['Activation']

activation instance-attribute

activation: ActivationStr

to_keras_layer

to_keras_layer() -> Layer
Source code in fast_plate_ocr/train/model/model_schema.py
def to_keras_layer(self) -> keras.layers.Layer:
    return keras.layers.Activation(self.activation)

_Conv2DBase

Bases: BaseModel

filters instance-attribute

filters: PositiveInt

kernel_size instance-attribute

kernel_size: PositiveIntTuple

strides class-attribute instance-attribute

strides: PositiveIntTuple = 1

padding class-attribute instance-attribute

padding: PaddingTypeStr = 'same'

activation class-attribute instance-attribute

activation: ActivationStr = 'relu'

use_bias class-attribute instance-attribute

use_bias: bool = True

kernel_initializer class-attribute instance-attribute

kernel_initializer: WeightInitializationStr = 'he_normal'

bias_initializer class-attribute instance-attribute

bias_initializer: WeightInitializationStr = 'zeros'

_Conv2D

Bases: _Conv2DBase

layer instance-attribute

layer: Literal['Conv2D']

to_keras_layer

to_keras_layer() -> Layer
Source code in fast_plate_ocr/train/model/model_schema.py
def to_keras_layer(self) -> keras.layers.Layer:
    params = self.model_dump(exclude={"layer"})
    return keras.layers.Conv2D(**params)

_CoordConv2D

Bases: _Conv2DBase

layer instance-attribute

layer: Literal['CoordConv2D']

with_r class-attribute instance-attribute

with_r: bool = False

to_keras_layer

to_keras_layer() -> Layer
Source code in fast_plate_ocr/train/model/model_schema.py
def to_keras_layer(self) -> keras.layers.Layer:
    params = self.model_dump(exclude={"layer", "with_r"})
    return CoordConv2D(with_r=self.with_r, **params)

_DepthwiseConv2D

Bases: BaseModel

layer instance-attribute

layer: Literal['DepthwiseConv2D']

kernel_size instance-attribute

kernel_size: PositiveIntTuple

strides class-attribute instance-attribute

strides: PositiveIntTuple = 1

padding class-attribute instance-attribute

padding: PaddingTypeStr = 'same'

depth_multiplier class-attribute instance-attribute

depth_multiplier: PositiveInt = 1

activation class-attribute instance-attribute

activation: ActivationStr = 'relu'

use_bias class-attribute instance-attribute

use_bias: bool = True

depthwise_initializer class-attribute instance-attribute

depthwise_initializer: WeightInitializationStr = "he_normal"

bias_initializer class-attribute instance-attribute

bias_initializer: WeightInitializationStr = 'zeros'

to_keras_layer

to_keras_layer() -> Layer
Source code in fast_plate_ocr/train/model/model_schema.py
def to_keras_layer(self) -> keras.layers.Layer:
    return keras.layers.DepthwiseConv2D(
        kernel_size=self.kernel_size,
        strides=self.strides,
        padding=self.padding,
        depth_multiplier=self.depth_multiplier,
        activation=self.activation,
        use_bias=self.use_bias,
        depthwise_initializer=self.depthwise_initializer,
        bias_initializer=self.bias_initializer,
    )

_SeparableConv2D

Bases: BaseModel

layer instance-attribute

layer: Literal['SeparableConv2D']

filters instance-attribute

filters: PositiveInt

kernel_size instance-attribute

kernel_size: PositiveIntTuple

strides class-attribute instance-attribute

strides: PositiveIntTuple = 1

padding class-attribute instance-attribute

padding: PaddingTypeStr = 'same'

depth_multiplier class-attribute instance-attribute

depth_multiplier: PositiveInt = 1

activation class-attribute instance-attribute

activation: ActivationStr = 'relu'

use_bias class-attribute instance-attribute

use_bias: bool = True

depthwise_initializer class-attribute instance-attribute

depthwise_initializer: WeightInitializationStr = "he_normal"

pointwise_initializer class-attribute instance-attribute

pointwise_initializer: WeightInitializationStr = (
    "glorot_uniform"
)

bias_initializer class-attribute instance-attribute

bias_initializer: WeightInitializationStr = 'zeros'

to_keras_layer

to_keras_layer() -> Layer
Source code in fast_plate_ocr/train/model/model_schema.py
def to_keras_layer(self) -> keras.layers.Layer:
    return keras.layers.SeparableConv2D(
        filters=self.filters,
        kernel_size=self.kernel_size,
        strides=self.strides,
        padding=self.padding,
        depth_multiplier=self.depth_multiplier,
        activation=self.activation,
        use_bias=self.use_bias,
        depthwise_initializer=self.depthwise_initializer,
        pointwise_initializer=self.pointwise_initializer,
        bias_initializer=self.bias_initializer,
    )

_MLP

Bases: BaseModel

layer instance-attribute

layer: Literal['MLP']

hidden_units instance-attribute

hidden_units: list[PositiveInt]

dropout_rate class-attribute instance-attribute

dropout_rate: UnitFloat = 0.1

activation class-attribute instance-attribute

activation: ActivationStr = 'gelu'

use_bias class-attribute instance-attribute

use_bias: bool = True

to_keras_layer

to_keras_layer() -> Layer
Source code in fast_plate_ocr/train/model/model_schema.py
def to_keras_layer(self) -> keras.layers.Layer:
    return MLP(
        hidden_units=self.hidden_units,
        dropout_rate=self.dropout_rate,
        activation=self.activation,
        use_bias=self.use_bias,
    )

_MaxBlurPooling2D

Bases: BaseModel

layer instance-attribute

layer: Literal['MaxBlurPooling2D']

pool_size class-attribute instance-attribute

pool_size: PositiveInt = 2

filter_size class-attribute instance-attribute

filter_size: PositiveInt = 3

padding class-attribute instance-attribute

padding: PaddingTypeStr = 'same'

to_keras_layer

to_keras_layer() -> Layer
Source code in fast_plate_ocr/train/model/model_schema.py
def to_keras_layer(self) -> keras.layers.Layer:
    return MaxBlurPooling2D(
        pool_size=self.pool_size, filter_size=self.filter_size, padding=self.padding
    )

_MaxPooling2D

Bases: BaseModel

layer instance-attribute

layer: Literal['MaxPooling2D']

pool_size class-attribute instance-attribute

pool_size: PositiveIntTuple = 2

strides class-attribute instance-attribute

strides: PositiveInt | None = None

padding class-attribute instance-attribute

padding: PaddingTypeStr = 'valid'

to_keras_layer

to_keras_layer() -> Layer
Source code in fast_plate_ocr/train/model/model_schema.py
def to_keras_layer(self) -> keras.layers.Layer:
    return keras.layers.MaxPooling2D(
        pool_size=self.pool_size,
        strides=self.strides,
        padding=self.padding,
    )

_AveragePooling2D

Bases: BaseModel

layer instance-attribute

layer: Literal['AveragePooling2D']

pool_size class-attribute instance-attribute

pool_size: PositiveIntTuple = 2

strides class-attribute instance-attribute

strides: PositiveInt | None = None

padding class-attribute instance-attribute

padding: PaddingTypeStr = 'valid'

to_keras_layer

to_keras_layer() -> Layer
Source code in fast_plate_ocr/train/model/model_schema.py
def to_keras_layer(self) -> keras.layers.Layer:
    return keras.layers.AveragePooling2D(
        pool_size=self.pool_size,
        strides=self.strides,
        padding=self.padding,
    )

_ZeroPadding2D

Bases: BaseModel

layer instance-attribute

layer: Literal['ZeroPadding2D']

padding class-attribute instance-attribute

padding: PositiveIntTuple = 1

to_keras_layer

to_keras_layer() -> Layer
Source code in fast_plate_ocr/train/model/model_schema.py
def to_keras_layer(self) -> keras.layers.Layer:
    return keras.layers.ZeroPadding2D(padding=self.padding)

_SqueezeExcite

Bases: BaseModel

layer instance-attribute

layer: Literal['SqueezeExcite']

ratio class-attribute instance-attribute

ratio: PositiveFloat = 1.0

to_keras_layer

to_keras_layer() -> Layer
Source code in fast_plate_ocr/train/model/model_schema.py
def to_keras_layer(self) -> keras.layers.Layer:
    return SqueezeExcite(ratio=self.ratio)

_BatchNormalization

Bases: BaseModel

layer instance-attribute

layer: Literal['BatchNormalization']

momentum class-attribute instance-attribute

momentum: PositiveFloat = 0.99

epsilon class-attribute instance-attribute

epsilon: PositiveFloat = 0.001

center class-attribute instance-attribute

center: bool = True

scale class-attribute instance-attribute

scale: bool = True

to_keras_layer

to_keras_layer() -> Layer
Source code in fast_plate_ocr/train/model/model_schema.py
def to_keras_layer(self) -> keras.layers.Layer:
    return keras.layers.BatchNormalization(
        momentum=self.momentum,
        epsilon=self.epsilon,
        center=self.center,
        scale=self.scale,
    )

_Dropout

Bases: BaseModel

layer instance-attribute

layer: Literal['Dropout']

rate instance-attribute

rate: PositiveFloat

to_keras_layer

to_keras_layer()
Source code in fast_plate_ocr/train/model/model_schema.py
def to_keras_layer(self):
    return keras.layers.Dropout(rate=self.rate)

_SpatialDropout2D

Bases: BaseModel

layer instance-attribute

layer: Literal['SpatialDropout2D']

rate instance-attribute

rate: PositiveFloat

to_keras_layer

to_keras_layer() -> Layer
Source code in fast_plate_ocr/train/model/model_schema.py
def to_keras_layer(self) -> keras.layers.Layer:
    return keras.layers.SpatialDropout2D(rate=self.rate)

_GaussianNoise

Bases: BaseModel

layer instance-attribute

layer: Literal['GaussianNoise']

stddev instance-attribute

stddev: PositiveFloat

seed class-attribute instance-attribute

seed: int | None = None

to_keras_layer

to_keras_layer() -> Layer
Source code in fast_plate_ocr/train/model/model_schema.py
def to_keras_layer(self) -> keras.layers.Layer:
    return keras.layers.GaussianNoise(stddev=self.stddev, seed=self.seed)

_LayerNorm

Bases: BaseModel

layer instance-attribute

layer: Literal['LayerNorm']

epsilon class-attribute instance-attribute

epsilon: PositiveFloat = 0.001

to_keras_layer

to_keras_layer() -> Layer
Source code in fast_plate_ocr/train/model/model_schema.py
def to_keras_layer(self) -> keras.layers.Layer:
    return keras.layers.LayerNormalization(epsilon=self.epsilon)

_RMSNorm

Bases: BaseModel

layer instance-attribute

layer: Literal['RMSNorm']

epsilon class-attribute instance-attribute

epsilon: PositiveFloat = 1e-06

to_keras_layer

to_keras_layer() -> Layer
Source code in fast_plate_ocr/train/model/model_schema.py
def to_keras_layer(self) -> keras.layers.Layer:
    return RMSNormalization(epsilon=self.epsilon)

_DyT

Bases: BaseModel

layer instance-attribute

layer: Literal['DyT']

alpha_init_value class-attribute instance-attribute

alpha_init_value: PositiveFloat = 0.5

to_keras_layer

to_keras_layer() -> Layer
Source code in fast_plate_ocr/train/model/model_schema.py
def to_keras_layer(self) -> keras.layers.Layer:
    return DyT(alpha_init_value=self.alpha_init_value)

_CCTTokenizerConfig

Bases: BaseModel

blocks instance-attribute

blocks: list[LayerConfig]

patch_size class-attribute instance-attribute

patch_size: PositiveIntTuple = 1

patch_mlp class-attribute instance-attribute

patch_mlp: _MLP | None = None

positional_emb class-attribute instance-attribute

positional_emb: bool = True

_CCTTransformerEncoderConfig

Bases: BaseModel

layers instance-attribute

layers: PositiveInt

heads instance-attribute

heads: PositiveInt

projection_dim instance-attribute

projection_dim: PositiveInt

units instance-attribute

units: list[PositiveInt]

activation class-attribute instance-attribute

activation: ActivationStr = 'gelu'

stochastic_depth class-attribute instance-attribute

stochastic_depth: UnitFloat = 0.1

attention_dropout class-attribute instance-attribute

attention_dropout: UnitFloat = 0.1

mlp_dropout class-attribute instance-attribute

mlp_dropout: UnitFloat = 0.1

head_mlp_dropout class-attribute instance-attribute

head_mlp_dropout: UnitFloat = 0.2

token_reducer_heads class-attribute instance-attribute

token_reducer_heads: PositiveInt = 2

normalization class-attribute instance-attribute

normalization: NormalizationStr = 'layer_norm'

_consistency_checks

_consistency_checks()
Source code in fast_plate_ocr/train/model/model_schema.py
@model_validator(mode="after")
def _consistency_checks(self):
    if self.units[-1] != self.projection_dim:
        raise ValueError(
            "'units[-1]' must equal 'projection_dim' "
            f"(got {self.units[-1]} vs {self.projection_dim})."
        )
    return self
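
A minimal sketch of how this check would surface to a user, assuming pydantic v2. `EncoderSketch` is a hypothetical class that keeps only the two fields involved in the validator.

```python
from pydantic import BaseModel, PositiveInt, ValidationError, model_validator


class EncoderSketch(BaseModel):
    """Hypothetical reduction of `_CCTTransformerEncoderConfig`."""

    projection_dim: PositiveInt
    units: list[PositiveInt]

    @model_validator(mode="after")
    def _consistency_checks(self):
        # Same rule as above: the last MLP width must match the projection size.
        if self.units[-1] != self.projection_dim:
            raise ValueError(
                "'units[-1]' must equal 'projection_dim' "
                f"(got {self.units[-1]} vs {self.projection_dim})."
            )
        return self


ok = EncoderSketch(projection_dim=64, units=[128, 64])

try:
    EncoderSketch(projection_dim=64, units=[128, 32])
    error_message = None
except ValidationError as exc:
    # pydantic wraps the ValueError raised inside the validator.
    error_message = str(exc)
```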

CCTModelConfig

Bases: BaseModel

model class-attribute instance-attribute

model: Literal['cct'] = 'cct'

rescaling instance-attribute

rescaling: _Rescaling

tokenizer instance-attribute

tokenizer: _CCTTokenizerConfig

transformer_encoder instance-attribute

transformer_encoder: _CCTTransformerEncoderConfig

load_model_config_from_yaml

load_model_config_from_yaml(
    yaml_path: PathLike,
) -> AnyModelConfig

Loads, parses, and validates a YAML file defining a model architecture.

Parameters:

Name       Type      Description             Default
yaml_path  PathLike  Path to the YAML file.  required

Returns:

Type            Description
AnyModelConfig  Parsed and validated model configuration.

Raises:

Type               Description
FileNotFoundError  If the YAML file does not exist.

Source code in fast_plate_ocr/train/model/model_schema.py
def load_model_config_from_yaml(yaml_path: PathLike) -> AnyModelConfig:
    """
    Loads, parses, and validates a YAML file defining a model architecture.

    Args:
        yaml_path: Path to the YAML file.

    Returns:
        AnyModelConfig: Parsed and validated model configuration.

    Raises:
        FileNotFoundError: If the YAML file does not exist.
    """
    if not Path(yaml_path).is_file():
        raise FileNotFoundError(f"Model config '{yaml_path}' doesn't exist.")
    with open(yaml_path, encoding="utf-8") as f_in:
        data = yaml.safe_load(f_in)
    return AnyModelConfig(**data)
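
For orientation, the dictionary below sketches the shape that `yaml.safe_load` might return for a small config before it is handed to `AnyModelConfig`. All values are made up for illustration. It also assumes that `tokenizer` follows `_CCTTokenizerConfig` and that each entry in `blocks` is selected by its `layer` tag; whether this exact config would pass full validation depends on definitions not shown on this page.

```python
# Illustrative only: field names come from the schema above, values are invented.
parsed = {
    "model": "cct",
    "rescaling": {"scale": 1.0 / 255, "offset": 0.0},
    "tokenizer": {
        "blocks": [
            {"layer": "Conv2D", "filters": 32, "kernel_size": 3},
            {"layer": "MaxPooling2D", "pool_size": 2},
        ],
        "patch_size": 1,
        "positional_emb": True,
    },
    "transformer_encoder": {
        "layers": 2,
        "heads": 2,
        "projection_dim": 64,
        "units": [128, 64],  # last entry matches projection_dim
    },
}

# Quick sanity checks mirroring what the schema is documented to enforce.
encoder = parsed["transformer_encoder"]
units_match = encoder["units"][-1] == encoder["projection_dim"]
layer_tags = [block["layer"] for block in parsed["tokenizer"]["blocks"]]
```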