sign_language_translator.models.utils module

Utility functions and classes for PyTorch models.

This module contains various utility functions and classes to assist with PyTorch models and their training.

Functions:

top_p_top_k_indexes(probabilities, top_p=None, top_k=None): Perform top-p (nucleus) and top-k filtering based on the given probabilities.
plot_lr_scheduler(*args, lr_scheduler_class=None, lr_scheduler_object=None, initial_lr=1e-3, n_steps=20, parameter_group_number=0, **kwargs): Plot the learning rate of a specific parameter group across training steps.
downwards_wave(n_waves, n_steps_per_wave=9, start=1e-3, end=1e-7, amplitude=0.25): Generate a downwards wave pattern with a combination of sine wave and linear function.
set_layers_trainability_(model, layers_to_unfreeze=None, layers_to_freeze=None): Set the trainability of specified layers in the given PyTorch model.

Classes:

FullyLambdaLR(torch.optim.lr_scheduler.LRScheduler): Sets the learning rate of each parameter group to a given function that takes step_num, base_lr & last_lr as args.
VideoEmbeddingPipeline(slt.models.VideoEmbeddingModel): With optional multiprocessing, reads video files from paths, performs forward pass of a model on them and saves the output in specified format.

class sign_language_translator.models.utils.FullyLambdaLR(optimizer, lr_lambda: Callable[[int, float, float], float], last_epoch=-1, verbose=False)[source]

Bases: LRScheduler

Sets the learning rate of each parameter group to a given function that takes step_num, base_lr, and last_lr as parameters. When last_epoch=-1, sets initial lr as lr.

Parameters:

optimizer (Optimizer) – Wrapped optimizer.
lr_lambda (function) – A function which computes the learning rate given an integer parameter step_num, initial learning rate and previous learning rate for each group in optimizer.param_groups.
last_epoch (int) – The index of last epoch. Default: -1.
verbose (bool) – If True, prints a message to stdout for each update. Default: False.

Example:

scheduler = FullyLambdaLR(
    optimizer,
    lambda step_num, base_lr, last_lr: last_lr * (1.08 if step_num%2 == 0 else 0.8)
)
for epoch in range(100):
    train(...)
    validate(...)
    scheduler.step()
    print(scheduler.get_last_lr()[0])

get_lr()[source]
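
The schedule produced by an lr_lambda such as the one in the FullyLambdaLR example can be previewed without an optimizer. The following is a minimal pure-Python sketch, not the class's implementation: simulate_schedule is a hypothetical helper, and it assumes that steps are numbered from 1 and that the learning rate before the first step equals base_lr.

```python
from typing import Callable, List


def simulate_schedule(
    lr_lambda: Callable[[int, float, float], float],
    base_lr: float,
    n_steps: int,
) -> List[float]:
    """Hypothetical helper: apply lr_lambda(step_num, base_lr, last_lr) repeatedly."""
    lrs = [base_lr]  # assumption: the rate before any step is base_lr
    for step_num in range(1, n_steps + 1):  # assumption: steps start at 1
        lrs.append(lr_lambda(step_num, base_lr, lrs[-1]))
    return lrs


# Same rule as in the example: grow by 8% on even steps, shrink by 20% on odd steps.
rule = lambda step_num, base_lr, last_lr: last_lr * (1.08 if step_num % 2 == 0 else 0.8)
schedule = simulate_schedule(rule, base_lr=1e-3, n_steps=4)
```

Under these assumptions every pair of steps multiplies the rate by 1.08 × 0.8 = 0.864, so the rate oscillates while decaying overall.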

class sign_language_translator.models.utils.VideoEmbeddingPipeline(model: VideoEmbeddingModel)[source]

Bases: object

A class for processing and embedding video data using a slt.models.VideoEmbeddingModel.

Parameters: model (VideoEmbeddingModel) – An instance of the VideoEmbeddingModel class or its child class used for embedding.

process_video(path, save_format='csv', overwrite=False, output_dir='.', **kwargs): Load, embed, and save a video’s embedding. kwargs are passed to model.embed().

process_videos_parallel(path_patterns, n_processes=multiprocessing.cpu_count(), save_format='csv', overwrite=False, output_dir='.', **kwargs): Process multiple videos in parallel using multiprocessing. kwargs are passed to model.embed().

model

The VideoEmbeddingModel instance used for embedding.

Type: VideoEmbeddingModel

process_video(path, save_format='csv', overwrite=False, output_dir='.', **kwargs)[source]

Load a video, embed its frames, and save the embedding.

Parameters:

path (str) – The path to the video file.
save_format (str, optional) – Format for saving the embedding (“csv”, “torch”, “npy”, “npz”).
overwrite (bool, optional) – Whether to overwrite existing embedding files.
output_dir (str, optional) – Directory to save the embedding file.
**kwargs – Additional keyword arguments for embedding model.

Returns:

None

process_videos_parallel(path_patterns: List[str], n_processes=2, save_format='csv', overwrite=False, output_dir='.', **kwargs)[source]

Process multiple videos in parallel using multiprocessing, embedding their frames.

Parameters:

path_patterns (list) – List of file path patterns to match videos, e.g. [“dataset/*.mp4”, “dataset/*.avi”].
n_processes (int, optional) – Number of parallel processes. Defaults to multiprocessing.cpu_count().
save_format (str, optional) – Format for saving the embeddings (“csv”, “torch”, “npy”, “npz”).
overwrite (bool, optional) – Whether to overwrite existing embedding files.
output_dir (str, optional) – Directory to save the embedding files.
**kwargs – Additional keyword arguments for embedding model.

Returns:

None
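
How a list of path patterns could be resolved into concrete video files can be sketched with the standard library. This assumes the patterns follow ordinary glob syntax; expand_path_patterns is a hypothetical helper for illustration, and the pipeline's own matching logic may differ.

```python
import glob
from typing import List


def expand_path_patterns(path_patterns: List[str]) -> List[str]:
    """Hypothetical helper: resolve glob patterns into a sorted list of unique paths."""
    matched = set()
    for pattern in path_patterns:
        # A file matched by several patterns is only kept once.
        matched.update(glob.glob(pattern))
    return sorted(matched)
```

Each resolved path could then be handed to process_video, either one at a time or spread across worker processes.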

sign_language_translator.models.utils.downwards_wave(n_waves: int, n_steps_per_wave: int = 9, start: float = 0.001, end: float = 1e-07, amplitude: float = 0.25) → ndarray[source]

Generate a downwards wave pattern with a combination of sine wave and linear function.

The function generates a sequence of points forming a downward wave pattern by combining a sine wave with a linear function. The linear function modulates the sine wave so that the axis of the waves decreases gradually.

Parameters:

n_waves (int) – Number of peaks/cycles to generate.
n_steps_per_wave (int, optional) – Number of steps per wave. Default is 9.
start (float, optional) – Starting value for the linear function. Default is 1e-3.
end (float, optional) – Ending value for the linear function. Default is 1e-7.
amplitude (float, optional) – Amplitude of the sine wave. Default is 0.25.

Returns:

Array containing the y-axis values of downwards wave pattern.

Return type:

numpy.ndarray
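
One way to combine a sine wave with a decreasing linear function is sketched below in pure Python. This is a hypothetical re-creation under a stated assumption (the sine term multiplies the linear ramp), not the library's formula, and it returns a plain list rather than a numpy.ndarray.

```python
import math
from typing import List


def downwards_wave_sketch(
    n_waves: int,
    n_steps_per_wave: int = 9,
    start: float = 1e-3,
    end: float = 1e-7,
    amplitude: float = 0.25,
) -> List[float]:
    """Hypothetical re-creation: a linear ramp from start to end, scaled by a sine wave."""
    n_points = n_waves * n_steps_per_wave
    if n_points < 2:
        raise ValueError("need at least two points to draw a wave")
    values = []
    for i in range(n_points):
        fraction = i / (n_points - 1)  # runs from 0.0 to 1.0
        linear = start + (end - start) * fraction  # the decreasing axis
        angle = 2 * math.pi * n_waves * fraction  # n_waves full sine cycles
        # Assumption: the sine term multiplies the ramp, so oscillations shrink with it.
        values.append(linear * (1 + amplitude * math.sin(angle)))
    return values


wave = downwards_wave_sketch(n_waves=3)
```

With an amplitude below 1 and positive start and end values, every point stays positive, which is what a learning-rate schedule needs.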

sign_language_translator.models.utils.plot_lr_scheduler(*args, lr_scheduler_class: Type[LRScheduler] | None = None, lr_scheduler_object: LRScheduler | None = None, initial_lr: float = 0.001, n_steps: int = 20, parameter_group_number: int = 0, save_fig: bool = False, fig_name: str | None = None, **kwargs)[source]

Plot the learning rate of a specific parameter group across training steps.

This function generates a plot to visualize how the learning rate of a specified parameter group changes across training steps. It requires either an existing lr_scheduler_object or a combination of lr_scheduler_class, args, and kwargs to create a new learning rate scheduler.

Parameters:

lr_scheduler_class (Type[torch.optim.lr_scheduler._LRScheduler], optional) – The class of the learning rate scheduler. Defaults to None.
lr_scheduler_object (torch.optim.lr_scheduler._LRScheduler, optional) – An existing object of a learning rate scheduler class. Defaults to None.
initial_lr (float, optional) – The initial learning rate for the new optimizer object needed in case lr_scheduler_object is None. Defaults to 1e-3.
n_steps (int, optional) – The number of epochs/steps to visualize the learning rate changes. Defaults to 20.
parameter_group_number (int, optional) – The index for the optimizer’s parameter group to plot the learning rate for. Defaults to 0.
save_fig (bool, optional) – Whether to save the plot instead of showing. Defaults to False.
fig_name (str, optional) – The name of the file to save the plot to. Defaults to None (the class name of the lr_scheduler_class).
*args – Additional arguments to pass to the lr_scheduler_class when creating a new scheduler.
**kwargs – Additional keyword arguments to pass to the lr_scheduler_class when creating a new scheduler.

Example:

# Using an existing learning rate scheduler object
lr_scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1)
plot_lr_scheduler(lr_scheduler_object=lr_scheduler, n_steps=100)

# Creating a new learning rate scheduler object
plot_lr_scheduler(
    lr_scheduler_class=torch.optim.lr_scheduler.ExponentialLR,
    initial_lr=0.01,
    gamma=0.9,
    n_steps=50,
)

sign_language_translator.models.utils.set_layers_trainability_(model: Module, layers_to_unfreeze: List[str] | None = None, layers_to_freeze: List[str] | None = None)[source]

Set the trainability of specified layers in the given PyTorch model.

This function allows you to selectively freeze or unfreeze specific layers of a PyTorch model by setting their requires_grad attribute accordingly.

Parameters:

model (torch.nn.Module) – The PyTorch model whose layers’ requires_grad will be modified.
layers_to_unfreeze (List[str] | None, optional) – A list of layer names or prefixes for layers that you want to unfreeze. If None, no layers will be unfrozen. If [“”], all layers will be unfrozen. Default is None.
layers_to_freeze (List[str] | None, optional) – A list of layer names or prefixes for layers that you want to freeze. If None, no layers will be frozen. If [“”], all layers will be frozen. Default is None.

Returns:

This function modifies the model in-place. It does not return anything.

Return type:

None

Note

If both layers_to_unfreeze and layers_to_freeze are None or empty, no action will be taken, and the function will return immediately.
The layers’ names or prefixes specified in the lists should match the names as returned by model.named_parameters().

Examples:

# To freeze all layers in the model:
set_layers_trainability_(model, layers_to_freeze=[""])

# To unfreeze the layers with names starting with 'classifier' and 'fc':
set_layers_trainability_(model, layers_to_unfreeze=["classifier", "fc"])

# To unfreeze all layers:
set_layers_trainability_(model, layers_to_unfreeze=[""])
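
The prefix matching described in the note can be illustrated without PyTorch. In this sketch a plain dict of parameter name to requires_grad flag stands in for the model; set_trainability_sketch is a hypothetical helper, and the order in which unfreezing and freezing are applied is an assumption, not something the documentation states.

```python
from typing import Dict, List, Optional


def set_trainability_sketch(
    requires_grad: Dict[str, bool],
    layers_to_unfreeze: Optional[List[str]] = None,
    layers_to_freeze: Optional[List[str]] = None,
) -> None:
    """Hypothetical stand-in: flip flags in-place for names starting with a given prefix."""
    for name in requires_grad:
        # Assumption: unfreezing is applied first, so freezing wins if both match.
        if layers_to_unfreeze and any(name.startswith(p) for p in layers_to_unfreeze):
            requires_grad[name] = True
        if layers_to_freeze and any(name.startswith(p) for p in layers_to_freeze):
            requires_grad[name] = False


# Parameter names in the style returned by model.named_parameters()
flags = {"encoder.weight": True, "encoder.bias": True, "classifier.weight": True}
set_trainability_sketch(flags, layers_to_freeze=[""])  # "" is a prefix of every name
set_trainability_sketch(flags, layers_to_unfreeze=["classifier"])
```

After the two calls only the classifier parameter remains trainable, which mirrors the common freeze-everything-then-unfreeze-the-head workflow.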

sign_language_translator.models.utils.top_p_top_k_indexes(probabilities: Iterable[float], top_p: float | None = None, top_k: int | None = None) → List[int][source]

Perform top-p (nucleus) and top-k filtering based on the given probabilities. Top-k returns the indices of the top-k elements. Top-p returns the indices of the top elements whose sum does not exceed a certain value.

Parameters:

probabilities (Iterable[float]) – A 1-D iterable containing the probabilities of each element. Probabilities must sum to 1.
top_p (float or None) – The threshold probability for top-p sampling (0 to 1). If None, top-p sampling is not applied.
top_k (int or None) – The maximum number of elements to keep for top-k sampling. If None, top-k sampling is not applied.

Returns:

The indices of the selected elements.

Return type:

List[int]

Examples:

probs = [0.1, 0.2, 0.15, 0.05, 0.3, 0.2]
selected_indices = top_p_top_k_indexes(
    probabilities=probs,
    top_p=0.75,
    top_k=3,
)
# [4, 1, 5]

# You can then use the `selected_indices` to gather
# the actual elements from the original sequence.
sampled_elements = [probs[i] for i in selected_indices]
# [0.3, 0.2, 0.2]
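
The filtering described for this function can be sketched in pure Python as follows. top_p_top_k_sketch is a hypothetical re-creation based only on the description here (keep the most probable indices, at most top_k of them, whose cumulative probability does not exceed top_p); the library's handling of ties and edge cases may differ.

```python
from typing import List, Optional, Sequence


def top_p_top_k_sketch(
    probabilities: Sequence[float],
    top_p: Optional[float] = None,
    top_k: Optional[int] = None,
) -> List[int]:
    """Hypothetical re-creation of the top-p / top-k index filtering."""
    # Indices ordered from most to least probable (ties keep their original order).
    order = sorted(range(len(probabilities)), key=lambda i: -probabilities[i])
    if top_k is not None:
        order = order[:top_k]
    if top_p is not None:
        kept, cumulative = [], 0.0
        for index in order:
            cumulative += probabilities[index]
            if cumulative > top_p:  # including this element would exceed the threshold
                break
            kept.append(index)
        order = kept
    return order


selected = top_p_top_k_sketch([0.1, 0.2, 0.15, 0.05, 0.3, 0.2], top_p=0.75, top_k=3)
# selected == [4, 1, 5]
```

Both filters keep a prefix of the same descending ordering, so in this sketch the result is the same whichever one is applied first. Note that it returns an empty list when the single most probable element already exceeds top_p.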