sign_language_translator.models.utils module

Utility functions and classes for PyTorch models.

This module contains various utility functions and classes to assist with PyTorch models and their training.

Functions:
top_p_top_k_indexes(probabilities, top_p=None, top_k=None):

Perform top-p (nucleus) and top-k filtering based on the given probabilities.

plot_lr_scheduler(*args, lr_scheduler_class=None, lr_scheduler_object=None, initial_lr=1e-3, n_steps=20, parameter_group_number=0, **kwargs):

Plot the learning rate of a specific parameter group across training steps.

downwards_wave(n_waves, n_steps_per_wave=9, start=1e-3, end=1e-7, amplitude=0.25):

Generate a downwards wave pattern with a combination of sine wave and linear function.

set_layers_trainability_(model, layers_to_unfreeze=None, layers_to_freeze=None):

Set the trainability of specified layers in the given PyTorch model.

Classes:
FullyLambdaLR(torch.optim.lr_scheduler.LRScheduler):

Sets the learning rate of each parameter group to a given function that takes step_num, base_lr & last_lr as args.

VideoEmbeddingPipeline(slt.models.VideoEmbeddingModel):

With optional multiprocessing, reads video files from paths, performs forward pass of a model on them and saves the output in specified format.

class sign_language_translator.models.utils.FullyLambdaLR(optimizer, lr_lambda: Callable[[int, float, float], float], last_epoch=-1, verbose=False)[source]

Bases: LRScheduler

Sets the learning rate of each parameter group to a given function that takes step_num, base_lr, and last_lr as parameters. When last_epoch=-1, sets initial lr as lr.

Parameters:
  • optimizer (Optimizer) – Wrapped optimizer.

  • lr_lambda (function) – A function which computes the learning rate given an integer parameter step_num, initial learning rate and previous learning rate for each group in optimizer.param_groups.

  • last_epoch (int) – The index of last epoch. Default: -1.

  • verbose (bool) – If True, prints a message to stdout for each update. Default: False.

Example:

scheduler = FullyLambdaLR(
    optimizer,
    lambda step_num, base_lr, last_lr: last_lr * (1.08 if step_num%2 == 0 else 0.8)
)
for epoch in range(100):
    train(...)
    validate(...)
    scheduler.step()
    print(scheduler.get_last_lr()[0])
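To see what a callback with this signature produces, the recurrence can be traced by hand without PyTorch. The sketch below is not the library's implementation: trace_lr is a hypothetical helper, and it assumes that the first update is step 1 and that the rate before any update equals base_lr.

```python
from typing import Callable, List


def trace_lr(
    lr_lambda: Callable[[int, float, float], float],
    base_lr: float,
    n_steps: int,
) -> List[float]:
    """Hypothetical helper: apply lr_lambda(step_num, base_lr, last_lr) repeatedly."""
    lrs = [base_lr]  # assumption: the rate before any update is base_lr
    for step_num in range(1, n_steps + 1):  # assumption: the first update is step 1
        lrs.append(lr_lambda(step_num, base_lr, lrs[-1]))
    return lrs


# Same callback as in the example above: grow by 8% on even steps, shrink by 20% on odd ones.
schedule = trace_lr(
    lambda step_num, base_lr, last_lr: last_lr * (1.08 if step_num % 2 == 0 else 0.8),
    base_lr=1e-3,
    n_steps=4,
)
print(schedule)
```

Because each value depends on the previous one, a callback like this can express multiplicative schedules that a function of the step number alone would have to compute in closed form.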
get_lr()[source]
class sign_language_translator.models.utils.VideoEmbeddingPipeline(model: VideoEmbeddingModel)[source]

Bases: object

A class for processing and embedding video data using a slt.models.VideoEmbeddingModel.

Parameters:

model (VideoEmbeddingModel) – An instance of the VideoEmbeddingModel class or its child class used for embedding.

process_video(path, save_format='csv', overwrite=False, output_dir='.', **kwargs)[source]

Load, embed, and save a video’s embedding. kwargs are passed to model.embed().

process_videos_parallel(path_patterns, n_processes=2, save_format="csv", overwrite=False, output_dir=".", **kwargs):

Process multiple videos in parallel using multiprocessing. kwargs are passed to model.embed().

model

The VideoEmbeddingModel instance used for embedding.

Type:

VideoEmbeddingModel

process_video(path, save_format='csv', overwrite=False, output_dir='.', **kwargs)[source]

Load a video, embed its frames, and save the embedding.

Parameters:
  • path (str) – The path to the video file.

  • save_format (str, optional) – Format for saving the embedding ("csv", "torch", "npy", "npz").

  • overwrite (bool, optional) – Whether to overwrite existing embedding files.

  • output_dir (str, optional) – Directory to save the embedding file.

  • **kwargs – Additional keyword arguments for embedding model.

Returns:

None

process_videos_parallel(path_patterns: List[str], n_processes=2, save_format='csv', overwrite=False, output_dir='.', **kwargs)[source]

Process multiple videos in parallel using multiprocessing, embedding their frames.

Parameters:
  • path_patterns (list) – List of file path patterns to match videos, e.g. ["dataset/*.mp4", "dataset/*.avi"].

  • n_processes (int, optional) – Number of parallel processes. Defaults to 2.

  • save_format (str, optional) – Format for saving the embeddings ("csv", "torch", "npy", "npz").

  • overwrite (bool, optional) – Whether to overwrite existing embedding files.

  • output_dir (str, optional) – Directory to save the embedding files.

  • **kwargs – Additional keyword arguments for embedding model.

Returns:

None
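The path_patterns argument is a list of file path patterns, presumably glob-style. As a rough illustration of how such patterns can be resolved into a flat list of files before the work is distributed, here is a sketch using only the standard library. expand_patterns is a hypothetical helper, not part of the package, and the sorting and de-duplication are assumptions.

```python
import glob
import os
import tempfile
from typing import List


def expand_patterns(path_patterns: List[str]) -> List[str]:
    """Hypothetical helper: resolve glob patterns to a sorted, de-duplicated file list."""
    paths = set()
    for pattern in path_patterns:
        paths.update(glob.glob(pattern))
    return sorted(paths)


# Demonstrate on a throwaway directory holding empty placeholder files.
with tempfile.TemporaryDirectory() as dataset:
    for name in ("a.mp4", "b.avi", "notes.txt"):
        open(os.path.join(dataset, name), "w").close()
    patterns = [os.path.join(dataset, "*.mp4"), os.path.join(dataset, "*.avi")]
    found = [os.path.basename(p) for p in expand_patterns(patterns)]

print(found)
```

Files that match none of the patterns (notes.txt here) are never picked up.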

sign_language_translator.models.utils.downwards_wave(n_waves: int, n_steps_per_wave: int = 9, start: float = 0.001, end: float = 1e-07, amplitude: float = 0.25) → ndarray[source]

Generate a downwards wave pattern with a combination of sine wave and linear function.

The function generates a sequence of points forming a downward wave pattern, which combines a sine wave with a linear function. The sine wave is modulated by the linear function, so the level around which the wave oscillates decreases gradually from start to end.

Parameters:
  • n_waves (int) – Number of peaks/cycles to generate.

  • n_steps_per_wave (int, optional) – Number of steps per wave. Default is 9.

  • start (float, optional) – Starting value for the linear function. Default is 1e-3.

  • end (float, optional) – Ending value for the linear function. Default is 1e-7.

  • amplitude (float, optional) – Amplitude of the sine wave. Default is 0.25.

Returns:

Array containing the y-axis values of downwards wave pattern.

Return type:

numpy.ndarray
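The exact formula is not given here, so the following is only one plausible reading of "a sine wave modulated by a linear function", written in plain Python. downwards_wave_sketch is a hypothetical name. The multiplicative form, the phase of the sine, and the list return type (the real function returns a numpy.ndarray) are all assumptions.

```python
import math
from typing import List


def downwards_wave_sketch(
    n_waves: int,
    n_steps_per_wave: int = 9,
    start: float = 1e-3,
    end: float = 1e-7,
    amplitude: float = 0.25,
) -> List[float]:
    """Illustrative only: a linear ramp from start to end, scaled by a sine ripple."""
    n_points = n_waves * n_steps_per_wave
    values = []
    for i in range(n_points):
        # Linear baseline falling from `start` (i = 0) to `end` (last point).
        fraction = i / max(n_points - 1, 1)
        baseline = start + (end - start) * fraction
        # One full sine cycle every `n_steps_per_wave` points (assumed phase: starts at 0).
        ripple = math.sin(2 * math.pi * i / n_steps_per_wave)
        values.append(baseline * (1 + amplitude * ripple))
    return values


wave = downwards_wave_sketch(n_waves=3)
print(len(wave), wave[0], wave[-1])
```

With amplitude below 1 the values stay positive, which matters if the sequence is used as a learning-rate schedule.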

sign_language_translator.models.utils.plot_lr_scheduler(*args, lr_scheduler_class: Type[LRScheduler] | None = None, lr_scheduler_object: LRScheduler | None = None, initial_lr: float = 0.001, n_steps: int = 20, parameter_group_number: int = 0, save_fig: bool = False, fig_name: str | None = None, **kwargs)[source]

Plot the learning rate of a specific parameter group across training steps.

This function generates a plot to visualize how the learning rate of a specified parameter group changes across training steps. It requires either an existing lr_scheduler_object or a combination of lr_scheduler_class, args, and kwargs to create a new learning rate scheduler.

Parameters:
  • lr_scheduler_class (Type[torch.optim.lr_scheduler.LRScheduler], optional) – The class of the learning rate scheduler. Defaults to None.

  • lr_scheduler_object (torch.optim.lr_scheduler.LRScheduler, optional) – An existing object of a learning rate scheduler class. Defaults to None.

  • initial_lr (float, optional) – The initial learning rate for the new optimizer object needed in case lr_scheduler_object is None. Defaults to 1e-3.

  • n_steps (int, optional) – The number of epochs/steps to visualize the learning rate changes. Defaults to 20.

  • parameter_group_number (int, optional) – The index for the optimizer’s parameter group to plot the learning rate for. Defaults to 0.

  • save_fig (bool, optional) – Whether to save the plot instead of showing. Defaults to False.

  • fig_name (str, optional) – The name of the file to save the plot to. Defaults to None (the class name of the lr_scheduler_class).

  • *args – Additional arguments to pass to the lr_scheduler_class when creating a new scheduler.

  • **kwargs – Additional keyword arguments to pass to the lr_scheduler_class when creating a new scheduler.

Example:

# Using an existing learning rate scheduler object
lr_scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1)
plot_lr_scheduler(lr_scheduler_object=lr_scheduler, n_steps=100)

# Creating a new learning rate scheduler object
plot_lr_scheduler(
    lr_scheduler_class=torch.optim.lr_scheduler.ExponentialLR,
    initial_lr=0.01,
    gamma=0.9,
    n_steps=50,
)
sign_language_translator.models.utils.set_layers_trainability_(model: Module, layers_to_unfreeze: List[str] | None = None, layers_to_freeze: List[str] | None = None)[source]

Set the trainability of specified layers in the given PyTorch model.

This function allows you to selectively freeze or unfreeze specific layers of a PyTorch model by setting their requires_grad attribute accordingly.

Parameters:
  • model (torch.nn.Module) – The PyTorch model whose layers’ requires_grad will be modified.

  • layers_to_unfreeze (List[str] | None, optional) – A list of layer names or prefixes for layers that you want to unfreeze. If None, no layers will be unfrozen. If [""], all layers will be unfrozen. Default is None.

  • layers_to_freeze (List[str] | None, optional) – A list of layer names or prefixes for layers that you want to freeze. If None, no layers will be frozen. If [""], all layers will be frozen. Default is None.

Returns:

This function modifies the model in-place. It does not return anything.

Return type:

None

Note

  • If both layers_to_unfreeze and layers_to_freeze are None or empty, no action will be taken, and the function will return immediately.

  • The layers’ names or prefixes specified in the lists should match the names as returned by model.named_parameters().

Examples:

# To freeze all layers in the model:
set_layers_trainability_(model, layers_to_freeze=[""])

# To unfreeze the layers with names starting with 'classifier' and 'fc':
set_layers_trainability_(model, layers_to_unfreeze=["classifier", "fc"])

# To unfreeze all layers:
set_layers_trainability_(model, layers_to_unfreeze=[""])
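The prefix matching described above can be sketched without PyTorch by operating on (name, parameter) pairs like those that model.named_parameters() yields. set_trainability_sketch is a hypothetical stand-in, not the library function. The order in which freezing and unfreezing are applied is an assumption.

```python
from types import SimpleNamespace
from typing import Iterable, List, Optional, Tuple


def set_trainability_sketch(
    named_parameters: Iterable[Tuple[str, object]],
    layers_to_unfreeze: Optional[List[str]] = None,
    layers_to_freeze: Optional[List[str]] = None,
) -> None:
    """Illustrative only: toggle requires_grad on parameters whose name starts with a prefix."""
    unfreeze = tuple(layers_to_unfreeze or ())
    freeze = tuple(layers_to_freeze or ())
    if not unfreeze and not freeze:
        return  # nothing requested
    for name, param in named_parameters:
        # Assumption: freezing is applied first, so an unfreeze prefix wins on overlap.
        if freeze and name.startswith(freeze):
            param.requires_grad = False
        if unfreeze and name.startswith(unfreeze):
            param.requires_grad = True


# Stand-ins for the (name, parameter) pairs that model.named_parameters() yields.
params = {
    "encoder.weight": SimpleNamespace(requires_grad=True),
    "encoder.bias": SimpleNamespace(requires_grad=True),
    "classifier.weight": SimpleNamespace(requires_grad=True),
}
# Freeze everything ("" is a prefix of every name), then re-enable the classifier.
set_trainability_sketch(params.items(), layers_to_unfreeze=["classifier"], layers_to_freeze=[""])
print({name: p.requires_grad for name, p in params.items()})
```

The empty string matches every parameter name because every string starts with "", which is why [""] selects all layers.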
sign_language_translator.models.utils.top_p_top_k_indexes(probabilities: Iterable[float], top_p: float | None = None, top_k: int | None = None) → List[int][source]

Perform top-p (nucleus) and top-k filtering based on the given probabilities. Top-k returns the indices of the k most probable elements. Top-p returns the indices of the most probable elements whose cumulative probability does not exceed top_p.

Parameters:
  • probabilities (Iterable[float]) – A 1-D iterable containing the probabilities of each element. Probabilities must sum to 1.

  • top_p (float or None) – The threshold probability for top-p sampling (0 to 1). If None, top-p sampling is not applied.

  • top_k (int or None) – The maximum number of elements to keep for top-k sampling. If None, top-k sampling is not applied.

Returns:

The indices of the selected elements.

Return type:

List[int]

Examples:

selected_indices = top_p_top_k_indexes(
    probabilities=[0.1, 0.2, 0.15, 0.05, 0.3, 0.2],
    top_p=0.75,
    top_k=3,
)
# [4, 1, 5]

# You can then use the `selected_indices` to gather
# the actual elements from the original tensor.
import torch
probs = torch.tensor([0.1, 0.2, 0.15, 0.05, 0.3, 0.2])
sampled_elements = probs[selected_indices]
# tensor([0.3000, 0.2000, 0.2000])
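For readers who want to see the selection logic spelled out, here is a minimal pure-Python sketch of combined top-k and top-p filtering. It is not the library's code: top_p_top_k_sketch is a hypothetical name, and applying top-k before top-p, the tie-breaking, and always keeping at least one element are assumptions.

```python
from typing import List, Optional, Sequence


def top_p_top_k_sketch(
    probabilities: Sequence[float],
    top_p: Optional[float] = None,
    top_k: Optional[int] = None,
) -> List[int]:
    """Illustrative only: indices kept after top-k and top-p filtering."""
    # Indices ordered by descending probability; sorted() is stable, so ties keep input order.
    order = sorted(range(len(probabilities)), key=lambda i: -probabilities[i])
    if top_k is not None:
        order = order[:top_k]
    if top_p is not None:
        kept, cumulative = [], 0.0
        for i in order:
            cumulative += probabilities[i]
            # Assumption: stop once the running sum would exceed top_p,
            # but always keep at least the single most likely element.
            if kept and cumulative > top_p + 1e-9:
                break
            kept.append(i)
        order = kept
    return order


indices = top_p_top_k_sketch([0.1, 0.2, 0.15, 0.05, 0.3, 0.2], top_p=0.75, top_k=3)
print(indices)
```

On the probabilities from the example above, the three most likely elements sum to 0.7, which is within top_p=0.75, so both filters agree and three indices are kept.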