sign_language_translator.utils package

Submodules

Module contents

Utils

This module provides utility functions for the sign language translator package.

Functions

  • download: A function for downloading files from URLs.

  • tree: A function for printing a directory tree.

  • sample_one_index: Select an index based on the given probability distribution.

  • search_in_values_to_retrieve_key: Searches inside every dict value and returns the key on a match.

  • in_jupyter_notebook: Checks if the code is running in a Jupyter notebook.

  • is_regex: Checks if the given string is a regex or a regular string.

  • linear_interpolation: Computes intermediate values inside an array.

  • threaded_map: Multi-threaded mapping of a function to an iterable.

  • extract_recursive: Recursively extracts values associated with a specified key from a nested dictionary.

Classes

  • ArrayOps: A class for array operations agnostic to numpy.ndarray and torch.Tensor.

  • Archive: A utility class for making, viewing and extracting archive files such as .zip files.

  • PrintableEnumMeta: A metaclass for making enum classes printable with the class members.

  • ProgressStatusCallback: A class for updating a tqdm progress bar inside a function.

class sign_language_translator.utils.Archive[source]

Bases: object

This utility class provides static methods for creating, listing, and extracting files from ZIP archives.

Methods:

  • create(filename_or_patterns: str | List[str], archive_path: str, compression=zipfile.ZIP_DEFLATED, progress_bar=True, overwrite=False)

    Create a ZIP archive from files matching the specified pattern.

  • list(archive_path: str, pattern="*", regex: str = r".*") -> List[str]

    List the files in a ZIP archive, optionally filtered by a glob pattern or regex.

  • extract(archive_path: str, pattern: str = "*", regex: str | re.Pattern = r".*", output_dir: str = ".", overwrite=False, progress_bar=True, leave=True, password: bytes = None, verbose=True) -> List[str]

    Extract files from a ZIP archive to the specified output directory, optionally filtered by file names, patterns, or regex.

Example:

from sign_language_translator.utils import Archive

# Create a ZIP archive with files matching a pattern
Archive.create("*.txt", "output_archive.zip", overwrite=True)

# List files in a ZIP archive using a pattern and a regular expression
files = Archive.list("input_archive.zip", pattern="file_*.txt", regex=r"file_\d\.txt")
print(files)

# Extract files from a ZIP archive to a specified directory
extracted_files = Archive.extract("input_archive.zip", pattern="*.txt", output_dir="output_dir", overwrite=True)
print(extracted_files)

Note

  • For file patterns, this class uses glob-style patterns e.g. “*.mp4”.

  • When extracting files, warnings are issued for skipped files with the same base name.

static create(filename_or_patterns: str | List[str], archive_path: str, compression=8, progress_bar=True, overwrite=False)[source]

Create a zip archive from files matching the given pattern.

Parameters:
  • filename_or_patterns (str | List[str]) – Files or Unix shell-style patterns matching the files to include in the archive.

  • archive_path (str) – Path to the output zip archive.

  • compression (int, optional) – Compression method (default is zipfile.ZIP_DEFLATED).

  • progress_bar (bool, optional) – Show a progress bar during creation (default is True).

  • overwrite (bool, optional) – Overwrite existing archive (default is False).

Raises:

FileExistsError – If the archive_path already exists and overwrite is False.

static extract(archive_path: str, pattern: str = '*', regex: str | Pattern = '.*', output_dir: str = '.', overwrite=False, progress_bar=True, leave=True, password: bytes | None = None, verbose=True) List[str][source]

Extract specified files from a zip archive. Only those files are extracted that match the regex AND the pattern.

Parameters:
  • archive_path (str) – Path to the zip archive.

  • pattern (str) – Unix shell-style wildcard pattern that specifies the files to extract (default is “*”).

  • regex (str | re.Pattern) – Regular expression pattern that specifies the files to extract (default is “.*”).

  • output_dir (str, optional) – Directory to extract files into (Default is “.”).

  • overwrite (bool, optional) – Overwrite existing files during extraction (default is False).

  • progress_bar (bool, optional) – Show a progress bar during extraction (default is True).

  • leave (bool, optional) – Leave progress bar displayed upon completion (default is True).

  • password (bytes, optional) – Password for encrypted archives (default is None).

  • verbose (bool, optional) – Raise warnings for skipped existing files (default is True).

Returns:

List of paths to the extracted files and the already extracted matching files.

Return type:

List[str]

static list(archive_path: str, pattern: str = '*', regex: str | Pattern = '.*') List[str][source]

List files in the zip archive filtered by the specified pattern or regex.

Parameters:
  • archive_path (str) – Path to the zip archive.

  • pattern (str) – Unix shell-style wildcard pattern to filter the contents (default is “*”).

  • regex (str | re.Pattern) – Regular expression pattern to filter the contents (default is “.*”).

Returns:

List of file names in the archive that match the criteria.

Return type:

List[str]
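
The combined glob and regex filtering described for list and extract can be approximated with the standard library alone. The following is a minimal sketch, not the package's implementation: the helper name list_matching is hypothetical, and it assumes that a member must match both the glob pattern and the regex (as stated for extract), with the regex anchored at the start of the name through re.match.

```python
import fnmatch
import io
import re
import zipfile
from typing import List


def list_matching(archive, pattern: str = "*", regex: str = r".*") -> List[str]:
    """Hypothetical helper: names in a ZIP that match the glob AND the regex."""
    compiled = re.compile(regex)
    with zipfile.ZipFile(archive) as zf:
        return [
            name
            for name in zf.namelist()
            if fnmatch.fnmatchcase(name, pattern) and compiled.match(name)
        ]


# Build a small in-memory archive so the sketch runs without touching the disk.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    for name in ("file_1.txt", "file_22.txt", "notes.md"):
        zf.writestr(name, "example")

buffer.seek(0)
matches = list_matching(buffer, pattern="file_*.txt", regex=r"file_\d\.txt")
print(matches)
```

Here file_22.txt passes the glob but fails the regex, so only names that satisfy both filters are kept.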

class sign_language_translator.utils.ArrayOps[source]

Bases: object

static abs(x: ndarray[Any, dtype[_ScalarType_co]]) ndarray[Any, dtype[_ScalarType_co]][source]
static abs(x: Tensor) Tensor

Compute the absolute value of a given array or tensor.

Parameters:

x (Union[NDArray, Tensor]) – The input array or tensor.

Returns:

The absolute value of the input array or tensor.

Return type:

Union[NDArray, Tensor]

Raises:

TypeError – If the input type is not supported.

static cast(x: ndarray[Any, dtype[_ScalarType_co]] | Tensor | Sequence[float | int], data_type: Type[ndarray | Tensor], _dtype: Type[dtype] | Type[dtype] | Type | None = None) ndarray[Any, dtype[_ScalarType_co]] | Tensor[source]

Typecast some multidimensional data to numpy array or torch Tensor.

Parameters:
  • x (Union[NDArray, Tensor, Sequence[Union[float, int]]]) – The input array or tensor.

  • data_type (Type[Union[np.ndarray, Tensor]]) – The data type to cast the input array or tensor to.

  • _dtype (Optional[Union[Type[torch.dtype], Type[np.dtype], Type]], optional) – The new data type of the values inside the array. None means original dtype is kept. Defaults to None.

Raises:

ValueError – If the data_type is not np.ndarray or Tensor.

Returns:

The casted array or tensor.

Return type:

Union[NDArray, Tensor]

static ceil(array: ndarray[Any, dtype[_ScalarType_co]] | Tensor | Sequence[float | int] | float | int) ndarray[Any, dtype[_ScalarType_co]] | Tensor[source]
static concatenate(arrays: Sequence[ndarray[Any, dtype[_ScalarType_co]]], dim: int = 0) ndarray[Any, dtype[_ScalarType_co]][source]
static concatenate(arrays: Sequence[Tensor], dim: int = 0) Tensor

Concatenate a sequence of arrays or tensors along a specified dimension.

Parameters:
  • arrays (Union[Sequence[NDArray], Sequence[Tensor]]) – The sequence of arrays or tensors to concatenate.

  • dim (int, optional) – The dimension along which to concatenate the arrays or tensors. Default is 0.

Returns:

The concatenated array or tensor.

Return type:

Union[NDArray, Tensor]

Raises:

TypeError – If the input type is not supported.

static copy(x: ndarray[Any, dtype[_ScalarType_co]]) ndarray[Any, dtype[_ScalarType_co]][source]
static copy(x: Tensor) Tensor

Create a copy of a given array or tensor.

Parameters:

x (Union[NDArray, Tensor]) – The input array or tensor.

Returns:

A deep copy of the input array or tensor.

Return type:

Union[NDArray, Tensor]

Raises:

TypeError – If the input type is not supported.

static floor(array: ndarray[Any, dtype[_ScalarType_co]] | Tensor | Sequence[float | int] | float | int) ndarray[Any, dtype[_ScalarType_co]] | Tensor[source]
static linspace(start: float | int, end: float | int, n_steps: int, data_type: ~typing.Type[~numpy.ndarray | ~torch.Tensor] = <class 'numpy.ndarray'>, endpoint=True) ndarray[Any, dtype[_ScalarType_co]] | Tensor[source]

Generate an array or tensor with equally spaced values between start and end.

Parameters:
  • start (Union[float, int]) – The starting value of the sequence. The value is inclusive.

  • end (Union[float, int]) – The end value of the sequence. The value is inclusive if endpoint is True.

  • n_steps (int) – The number of samples to generate. Must be non-negative.

  • data_type (Type[Union[np.ndarray, Tensor]], optional) – The data type of the output array. Defaults to np.ndarray.

  • endpoint (bool, optional) – Whether to include the end value in the sequence. Defaults to True.

Raises:

ValueError – If data_type is not np.ndarray or Tensor.

Returns:

The generated array or tensor.

Return type:

Union[NDArray, Tensor]
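
The effect of the endpoint flag is easiest to see in a few lines of plain Python. This is only an illustrative sketch under the assumption that the spacing is (end - start) / (n_steps - 1) when the end value is included and (end - start) / n_steps otherwise; the function name linspace_sketch is hypothetical and it returns a list rather than an array or tensor.

```python
from typing import List


def linspace_sketch(start: float, end: float, n_steps: int, endpoint: bool = True) -> List[float]:
    """Hypothetical helper: equally spaced values from start towards end."""
    if n_steps < 0:
        raise ValueError("n_steps must be non-negative")
    if n_steps == 0:
        return []
    if n_steps == 1:
        return [float(start)]
    # Including the end value means one fewer gap than there are samples.
    divisor = n_steps - 1 if endpoint else n_steps
    step = (end - start) / divisor
    return [start + i * step for i in range(n_steps)]


with_end = linspace_sketch(0, 1, 5)                     # end value included
without_end = linspace_sketch(0, 1, 4, endpoint=False)  # end value excluded
print(with_end)
print(without_end)
```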

static norm(x: ndarray[Any, dtype[_ScalarType_co]] | Tensor | Sequence[float | int], dim: int | None = None, keepdim=False) ndarray[Any, dtype[_ScalarType_co]] | Tensor[source]

Compute the norm of a given array or tensor along a specified dimension.

Parameters:
  • x (Union[NDArray, Tensor, Sequence[Union[float, int]]]) – The input array or tensor.

  • dim (Optional[int]) – The dimension along which to compute the norm. If None, the norm is computed over the entire array or tensor. Default is None.

  • keepdim (bool) – Whether to keep the dimension of the input array or tensor after computing the norm. Default is False.

Returns:

The norm of the input array or tensor.

Return type:

Union[NDArray, Tensor]

Raises:

TypeError – If the input type is not supported.

static random_normal(size: ~typing.Sequence[int], loc: float | int = 0, scale: float | int = 1, start: float | int = -inf, end: float | int = inf, data_type: ~typing.Type[~numpy.ndarray | ~torch.Tensor] = <class 'numpy.ndarray'>) ndarray[Any, dtype[_ScalarType_co]] | Tensor[source]

Generate an array or tensor of the specified shape filled with random values from a normal (Gaussian) distribution. Optionally truncate the distribution to the range [start, end].

Parameters:
  • size (Sequence[int]) – The shape of the output array or tensor.

  • loc (Union[float, int], optional) – The mean (“centre”) of the distribution. Defaults to 0.

  • scale (Union[float, int], optional) – The standard deviation (spread or “width”) of the distribution. Must be non-negative. Defaults to 1.

  • start (Union[float, int], optional) – The lower bound of the distribution. Defaults to float(“-inf”).

  • end (Union[float, int], optional) – The upper bound of the distribution. Defaults to float(“inf”).

  • data_type (Type[Union[np.ndarray, Tensor]], optional) – The data type of the output array. Defaults to np.ndarray.

Raises:

ValueError – If data_type is not np.ndarray or torch.Tensor.

Returns:

The random values filled array or tensor.

Return type:

Union[NDArray, Tensor]

Note

Uses torch’s random number generator to generate random values even for NumPy arrays.

static random_uniform(size: ~typing.Sequence[int], start: float | int = 0, end: float | int = 1, data_type: ~typing.Type[~numpy.ndarray | ~torch.Tensor] = <class 'numpy.ndarray'>) ndarray[Any, dtype[_ScalarType_co]] | Tensor[source]

Generate a random array of the specified size with values uniformly distributed between [start, end).

Parameters:
  • size (Sequence[int]) – The shape of the output array or tensor.

  • start (Union[float, int], optional) – The lower bound of the uniform distribution. The value is inclusive. Defaults to 0.

  • end (Union[float, int], optional) – The upper bound of the uniform distribution. The value is exclusive. Defaults to 1.

  • data_type (Type[Union[np.ndarray, Tensor]], optional) – The data type of the output array. Defaults to np.ndarray.

Raises:

ValueError – If data_type is not np.ndarray or Tensor.

Returns:

The array or tensor filled with random values.

Return type:

Union[NDArray, Tensor]

Note

Uses torch’s random number generator to generate random values even for NumPy arrays.

static steps(n_steps: int, anchors: Tensor = torch.Tensor([0, -1, 2]), random_uniform_frac: float = 0.2, random_normal_frac: float = 0.3, n_clusters: int = 1, cluster_std: float | None = None, anchor_spacing_blend: float = 0.5) Tensor[source]
static steps(n_steps: int, anchors: ndarray[Any, dtype[_ScalarType_co]] | Sequence[float | int] = (0, 1), random_uniform_frac: float = 0.2, random_normal_frac: float = 0.3, n_clusters: int = 1, cluster_std: float | None = None, anchor_spacing_blend: float = 0.5) ndarray[Any, dtype[_ScalarType_co]]

Generates a sequence of steps based on a combination of linear interpolation, random uniform distribution, and random normal distribution.

Parameters:
  • n_steps (int) – The total number of steps to generate.

  • anchors (Union[NDArray, Tensor, Sequence[Union[float, int]]], optional) – The points between & through which the steps are interpolated. Defaults to (0, 1).

  • random_uniform_frac (float, optional) – The fraction of steps generated using a random uniform distribution. Must be between 0 and 1. Defaults to 0.2.

  • random_normal_frac (float, optional) – The fraction of steps generated using a random normal distribution. Must be between 0 and 1. Defaults to 0.3.

  • n_clusters (int, optional) – The number of concentrated spots to add using the random normal (gaussian distribution) steps (around cluster centroids selected from a uniform distribution). Defaults to 1.

  • cluster_std (Optional[float], optional) – The standard deviation (spread) of the normal distribution generating the concentrated spots. If None, it is calculated based on the anchor gap and number of clusters (std(gaps)/10/n_clusters). Defaults to None.

  • anchor_spacing_blend (float, optional) – A blend factor between equal anchor spacing (1) and spacing based on the distances between anchor points (0). Defaults to 0.5.

Raises:

ValueError – If the sum of random_uniform_frac and random_normal_frac exceeds 1, or if either is negative.

Returns:

The generated sequence of steps.

Return type:

Union[NDArray, Tensor]

Examples:

import torch
from sign_language_translator.utils import ArrayOps

# you should plot the following arrays on a graph for better understanding
anchors = [0, 1, -2, 0, 5, 2]

# Basic linear interpolation with no randomness and anchor spacing based on distances
steps = ArrayOps.steps(9, anchors, 0, 0, 0, 0, anchor_spacing_blend=0)
# array([ 0.  ,  0.25, -1.5 , -0.75,  1.  ,  2.75,  4.5 ,  3.75,  2.  ])

# Linear interpolation with no randomness and equal anchor spacing
steps = ArrayOps.steps(9, anchors, 0, 0, 0, 0, anchor_spacing_blend=1)
# array([ 0.   ,  0.625,  0.25 , -1.625, -1.   ,  0.625,  3.75 ,  3.875,  2.   ])

# A blend of equal and distance-based anchor spacing with no randomness
steps = ArrayOps.steps(9, torch.Tensor(anchors), 0, 0, 0, 0, anchor_spacing_blend=0.5)
# Tensor([ 0.   ,  0.921, -0.655, -1.625, -0.167,  1.987,  4.231,  3.81 ,  2.   ])

# Adding uniform randomness to the steps
steps = ArrayOps.steps(9, anchors, 0.5, 0, 0, 0, anchor_spacing_blend=1)
# array([ 0.   ,  0.25 , -1.   , -0.895,  0.214,  1.346,  3.75 ,  3.777,  2.   ])

# Adding 2 concentration spots using gaussian randomness
steps = ArrayOps.steps(9, anchors, 0, 0.5, 2, 0.1, anchor_spacing_blend=0)
# array([ 0.   ,  0.99 ,  0.924, -1.5  ,  1.   ,  4.5  ,  4.872,  4.025,  2.   ])

# Combining uniform and normal randomness
steps = ArrayOps.steps(9, anchors, 0.2, 0.3, 2, 0.1, anchor_spacing_blend=0.5)
# array([ 0.   ,  0.069, -1.333,  0.468,  1.538,  3.835,  4.897,  4.267,  2.   ])
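
The deterministic part of steps (both random fractions set to 0) can be reproduced in plain Python. The sketch below assumes that anchor_spacing_blend linearly mixes two sets of anchor positions on [0, 1]: equally spaced positions (weight anchor_spacing_blend) and positions proportional to the cumulative absolute gaps between anchors (weight 1 - anchor_spacing_blend). That assumption reproduces the deterministic outputs listed above for these anchors, but it is not taken from the library source; the name blended_anchor_steps is hypothetical.

```python
from typing import List, Sequence


def blended_anchor_steps(n_steps: int, anchors: Sequence[float], anchor_spacing_blend: float) -> List[float]:
    """Hypothetical helper: interpolate through anchors placed at blended positions.

    Assumes n_steps >= 2, at least two anchors, and no zero-width segment.
    """
    n = len(anchors)
    equal_pos = [i / (n - 1) for i in range(n)]
    gaps = [abs(b - a) for a, b in zip(anchors, anchors[1:])]
    total = sum(gaps)
    distance_pos = [0.0]
    for gap in gaps:
        distance_pos.append(distance_pos[-1] + gap / total)
    positions = [
        anchor_spacing_blend * e + (1 - anchor_spacing_blend) * d
        for e, d in zip(equal_pos, distance_pos)
    ]

    steps = []
    for i in range(n_steps):
        t = i / (n_steps - 1)
        # Find the segment [positions[j], positions[j + 1]] that contains t.
        j = 0
        while j < n - 2 and t > positions[j + 1]:
            j += 1
        frac = (t - positions[j]) / (positions[j + 1] - positions[j])
        steps.append(anchors[j] + frac * (anchors[j + 1] - anchors[j]))
    return steps


anchors = [0, 1, -2, 0, 5, 2]
equal_spacing = blended_anchor_steps(9, anchors, 1)
distance_spacing = blended_anchor_steps(9, anchors, 0)
half_blend = blended_anchor_steps(9, anchors, 0.5)
print([round(value, 3) for value in half_blend])
```
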
static svd(x: ndarray[Any, dtype[_ScalarType_co]] | Tensor | Sequence[Sequence[float | int]]) Tuple[ndarray[Any, dtype[_ScalarType_co]] | Tensor, ndarray[Any, dtype[_ScalarType_co]] | Tensor, ndarray[Any, dtype[_ScalarType_co]] | Tensor][source]

Compute the singular value decomposition of a given array or tensor.

Parameters:

x (Union[NDArray, Tensor, Sequence[Sequence[Union[float, int]]]]) – The input array or tensor.

Returns:

The (Rotation, coordinate scaling, reflection) matrices of the input array or tensor.

Return type:

Tuple[Union[NDArray, Tensor], Union[NDArray, Tensor], Union[NDArray, Tensor]]

Raises:

TypeError – If the input type is not supported.

static take(array: ndarray[Any, dtype[_ScalarType_co]] | Tensor | List, index: ndarray[Any, dtype[_ScalarType_co]] | Tensor | List | int, dim: int = 0) ndarray[Any, dtype[_ScalarType_co]] | Tensor[source]
static top_k(x: ndarray[Any, dtype[_ScalarType_co]] | Tensor | Sequence[float | int], k: int, dim: int = -1, largest=True) Tuple[ndarray[Any, dtype[_ScalarType_co]], ndarray[Any, dtype[_ScalarType_co]]] | Tuple[Tensor, Tensor][source]

Compute the top k values and their indices along a specified dimension of a given array or tensor.

Parameters:
  • x (Union[NDArray, Tensor, Sequence[Union[float, int]]]) – The input array or tensor.

  • k (int) – The number of top values to return.

  • dim (int, optional) – The dimension along which to compute the top k values. Default is -1.

  • largest (bool, optional) – Whether to return the largest or smallest k values. Default is True.

Returns:

The top k values and their indices along the specified dimension.

Return type:

Tuple[Union[NDArray, Tensor], Union[NDArray, Tensor]]

Raises:

TypeError – If the input type is not supported.

class sign_language_translator.utils.PrintableEnumMeta(cls, bases, classdict, **kwds)[source]

Bases: EnumMeta

Metaclass for customizing the string representation of Enum classes.

This metaclass overrides the __str__ & __repr__ method to provide a human-readable representation of Enum classes when they are printed. The generated string includes the class name and a formatted list of Enum members and their values.

Example:

class MyEnumClass(enum.Enum, metaclass=PrintableEnumMeta):
    MEMBER1 = "value_1"
    MEMBER2 = "value_2"

print(MyEnumClass)

# "MyEnumClass" enum class. Available values:
# 1. MEMBER1 = value_1
# 2. MEMBER2 = value_2
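
For readers curious how such a metaclass can work, here is a minimal standard-library sketch. It is not the package's source: the class name SketchPrintableEnumMeta and the exact output format are assumptions modelled on the example output above.

```python
import enum


class SketchPrintableEnumMeta(enum.EnumMeta):
    """Hypothetical metaclass: printing the enum class lists its members."""

    def __str__(cls) -> str:
        lines = [f'"{cls.__name__}" enum class. Available values:']
        # Iterating over an Enum class yields its members in definition order.
        lines += [
            f"{number}. {member.name} = {member.value}"
            for number, member in enumerate(cls, start=1)
        ]
        return "\n".join(lines)

    __repr__ = __str__


class Color(enum.Enum, metaclass=SketchPrintableEnumMeta):
    RED = "red"
    GREEN = "green"


text = str(Color)
print(text)
```

Only the class-level representation changes; individual members such as Color.RED keep their usual str and repr.
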
class sign_language_translator.utils.ProgressStatusCallback(tqdm_bar: tqdm_asyncio)[source]

Bases: object

A callback class to update a tqdm progress bar with custom status information.

Parameters:

tqdm_bar (tqdm) – The tqdm progress bar to be updated.

tqdm_bar

The tqdm progress bar associated with the callback.

Type:

tqdm

__call__(self, status: Dict[str, Any])

Update the tqdm progress bar with the provided status information.

Example:

from tqdm import tqdm
from sign_language_translator.utils import ProgressStatusCallback

# Instantiate a tqdm progress bar & callback
progress_bar = tqdm(total=100, desc='Processing')
callback = ProgressStatusCallback(tqdm_bar=progress_bar)

# Update the progress bar inside some other function
status_info = {'Epoch': 1, 'Loss': 0.123, 'Accuracy': 0.95}
callback(status_info)
sign_language_translator.utils.adjust_vector_angle(vector_1: ndarray[Any, dtype[_ScalarType_co]] | Tensor | Sequence[float], vector_2: ndarray[Any, dtype[_ScalarType_co]] | Tensor | Sequence[float], scaling_factor: float, post_normalize: bool = False) Tuple[ndarray[Any, dtype[_ScalarType_co]] | Tensor, ndarray[Any, dtype[_ScalarType_co]] | Tensor][source]

Move a pair of vectors away or towards each other in the same plane.

Converge or diverge a pair of vectors by decreasing or increasing their distance from each other. The norm (length) of the vectors is preserved.

Parameters:
  • vector_1 (NDArray | Tensor) – A 1D array of size n representing a word in an n dimensional vector space.

  • vector_2 (NDArray | Tensor) – A 1D array of size n representing another word in an n dimensional vector space.

  • scaling_factor (float) – The scaling factor by which the vector difference should be enhanced or diminished, i.e. the fraction of the distance between the vectors where the new vector should land. (sf > 1 diverges the two vectors. sf = 1 leaves the two vectors unchanged. 0.5 < sf < 1 converges the two vectors. sf = 0.5 makes the two vectors equal to their mean. sf = 0 swaps the two vectors. sf < 0.5 moves the vectors away from their mean but in the opposite direction.)

  • post_normalize (bool, optional) – Make the magnitude of both output vectors equal to 1 after they have been rotated. Defaults to False.

Returns:

The moved vectors.

Return type:

Tuple[NDArray | Tensor, NDArray | Tensor]

Notes:

# sf > 1 diverges the two vectors
# new_v1 = v2 + 2.00 * (v1 - v2) = 2 * v1 - v2     # more v1, less v2.
# new_v2 = v1 - 2.00 * (v1 - v2) = 2 * v2 - v1     # more v2, less v1.

# sf = 1 leaves the two vectors unchanged
# new_v1 = v2 + 1.00 * (v1 - v2) = v1
# new_v2 = v1 - 1.00 * (v1 - v2) = v2

# 0.5 < sf < 1 converges the two vectors
# new_v1 = v2 + 0.75 * (v1 - v2) = 0.75 * v1 + 0.25 * v2    # weighted average
# new_v2 = v1 - 0.75 * (v1 - v2) = 0.75 * v2 + 0.25 * v1    # weighted average

# sf = 0.5 makes the two vectors equal
# new_v1 = v2 + 0.50 * (v1 - v2) = 0.5 * v1 + 0.5 * v2   # mean
# new_v2 = v1 - 0.50 * (v1 - v2) = 0.5 * v2 + 0.5 * v1   # mean

# sf = 0 swaps the two vectors
# new_v1 = v2 + 0.00 * (v1 - v2) = v2
# new_v2 = v1 - 0.00 * (v1 - v2) = v1

# sf < 0.5 moves the vectors away from their mean but in the opposite direction
# new_v1 = v2 + (-1) * (v1 - v2) = 2 * v2 - v1    # more v2, less v1.
# new_v2 = v1 - (-1) * (v1 - v2) = 2 * v1 - v2    # more v1, less v2.
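
The arithmetic in these notes can be checked with a few lines of plain Python. This sketch implements only the two formulas new_v1 = v2 + sf * (v1 - v2) and new_v2 = v1 - sf * (v1 - v2); it does not reproduce the norm preservation or the post_normalize option of adjust_vector_angle, and the name move_pair is hypothetical.

```python
from typing import List, Sequence, Tuple


def move_pair(v1: Sequence[float], v2: Sequence[float], sf: float) -> Tuple[List[float], List[float]]:
    """Hypothetical helper: apply the scaling-factor formulas element-wise."""
    diff = [a - b for a, b in zip(v1, v2)]
    new_v1 = [b + sf * d for b, d in zip(v2, diff)]
    new_v2 = [a - sf * d for a, d in zip(v1, diff)]
    return new_v1, new_v2


v1, v2 = [1.0, 0.0], [0.0, 1.0]
unchanged = move_pair(v1, v2, 1.0)  # sf = 1 keeps both vectors
averaged = move_pair(v1, v2, 0.5)   # sf = 0.5 gives the mean twice
swapped = move_pair(v1, v2, 0.0)    # sf = 0 swaps the vectors
diverged = move_pair(v1, v2, 2.0)   # sf > 1 pushes them apart
print(unchanged, averaged, swapped, diverged)
```
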
sign_language_translator.utils.align_vectors(source_matrix: ndarray[Any, dtype[_ScalarType_co]] | Tensor, target_matrix: ndarray[Any, dtype[_ScalarType_co]] | Tensor, pre_normalize: bool = True) ndarray[Any, dtype[_ScalarType_co]] | Tensor[source]

Align the source matrix to the target matrix using the orthogonal transformation.

Parameters:
  • source_matrix (NDArray | Tensor) – A 2D array of shape (dictionary_length, embedding_dimension) containing word vectors from source model (or language).

  • target_matrix (NDArray | Tensor) – A 2D array of shape (dictionary_length, embedding_dimension) containing word vectors from target model (or language).

  • pre_normalize (bool, optional) – Whether to normalize the training vectors before SVD. Defaults to True.

Returns:

An orthogonal transformation which aligns the source language to the target language.

Return type:

NDArray | Tensor

Note

This function supports both NumPy arrays and PyTorch tensors as input. (Based on: https://github.com/babylonhealth/fastText_multilingual)

sign_language_translator.utils.download(file_path: str, url: str, overwrite=False, progress_bar=False, timeout: float = 20.0, leave=True, chunk_size=65536, status_callback=None) bool[source]

Downloads a file from the specified URL and saves it to the given file path.

Parameters:
  • file_path (str) – The path where the downloaded file will be saved.

  • url (str) – The URL of the file to be downloaded.

  • overwrite (bool, optional) – If False, skips downloading if the file already exists. Defaults to False.

  • progress_bar (bool, optional) – If True, displays a progress bar during the download. Defaults to False.

  • timeout (float, optional) – The maximum number of seconds to wait for a server response. Defaults to 20.0.

  • leave (bool, optional) – Whether to leave the progress bar behind after the download. Defaults to True.

  • chunk_size (int, optional) – The number of bytes to fetch in each step. Defaults to 65536.

Returns:

True if the file is downloaded successfully, False otherwise.

Return type:

bool

Raises:

FileExistsError – if overwrite is False and the destination path already contains a file.

sign_language_translator.utils.extract_recursive(data: Dict[str, Any], key: str) List[Any][source]

Recursively extracts values associated with a specified key from a nested dictionary.

Parameters:
  • data (Dict[str, Any]) – The input dictionary containing nested structures.

  • key (str) – The key for which values need to be extracted.

Returns:

A list containing all values associated with the specified key, extracted recursively from the input dictionary.

Return type:

List[Any]

Examples:

data = {'a': 1, 'b': {'c': 2, 'd': {'e': 3, 'f': 4}}, 'g': [5, {'h': 6, 'e': 7}]}
extract_recursive(data, 'e')
# [3, 7]
extract_recursive(data, 'h')
# [6]
extract_recursive(data, 'x')
# []  # Key not found, returns an empty list.
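
A recursive traversal of this kind can be sketched as follows. This is an illustration rather than the package's code: the name extract_recursive_sketch is hypothetical, and it assumes that lists and tuples are searched as well as dictionaries and that a matched value is itself searched further.

```python
from typing import Any, List


def extract_recursive_sketch(data: Any, key: str) -> List[Any]:
    """Hypothetical helper: collect every value stored under `key`, depth first."""
    found: List[Any] = []
    if isinstance(data, dict):
        for current_key, value in data.items():
            if current_key == key:
                found.append(value)
            # Keep descending, because nested containers may hold the key again.
            found.extend(extract_recursive_sketch(value, key))
    elif isinstance(data, (list, tuple)):
        for item in data:
            found.extend(extract_recursive_sketch(item, key))
    return found


data = {'a': 1, 'b': {'c': 2, 'd': {'e': 3, 'f': 4}}, 'g': [5, {'h': 6, 'e': 7}]}
values_for_e = extract_recursive_sketch(data, 'e')
values_for_x = extract_recursive_sketch(data, 'x')
print(values_for_e, values_for_x)
```
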
sign_language_translator.utils.in_jupyter_notebook()[source]

Checks if the code is running in a Jupyter notebook.

Returns:

True if running in a Jupyter notebook, False otherwise.

Return type:

bool

sign_language_translator.utils.is_internet_available() bool[source]

Hit a well-known server (Google DNS) to check for internet availability.

Returns:

True if internet is available, False otherwise.

Return type:

bool

sign_language_translator.utils.is_regex(string: str | Pattern) bool[source]

Tests whether the argument is a regex or a regular string.

Parameters:

string (str | Pattern) – The string to be tested.

Returns:

Whether the argument is a regex (True) or a regular string (False).

Return type:

bool

sign_language_translator.utils.linear_interpolation(array: ndarray[Any, dtype[number]] | Tensor | Sequence, new_x: Sequence[float | int] | ndarray[Any, dtype[number]] | Tensor, old_x: Sequence[float | int] | ndarray[Any, dtype[number]] | Tensor | None = None, dim: int = 0) ndarray[Any, dtype[_ScalarType_co]] | Tensor[source]

Perform linear interpolation on a multidimensional array or tensor along a dimension.

This function essentially connects all consecutive values in a multidimensional array with straight lines along a specified dimension, so that intermediate values can be calculated. It takes the input array, a set of new indexes or alternatively new & old coordinate values, and a dimension along which to perform interpolation.

Parameters:
  • array (NDArray[np.number] | Tensor | List) – The input array or tensor to interpolate.

  • new_x (Sequence[int | float] | NDArray[np.number] | Tensor) – The new index values or coordinate values at which to calculate the intermediate values from array. Must be 1D. Order of values does not matter. If old_x is not provided, these values are relative to the index of the data in array i.e. [0, 1, 2, …] and negative indexes are allowed. If old_x is provided, all new_x values must be within its bounds.

  • old_x (Sequence[int | float] | NDArray[np.number] | Tensor | None, optional) – The old coordinate values corresponding to the data in array along the dim. Must be 1D and strictly sorted ascending. Can contain negative numbers. If None, method assumes it to be a linear sequence starting at 0 and growing with step +1 i.e. [0, 1, 2, …] like the index of array.

  • dim (int, optional) – The dimension along which to perform interpolation. Default is 0.

Returns:

The result of linear interpolation along the specified dimension.

Return type:

NDArray | Tensor

Raises:

ValueError – If new_x or old_x is not 1 dimensional.

Examples

import numpy as np
import torch
from sign_language_translator.utils import linear_interpolation

data = np.array([1, 2, 3, 5])
new_indexes = np.array([1.5, 0.5, 2.5])
interpolated_data = linear_interpolation(data, new_indexes)
print(interpolated_data)
# array([2.5, 1.5, 4. ])

old_x = [0, 4, 4.5, 5]
new_x = [0, 1, 2, 2.5, 3, 4, 5]
interpolated_data = linear_interpolation(data, new_x, old_x=old_x)
print(interpolated_data)
# array([1.   , 1.25 , 1.5  , 1.625, 1.75 , 2.   , 5.   ])

positional_embedding_table = torch.randn(100, 768)  # (max_seq_len, embedding_dim)
intermediate_positions = torch.linspace(0, 99, 500)
new_embedding_table = linear_interpolation(positional_embedding_table, intermediate_positions, dim=0)
# new_embedding_table.shape -> (500, 768) # (new_max_seq_len, embedding_dim)

Note

This function supports both NumPy arrays and PyTorch tensors as input and preserves gradient.
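
The one-dimensional case can be written in plain Python to make the idea concrete. This is a simplified sketch, not the library routine: the name interpolate_1d is hypothetical, it works on flat lists only, and it does not support negative indexes, multiple dimensions, or gradients.

```python
import bisect
from typing import List, Optional, Sequence


def interpolate_1d(values: Sequence[float], new_x: Sequence[float],
                   old_x: Optional[Sequence[float]] = None) -> List[float]:
    """Hypothetical helper: piecewise-linear interpolation of a flat sequence."""
    if old_x is None:
        old_x = list(range(len(values)))  # default coordinates 0, 1, 2, ...
    result = []
    for x in new_x:
        # Index of the segment [old_x[j], old_x[j + 1]] that contains x.
        j = bisect.bisect_right(old_x, x) - 1
        j = min(max(j, 0), len(old_x) - 2)
        frac = (x - old_x[j]) / (old_x[j + 1] - old_x[j])
        result.append(values[j] + frac * (values[j + 1] - values[j]))
    return result


data = [1, 2, 3, 5]
by_index = interpolate_1d(data, [1.5, 0.5, 2.5])
by_coordinate = interpolate_1d(data, [0, 1, 2, 2.5, 3, 4, 5], old_x=[0, 4, 4.5, 5])
print(by_index)
print(by_coordinate)
```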

sign_language_translator.utils.sample_one_index(weights: List[float], temperature: float = 1.0) int[source]

Select an item based on the given probability distribution. Returns the index of the selected item, sampled from the weighted random distribution.

Parameters:
  • weights (List[float]) – the relative weights corresponding to each index.

  • temperature (float) – The temperature value for controlling the sampling behavior. A high temperature makes the sampling probabilities more uniform (more random picks). A low temperature makes the sampling probabilities higher for bigger weights. Defaults to 1.0.

Returns:

The index of the chosen item.

Return type:

int
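
One common way to obtain this behaviour is to raise each weight to the power 1 / temperature before sampling. Whether sample_one_index uses exactly this transformation is an assumption; the sketch below, with the hypothetical name sample_index_sketch, only shows that such a scheme matches the description for non-negative weights.

```python
import random
from typing import List, Sequence


def sample_index_sketch(weights: Sequence[float], temperature: float = 1.0) -> int:
    """Hypothetical helper: draw one index with temperature-scaled weights."""
    # Assumed scaling: a large temperature flattens the weights,
    # a small temperature exaggerates the differences between them.
    scaled: List[float] = [weight ** (1.0 / temperature) for weight in weights]
    return random.choices(range(len(weights)), weights=scaled, k=1)[0]


# With a single non-zero weight the outcome is certain, whatever the temperature.
picks = [sample_index_sketch([0.0, 5.0, 0.0], temperature=0.5) for _ in range(20)]
print(picks)
```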

sign_language_translator.utils.search_in_values_to_retrieve_key(code_name: str, class_to_codes: Dict[Any, Set[str]])[source]
sign_language_translator.utils.threaded_map(target: Callable, args_list: Iterable, time_delay=0.02, timeout: float | None = None, max_n_threads: int | None = None, progress_bar=True, leave=True)[source]

Multi-threaded mapping of a function to an iterable. Useful for I/O bound tasks.

This function allows you to apply a target function to elements in an iterable concurrently using multiple threads. You can control the number of threads, introduce time delays between thread launches, and enable a progress bar.

Parameters:
  • target (Callable) – The function to apply to each element in the iterable.

  • args_list (Iterable) – An iterable of arguments to be passed to the target function in parallel.

  • time_delay (float, optional) – Time delay (in seconds) between launching threads. Default is 0.02.

  • timeout (float, optional) – The maximum amount of time (in seconds) to wait for a thread to finish. Default is None, which means wait indefinitely.

  • max_n_threads (int, optional) – The maximum number of threads to run concurrently. Default is None, which means entire args_list will be processed concurrently.

  • progress_bar (bool, optional) – Enable or disable the progress bar. Default is True.

  • leave (bool, optional) – Whether to leave the progress bar after completion. Default is True.

Example

import requests
from sign_language_translator.utils import threaded_map

def get_webpage(url, results: dict):
    if url not in results:
        results[url] = requests.get(url)

urls = ["https://example.com", "https://github.com", ...]
results = {}
args = [(url, results) for url in urls]

# process urls concurrently, with a maximum of 2 threads at a time
threaded_map(get_webpage, args, max_n_threads=2)
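
A comparable bounded-concurrency pattern is available in the standard library through concurrent.futures.ThreadPoolExecutor. The sketch below uses that swapped-in technique rather than the package's own threading logic; the name threaded_map_sketch is hypothetical, and launch delays, timeouts and the progress bar are left out.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, Optional, Tuple


def threaded_map_sketch(target: Callable, args_list: Iterable[Tuple],
                        max_n_threads: Optional[int] = None) -> None:
    """Hypothetical helper: call target(*args) for every args tuple using a thread pool."""
    with ThreadPoolExecutor(max_workers=max_n_threads) as pool:
        futures = [pool.submit(target, *args) for args in args_list]
        for future in futures:
            future.result()  # wait for completion and re-raise worker exceptions


def store_square(number: int, results: dict) -> None:
    results[number] = number * number


results: dict = {}
threaded_map_sketch(store_square, [(n, results) for n in range(5)], max_n_threads=2)
print(results)
```

As in the example above, the workers communicate through a shared dictionary instead of return values.
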
sign_language_translator.utils.tree(cur_path: str = '.', directory_only=True, extra_line=True, ignore=['__pycache__', 'temp', '__init__.py'], regex=True) None[source]

Prints out the directory hierarchy.

Parameters:
  • cur_path (str, optional) – The root node of the tree, i.e. the starting parent directory. Defaults to “.”.

  • directory_only (bool, optional) – True means files will not be listed, only folders. Defaults to True.

sign_language_translator.utils.validate_path_exists(path: str, overwrite: bool = False) str[source]

Validates the existence of a given file path and optionally creates necessary directories.

This function checks if a file already exists at the specified path. If the file exists and overwrite is set to False, a FileExistsError is raised. If overwrite is set to True, or if the file does not exist, the function returns the absolute path after ensuring that all necessary directories are created.

Parameters:
  • path (str) – The file path to be validated.

  • overwrite (bool, optional) – Whether to overwrite the file if it already exists. Defaults to False.

Raises:

FileExistsError – If the file already exists at the specified path and overwrite is set to False.

Returns:

The absolute path of the validated file.

Return type:

str

Examples

>>> validate_path_exists('/path/to/file.txt', overwrite=False)
'/absolute/path/to/file.txt'
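
The behaviour described above can be sketched with os.path and os.makedirs. This is an approximation for illustration, not the package's implementation; the name validate_path_sketch is hypothetical.

```python
import os
import tempfile


def validate_path_sketch(path: str, overwrite: bool = False) -> str:
    """Hypothetical helper: refuse to clobber a file, else prepare its parent folders."""
    path = os.path.abspath(path)
    if os.path.exists(path) and not overwrite:
        raise FileExistsError(f"'{path}' already exists. Use overwrite=True to replace it.")
    os.makedirs(os.path.dirname(path), exist_ok=True)
    return path


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "nested", "dir", "file.txt")
    resolved = validate_path_sketch(target)
    parent_created = os.path.isdir(os.path.dirname(resolved))

    open(resolved, "w").close()  # the file now exists
    try:
        validate_path_sketch(target)
        raised = False
    except FileExistsError:
        raised = True
    allowed_with_overwrite = validate_path_sketch(target, overwrite=True) == resolved

print(parent_created, raised, allowed_with_overwrite)
```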