sign_language_translator.vision.landmarks package
Submodules
- sign_language_translator.vision.landmarks.connections module
- sign_language_translator.vision.landmarks.display module
- sign_language_translator.vision.landmarks.landmarks module
  - Landmarks: name(), numpy(), torch(), tolist(), show(), __getitem__(), __iter__(), __next__(), data, n_frames, n_landmarks, n_features, n_coordinates, shape, ndim, animation, connections, concatenate(), copy(), load(), load_asset(), new_animation(), save(), save_animation(), save_frames_grid(), show_frames_grid(), transform()
Module contents
- class sign_language_translator.vision.landmarks.BaseConnections[source][source]
Bases:
ABC
A class containing information about the connections between landmarks generated from various models.
- class sign_language_translator.vision.landmarks.Landmarks(sign: str | ndarray[Any, dtype[_ScalarType_co]] | Tensor | Sequence[ndarray[Any, dtype[_ScalarType_co]]] | Sequence[Tensor] | Sequence[Sequence[Sequence[float | int]]], connections: BaseConnections | str | None = None, **kwargs)[source][source]
Bases:
Sign
A class to represent and manipulate landmarks data. Inherits from the Sign class.
- Parameters:
sign (NDArray | Tensor | str) – It can be provided as a path to a file (csv, npy, pt, pth), a NumPy array, a PyTorch tensor, or a sequence of arrays or tensors or numbers (3D: n_frames, n_landmarks, n_features).
- load(path: str, **kwargs): Class method to load landmarks data from a file and return a new Landmarks object.
- save(path: str, overwrite=False, precision=4, **kwargs): Saves the landmarks data to a file.
- concatenate(objects: Iterable[Landmarks]): Concatenates a sequence of Landmarks objects along the first dimension (time) and returns a new Landmarks object.
- transform(transformation: Callable): Applies a transformation function to the landmarks data.
- data[source]
The landmarks data as a NumPy array or PyTorch tensor depending upon what it was initialized with.
- property animation: FuncAnimation[source]
Visualization of the landmarks on a 3D graph.
Note
For interactive display in a Jupyter notebook, use the %matplotlib widget magic command and then run a cell with landmarks_obj.animation on its last line.
- static concatenate(objects: Iterable[Landmarks]) Landmarks[source][source]
Concatenates a sequence of Landmarks objects along the time dimension (dim=0) and returns a new Landmarks object.
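As a rough illustration of what concatenating along the time dimension means, here is a minimal sketch on plain nested lists. The helper name `concatenate_frames` is hypothetical and is not part of the package; the real method works on Landmarks objects and may validate its input differently.

```python
from typing import Iterable, List

Frame = List[List[float]]  # one frame: n_landmarks rows of n_features numbers


def concatenate_frames(sequences: Iterable[List[Frame]]) -> List[Frame]:
    """Join several (n_frames, n_landmarks, n_features) sequences along time."""
    joined: List[Frame] = []
    expected = None  # (n_landmarks, n_features) of the first frame seen
    for sequence in sequences:
        for frame in sequence:
            frame_shape = (len(frame), len(frame[0]) if frame else 0)
            if expected is None:
                expected = frame_shape
            elif frame_shape != expected:
                raise ValueError(f"frame shape {frame_shape} != {expected}")
            # Copy each landmark row so the result does not alias the inputs.
            joined.append([list(landmark) for landmark in frame])
    return joined


clip_a = [[[0.0, 0.1, 0.2], [1.0, 1.1, 1.2]]]      # 1 frame, 2 landmarks
clip_b = [[[2.0, 2.1, 2.2], [3.0, 3.1, 3.2]]] * 2  # 2 frames, 2 landmarks
combined = concatenate_frames([clip_a, clip_b])
print(len(combined), len(combined[0]), len(combined[0][0]))  # 3 2 3
```

Only the first dimension grows; the number of landmarks and features per frame has to agree across all inputs.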
- property connections: BaseConnections[source]
Object defining the order in which landmarks are connected during display and other properties depending on the model used to extract the landmarks.
- Raises:
ValueError – If this property is accessed before landmarks connections have been defined.
- copy() Landmarks[source][source]
Creates a deep copy of the Landmarks object.
- Returns:
A new Landmarks object with the same data and connections.
- Return type:
Landmarks
- property data: ndarray[Any, dtype[_ScalarType_co]] | Tensor[source]
The landmarks data which is a 3D array or tensor of shape (n_frames, n_landmarks, n_features).
- classmethod load(path: str, **kwargs) Landmarks[source][source]
Class method to load landmarks data from a file and return a new Landmarks object. The supported file extensions are .npy and .pt, which must contain 3D arrays (n_frames, n_landmarks, n_features), and .csv, which must have n_frames rows and n_landmarks * n_features columns.
The header row in .csv is optional if the filename contains the name of a supported embedding model (see load_asset function for example models). The columns in the .csv are expected to be in the format: [<axis-letter><landmark-number>,…] (e.g. x0, y0, z0, x1, y1, z1, …, xn, yn, zn). Possible axis-letters: x, y, z, a-w, aa-zz, … (only the first 3 are required to be in that order).
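The CSV layout described above can be illustrated with a small standard-library parser. This is only a sketch under stated assumptions: the function name `parse_landmarks_csv` is hypothetical, a header row is assumed to be present, and the number of features per landmark is inferred from the columns numbered 0. It does not reproduce the package's own loader.

```python
import csv
import io
import re
from typing import List

COLUMN = re.compile(r"^([a-z]+)(\d+)$")  # <axis-letter(s)><landmark-number>


def parse_landmarks_csv(text: str) -> List[List[List[float]]]:
    """Turn CSV text into nested lists of shape (n_frames, n_landmarks, n_features)."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    matches = [COLUMN.match(name.strip()) for name in header]
    if not all(matches):
        raise ValueError("header must look like x0, y0, z0, x1, y1, z1, ...")
    # Columns that belong to landmark 0 tell us how many features each landmark has.
    n_features = sum(1 for m in matches if int(m.group(2)) == 0)
    if n_features == 0 or len(header) % n_features:
        raise ValueError("column count is not a multiple of the feature count")
    frames = []
    for row in reader:  # one row per frame
        values = [float(v) for v in row]
        frames.append(
            [values[i:i + n_features] for i in range(0, len(values), n_features)]
        )
    return frames


sample = "x0,y0,z0,x1,y1,z1\n0,1,2,3,4,5\n6,7,8,9,10,11\n"
frames = parse_landmarks_csv(sample)
print(len(frames), len(frames[0]), len(frames[0][0]))  # 2 2 3
```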
- Parameters:
path (str) – The file path to load the data from.
- Returns:
A new Landmarks object containing the loaded data.
- Return type:
Landmarks
- classmethod load_asset(label: str, archive_name: str | None = None, overwrite=False, progress_bar=True, leave=True, **kwargs) Landmarks[source][source]
Class method to load a landmarks file from a one-time-auto-downloaded dataset archive and return a new Landmarks object.
- Parameters:
label (str) – The filename of the landmarks asset to load. ‘landmarks/’ is prepended to the label if it does not start with it. An example is ‘landmarks/pk-hfad-1_airport.landmarks-mediapipe.csv’ for the embedding of a dictionary video. General syntax is landmarks/country-organization-number_text[_person_camera].landmarks-model.extension.
archive_name (Optional[str], optional) – The name of the archive which contains the landmarks asset. If None, the archive name is inferred from the label. An example is datasets/pk-hfad-1.landmarks-mediapipe-csv.zip. General syntax is datasets/country-organization-number[_person_camera].landmarks-model-extension.zip. Defaults to None.
overwrite (bool, optional) – Whether to overwrite the landmarks asset if it is already extracted. Defaults to False.
progress_bar (bool, optional) – Whether to display a progress bar while downloading the archive or extracting the asset. Defaults to True.
leave (bool, optional) – Whether to leave the progress bar after the operation is complete. Defaults to True.
**kwargs – Additional keyword arguments to be passed to the Landmarks constructor.
- Raises:
FileNotFoundError – If no landmarks assets are found for the given label.
- Warns:
UserWarning – If multiple landmarks assets match the given label, in which case only the first asset is used.
- Returns:
An instance of the Landmarks class representing the dataset video embedding that matched the label.
- Return type:
Landmarks
Example
import sign_language_translator as slt

# Load a dictionary video's landmark embedding asset
landmarks = slt.Landmarks.load_asset("pk-hfad-1_airplane.landmarks-mediapipe.csv")

# Load a replication video's landmarks from the built-in datasets
landmarks = slt.Landmarks.load_asset(
    "landmarks/pk-hfad-1_airplane_dm0001_front.landmarks-mediapipe.csv",
    archive_name="datasets/pk-hfad-1_dm0001_front.landmarks-mediapipe-csv.zip",
)
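The relationship between a label and its archive name follows from the two general syntaxes given in the parameter descriptions. The sketch below is one guess at how such an inference could be written with a regular expression. The function `guess_archive_name` and the pattern are hypothetical, not the package's implementation, and the pattern assumes that the person identifier is letters followed by digits (as in dm0001), which the documentation does not state.

```python
import re
from typing import Optional

# Assumed pattern for: country-organization-number_text[_person_camera].landmarks-model.extension
LABEL = re.compile(
    r"^(?:landmarks/)?"
    r"(?P<dataset>[a-z]+-[a-z]+-\d+)"                  # country-organization-number
    r"_(?P<text>.+?)"                                  # text shown in the video
    r"(?:_(?P<person>[a-z]+\d+)_(?P<camera>[a-z]+))?"  # optional _person_camera
    r"\.landmarks-(?P<model>[^.]+)"                    # embedding model name
    r"\.(?P<extension>\w+)$"
)


def guess_archive_name(label: str) -> Optional[str]:
    """Build datasets/<dataset>[_person_camera].landmarks-<model>-<ext>.zip from a label."""
    match = LABEL.match(label)
    if match is None:
        return None
    parts = match.groupdict()
    recording = f"_{parts['person']}_{parts['camera']}" if parts["person"] else ""
    return (
        f"datasets/{parts['dataset']}{recording}"
        f".landmarks-{parts['model']}-{parts['extension']}.zip"
    )


print(guess_archive_name("landmarks/pk-hfad-1_airport.landmarks-mediapipe.csv"))
print(guess_archive_name("landmarks/pk-hfad-1_airplane_dm0001_front.landmarks-mediapipe.csv"))
```

A label whose text itself ends in something shaped like `_ab12_word` would be misread by this pattern, which is one reason to pass archive_name explicitly when in doubt.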
- new_animation(title: str | None = '{frame_number}', style: Literal['dark_background', 'default'] = 'default', azimuth: float = 20, elevation: float = 15, roll: float = 0, azimuth_delta: float = 0, elevation_delta: float = 0, roll_delta: float = 0, scatter_size: float = 2, figure_scale: float | None = 5, interval: float | int = 37, repeat_delay: float | int = 200, blit: bool = True) FuncAnimation[source][source]
Creates a new 3D animation object of the landmarks.
- Parameters:
title (Optional[str]) – The title of the animation. Can include the placeholder “{frame_number}” to display the frame number. Defaults to “{frame_number}”.
style (Literal["dark_background", "default"]) – The color theme of the animation. Defaults to “default”.
azimuth (float) – The azimuth angle (rotation around the vertical axis) of the camera view point. Defaults to 20.
elevation (float) – The elevation angle (amount of rise from the horizontal plane) of the camera view point. Defaults to 15.
roll (float) – The roll angle (rotation around the line of sight) of the camera view point. Defaults to 0.
azimuth_delta (float) – The change in azimuth angle per frame. Defaults to 0.
elevation_delta (float) – The change in elevation angle per frame. Defaults to 0.
roll_delta (float) – The change in roll angle per frame. Defaults to 0.
scatter_size (float) – The size of the scatter points. Defaults to 2.
figure_scale (Optional[float]) – The size of the figure. Defaults to 5.
interval (Union[float, int]) – The interval between frames in milliseconds. Defaults to 37.
repeat_delay (Union[float, int]) – The delay between animation replays in milliseconds. Defaults to 200.
blit (bool) – Whether to use blitting for faster updates (non-changing graphic elements are rendered once into a background image). Defaults to True.
- Returns:
The created animation.
- Return type:
FuncAnimation
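To make the per-frame parameters concrete, the sketch below computes the camera angles and title that frame number i would get, assuming the deltas accumulate linearly from the initial angles. The helpers `view_for_frame` and `title_for_frame` are hypothetical names used only for illustration; the actual drawing is done by matplotlib inside the library.

```python
from typing import Optional, Tuple


def view_for_frame(
    frame_number: int,
    azimuth: float = 20.0,
    elevation: float = 15.0,
    roll: float = 0.0,
    azimuth_delta: float = 0.0,
    elevation_delta: float = 0.0,
    roll_delta: float = 0.0,
) -> Tuple[float, float, float]:
    """Camera angles in degrees, assuming each delta is added once per frame."""
    return (
        azimuth + frame_number * azimuth_delta,
        elevation + frame_number * elevation_delta,
        roll + frame_number * roll_delta,
    )


def title_for_frame(title: Optional[str], frame_number: int) -> str:
    """Fill the {frame_number} placeholder; None means no title."""
    return "" if title is None else title.format(frame_number=frame_number)


interval_ms = 37
print(round(1000 / interval_ms, 1))           # about 27 frames per second
print(view_for_frame(10, azimuth_delta=1.5))  # (35.0, 15.0, 0.0)
print(title_for_frame("frame {frame_number}", 3))
```

With the default interval of 37 ms the animation plays at roughly 27 frames per second, and a non-zero azimuth_delta makes the camera orbit the signer slowly as playback advances.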
- numpy(*args, **kwargs) ndarray[Any, dtype[_ScalarType_co]][source][source]
Returns the landmarks data as a numpy array. Additional arguments are passed to the numpy.array constructor.
- Returns:
The sign data as a NumPy array.
- Return type:
NDArray
Example:
import sign_language_translator as slt

landmarks = slt.Landmarks([[[0,1,2], [1,2,3]]])
landmarks.numpy()
# array([[[0, 1, 2], [1, 2, 3]]])
- save(path: str, overwrite=False, precision=4, **kwargs) None[source][source]
Saves the current object’s data to a file. Supported formats include .npy, .pt/.pth (which contain 3D data) and .csv which flattens each frame and puts it into a separate row. CSV files also contain a header with letters representing the coordinate axes and numbers identifying the landmark.
- Parameters:
path (str) – The file path to save the data to.
overwrite (bool, optional) – Whether to overwrite the file if it already exists. Defaults to False.
precision (int, optional) – The number of decimal places for saving floating-point values in CSV. Defaults to 4.
- Raises:
FileExistsError – If the file already exists and overwrite is False.
ValueError – If the file format is not supported.
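The overwrite and precision semantics for CSV output can be sketched with the standard library alone. The function `save_csv_sketch` below is a hypothetical stand-in, not the package's save method: it assumes at most three features per landmark (x, y, z), handles only the .csv extension, and rounds with Python's built-in round.

```python
import csv
import os
from typing import List


def save_csv_sketch(
    path: str,
    frames: List[List[List[float]]],
    overwrite: bool = False,
    precision: int = 4,
) -> None:
    """Write one flattened frame per row under a header like x0, y0, z0, x1, ..."""
    if not path.endswith(".csv"):
        raise ValueError(f"unsupported file format: {path}")
    if os.path.exists(path) and not overwrite:
        raise FileExistsError(path)
    n_landmarks = len(frames[0])
    n_features = len(frames[0][0])
    axes = "xyz"[:n_features]  # assumption: no more than 3 features per landmark
    header = [f"{axis}{i}" for i in range(n_landmarks) for axis in axes]
    with open(path, "w", newline="") as file:
        writer = csv.writer(file)
        writer.writerow(header)
        for frame in frames:
            # Flatten (n_landmarks, n_features) into a single row of rounded values.
            writer.writerow(
                [round(value, precision) for landmark in frame for value in landmark]
            )
```

Calling it twice on the same path raises FileExistsError unless overwrite=True is passed, which mirrors the behaviour documented in the Raises section.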
- save_animation(path, overwrite=True, writer: str | None = None, **kwargs) None[source][source]
Save the video animation of the landmarks data to a file.
- Parameters:
path (str) – The path to save the animation file.
overwrite (bool, optional) – Whether to overwrite the file if it already exists. Defaults to True.
writer (Optional[str], optional) – The name of the matplotlib writer to use for saving the animation. Defaults to None.
**kwargs – Additional keyword arguments to be passed to the new_animation method.
- save_frames_grid(path: str, rows: int = 3, columns: int = 5, overwrite=True, **kwargs) None[source][source]
Save an image file of a grid of 3D visualizations of the landmarks data.
- Parameters:
path (str) – The path to save the image.
rows (int, optional) – The number of rows in the grid. Defaults to 3.
columns (int, optional) – The number of columns in the grid. Defaults to 5.
overwrite (bool, optional) – Whether to overwrite the file if it already exists. Defaults to True.
**kwargs – Additional keyword arguments to customize the grid passed to the slt.vision.landmarks.MatPlot3D.frames_grid function.
- property shape: Tuple[int, ...][source]
The number of elements in each of the data array’s dimensions, e.g. (n_frames, n_landmarks, n_features).
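For data supplied as nested sequences, the three dimensions can be read off as in the sketch below. The helper `nested_shape` is hypothetical and only illustrates the (n_frames, n_landmarks, n_features) convention; it is not how the class computes its shape property.

```python
from typing import Sequence, Tuple


def nested_shape(data: Sequence[Sequence[Sequence[float]]]) -> Tuple[int, int, int]:
    """Return (n_frames, n_landmarks, n_features) and reject ragged input."""
    n_frames = len(data)
    n_landmarks = len(data[0]) if n_frames else 0
    n_features = len(data[0][0]) if n_landmarks else 0
    for frame in data:
        if len(frame) != n_landmarks:
            raise ValueError("frames have different numbers of landmarks")
        if any(len(landmark) != n_features for landmark in frame):
            raise ValueError("landmarks have different numbers of features")
    return (n_frames, n_landmarks, n_features)


print(nested_shape([[[0, 1, 2], [1, 2, 3]]]))  # (1, 2, 3)
```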
- show(player: Literal['jshtml', 'html5'] = 'jshtml', **kwargs) None[source][source]
Displays the landmarks data as a 3D animation in a Jupyter notebook or as a video in a separate window if run from the terminal.
- Parameters:
player (Literal['jshtml', 'html5'], optional) – The visualization tool to use for displaying the animation. Defaults to “jshtml”.
**kwargs – Additional keyword arguments to pass to the new_animation method. See its docstring for details.
- show_frames_grid(rows=3, columns=5, **kwargs)[source][source]
Displays a grid of frames equally spaced in time drawn as 3D scatter plots & lines connecting the points.
- Parameters:
rows (int) – The number of rows in the grid. Default is 3.
columns (int) – The number of columns in the grid. Default is 5.
**kwargs – Additional keyword arguments to be passed to the slt.vision.landmarks.MatPlot3D.frames_grid function.
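One plausible way to pick frames that are equally spaced in time for a rows × columns grid is shown below. The helper `grid_frame_indexes` and its rounding rule are assumptions made for illustration; the library may select frames differently.

```python
from typing import List


def grid_frame_indexes(n_frames: int, rows: int = 3, columns: int = 5) -> List[int]:
    """Indexes of frames spread evenly from the first frame to the last."""
    cells = min(rows * columns, n_frames)  # never ask for more frames than exist
    if cells <= 0:
        return []
    if cells == 1:
        return [0]
    step = (n_frames - 1) / (cells - 1)
    return [round(i * step) for i in range(cells)]


print(grid_frame_indexes(29))  # every second frame, from 0 up to 28
```

With the default 3 × 5 grid, a 29-frame clip contributes every second frame, so the first and last frames are always included.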
- tolist() List[List[List[float | int]]][source][source]
Returns the landmarks data as a 3D nested list of numbers.
- Returns:
The sign data as a nested list.
- Return type:
List[List[List[Union[float, int]]]]
- torch(dtype: dtype | None = None, device: device | str | None = None) Tensor[source][source]
Returns the landmarks data as a PyTorch tensor.
- Parameters:
dtype (torch.dtype, optional) – The desired data type of the tensor. Defaults to None.
device (Union[torch.device, str], optional) – The desired device for the tensor. Defaults to None.
- Returns:
The sign data as a PyTorch tensor.
- Return type:
torch.Tensor
- class sign_language_translator.vision.landmarks.MatPlot3D[source][source]
Bases:
object
- classmethod animate(frames: Sequence[Sequence[Tuple[float, float, float]]] | ndarray[Any, dtype[_ScalarType_co]], line_indexes: Sequence[Sequence[int]] | None = None, line_colors: Sequence[Tuple[float, float, float] | None] = (), line_labels: Sequence[str | None] = (), scatter_color: Tuple[float, float, float] = (0, 0, 0), scatter_size: float = 2, title: str | None = '{frame_number}', vertical_axis: Literal['x', 'y', 'z'] = 'z', ticks_scale: float | None = None, azimuth: float = 20, elevation: float = 15, roll: float = 0, azimuth_delta: float = 0, elevation_delta: float = 0, roll_delta: float = 0, invert_x: bool = False, invert_y: bool = False, invert_z: bool = False, show_grid: bool = True, show_axis: bool = True, figure_scale: float | None = None, style: Literal['dark_background', 'default'] = 'default', layout: Literal['constrained', 'compressed', 'tight', 'none'] = 'none', interval: float | int = 37, repeat_delay: float | int = 100, blit: bool = True) FuncAnimation[source][source]
Animates the given frames representing 3D coordinates with 3D scatter plot and lines connecting those points.
- Parameters:
frames (Union[Sequence[Sequence[Tuple[float, float, float]]], NDArray]) – The frames to animate, represented as a sequence of collections of 3D coordinates.
line_indexes (Optional[Sequence[Sequence[int]]]) – The indexes of the points in a frame to connect in lines. If not provided, connects the points in a cycle [0, 1, 2, …, n-1, 0].
line_colors (Sequence[Union[Tuple[float, float, float], None]]) – The colors of the lines in RGB format normalized to [0.0, 1.0] range. If not provided, defaults to a gradient of blue to pink to blue.
line_labels (Sequence[Union[str, None]]) – The labels of the lines.
scatter_color (Tuple[float, float, float]) – The color of the scatter points in RGB format normalized to [0.0, 1.0] range. Default is black.
title (Optional[str]) – The title of the animation. Can include the placeholder “{frame_number}” to display the frame number. Defaults to “{frame_number}”.
vertical_axis (Literal["x", "y", "z"]) – The vertical axis in the plot. Default is “z”.
ticks_scale (Optional[float]) – The scale of the ticks. Defaults to the nearest power of 10 under the range in data.
layout (Literal["constrained", "compressed", "tight", "none"]) – The layout of the plot. Default is “none”.
interval (Union[float, int]) – The interval between frames in milliseconds. Default is 37.
repeat_delay (Union[float, int]) – The delay between replays in milliseconds. Default is 100.
blit (bool) – Whether to use blitting for faster updates. Default is True.
- Returns:
The animation object.
- Return type:
FuncAnimation
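The line_indexes parameter maps each line to the points it passes through, in order. The sketch below shows the default cycle described above and how one line's indexes turn into coordinate lists for plotting. Both helper names are hypothetical and the code is independent of matplotlib.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def default_line_indexes(n_points: int) -> List[List[int]]:
    """A single closed line through all points: [0, 1, ..., n-1, 0]."""
    return [list(range(n_points)) + [0]] if n_points else []


def line_coordinates(
    points: Sequence[Point], indexes: Sequence[int]
) -> Tuple[List[float], List[float], List[float]]:
    """Gather the x, y and z values of the points that one line visits."""
    xs = [points[i][0] for i in indexes]
    ys = [points[i][1] for i in indexes]
    zs = [points[i][2] for i in indexes]
    return xs, ys, zs


triangle = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.5)]
for indexes in default_line_indexes(len(triangle)):
    print(indexes, line_coordinates(triangle, indexes))
```

Passing several index sequences, for example one per finger, draws separate lines instead of a single cycle.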
- classmethod frames_grid(frames: Sequence[Sequence[Tuple[float, float, float]]] | ndarray[Any, dtype[_ScalarType_co]], subplots: Tuple[int, int], line_indexes: Sequence[Sequence[int]] | None = None, line_colors: Sequence[Tuple[float, float, float] | None] = (), line_labels: Sequence[str | None] = (), scatter_color: Tuple[float, float, float] = (0, 0, 0), scatter_size: float = 2, title: str | None = '{frame_number}', figure_title: str | None = None, figure_title_font_size: float = 20, vertical_axis: Literal['x', 'y', 'z'] = 'z', ticks_scale: float | None = None, azimuth: float = 20, elevation: float = 15, roll: float = 0, azimuth_delta: float = 0, elevation_delta: float = 0, roll_delta: float = 0, invert_x: bool = False, invert_y: bool = False, invert_z: bool = False, show_grid: bool = True, show_axis: bool = True, figure_scale: float | None = 4, style: Literal['dark_background', 'default'] = 'default', layout: Literal['constrained', 'compressed', 'tight', 'none'] = 'tight') Figure[source][source]
Generates a grid of frames with 3D scatter plots and lines connecting the points.
- Parameters:
frames (Union[Sequence[Sequence[Tuple[float, float, float]]], NDArray]) – The frames containing the 3D coordinates of the points.
subplots (Tuple[int, int]) – The number of rows and columns in the figure. Each cell is a 3D plot containing one frame.
line_indexes (Optional[Sequence[Sequence[int]]]) – The indexes of points to be connected with lines.
line_colors (Sequence[Union[Tuple[float, float, float], None]]) – The colors of the lines connecting the points. Color should be in RGB format and in range [0.0, 1.0].
line_labels (Sequence[Union[str, None]]) – The labels for the lines connecting the points.
scatter_color (Tuple[float, float, float]) – The color of the scatter points. Color should be in RGB format and in range [0.0, 1.0].
scatter_size (float) – The size of the scatter points.
title (Optional[str]) – The title of each subplot. Can include the placeholder “{frame_number}” to display the frame number.
figure_title (Optional[str]) – The title of the entire figure.
figure_title_font_size (float) – The font size of the figure title.
vertical_axis (Literal["x", "y", "z"]) – The vertical axis in the 3D plots.
azimuth (float) – The azimuth angle (rotation around the vertical axis) of the initial view in the plot. Value must be in degrees.
elevation (float) – The elevation angle (amount of rise from the horizontal plane) of the initial view in the plot. Value must be in degrees.
roll (float) – The roll angle (rotation around the line of sight) of the initial view in the plot. Value must be in degrees.
azimuth_delta (float) – The change in azimuth angle for each subplot. Value must be in degrees.
elevation_delta (float) – The change in elevation angle for each subplot. Value must be in degrees.
roll_delta (float) – The change in roll angle for each subplot. Value must be in degrees.
invert_x (bool) – Whether to invert the x-axis.
invert_y (bool) – Whether to invert the y-axis.
invert_z (bool) – Whether to invert the z-axis.
show_grid (bool) – Whether to show the grid lines on the axes.
show_axis (bool) – Whether to show the axis lines.
figure_scale (Optional[float]) – The size of the entire figure.
style (Literal["dark_background", "default"]) – The color theme of the plot.
layout (Literal["constrained", "compressed", "tight", "none"]) – The spacing between the subplots.
- Returns:
The generated matplotlib figure.
- Return type:
Figure
- static initialize_Axes3D(ax: Axes, x_limits: Tuple[float, float], y_limits: Tuple[float, float], z_limits: Tuple[float, float], ticks_scale: float = 1.0, azimuth: float = 20, elevation: float = 15, roll: float = 0, vertical_axis: str = 'y', invert_x: bool = False, invert_y: bool = False, invert_z: bool = False, show_grid: bool = True, show_axis: bool = True) None[source][source]
Initializes a 3D Axes object with specified limits, ticks, and settings.
- Parameters:
ax (Axes) – The 3D Axes object to be initialized.
x_limits (Tuple[float, float]) – The range of the x-axis from minimum to maximum value.
y_limits (Tuple[float, float]) – The range of the y-axis from minimum to maximum value.
z_limits (Tuple[float, float]) – The range of the z-axis from minimum to maximum value.
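The axis limits and the tick scale can both be derived from the data. The sketch below computes per-axis (minimum, maximum) limits over all frames and a tick scale equal to the largest power of 10 that does not exceed the widest range, which is one reading of the ticks_scale default described for animate. The helpers `axis_limits` and `ticks_scale_for` are hypothetical and the library's own rule may differ.

```python
import math
from typing import Sequence, Tuple

Limits = Tuple[float, float]


def axis_limits(
    frames: Sequence[Sequence[Tuple[float, float, float]]]
) -> Tuple[Limits, Limits, Limits]:
    """(min, max) of x, y and z over every point in every frame."""
    points = [point for frame in frames for point in frame]
    x, y, z = (
        (min(p[axis] for p in points), max(p[axis] for p in points))
        for axis in range(3)
    )
    return x, y, z


def ticks_scale_for(limits: Sequence[Limits]) -> float:
    """Largest power of 10 not exceeding the widest axis range (assumed rule)."""
    span = max(high - low for low, high in limits)
    if span <= 0:
        return 1.0  # degenerate data: fall back to unit ticks
    return 10.0 ** math.floor(math.log10(span))


frames = [[(0.0, 0.0, 0.0), (3.5, 1.0, 0.2)]]
limits = axis_limits(frames)
print(limits, ticks_scale_for(limits))
```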
- static new_figure(x_limits: Tuple[float, float], y_limits: Tuple[float, float], z_limits: Tuple[float, float], vertical_axis: Literal['x', 'y', 'z'] = 'z', figure_scale: float | None = 5, style: Literal['dark_background', 'default'] = 'default', layout: Literal['constrained', 'compressed', 'tight', 'none'] = 'compressed', subplots: Tuple[int, int] = (1, 1)) Tuple[Figure, List[Axes]][source][source]
Creates a new 3D figure with the specified subplots and settings.
- static placeholder_scatter_and_lines(ax: Axes, n_lines: int, line_colors: Sequence[Tuple[float, float, float] | None] = (), line_labels: Sequence[str | None] = (), scatter_color: Tuple[float, float, float] = (0, 0, 0), scatter_size: float = 2) Tuple[Path3DCollection, List[Line3D]][source][source]
Update a 3D plot with empty Path3DCollection (scatter) and Line3D objects.
- Parameters:
ax (Axes) – The 3D axes object to plot on.
n_lines (int) – The number of placeholder lines to create.
line_colors (Sequence[Union[Tuple[float, float, float], None]], optional) – The colors of the lines. If not provided, a gradient of colors will be used.
line_labels (Sequence[Union[str, None]], optional) – The labels for the lines.
scatter_color (Tuple[float, float, float], optional) – The RGB color of the scatter points normalized to [0.0, 1.0] range. Defaults to black.
scatter_size (float, optional) – The size of the scatter points. Defaults to 2.
- Returns:
A tuple containing the scatter plot and the list of lines.
- Return type:
Tuple[Path3DCollection, List[Line3D]]
- static set_frame_data(points: Sequence[Tuple[float, float, float]] | ndarray[Any, dtype[_ScalarType_co]], scatter: Path3DCollection, lines: Sequence[Line3D], line_indexes: Sequence[Sequence[int]] = (), ax: Axes | None = None, azimuth_delta: float = 0, elevation_delta: float = 0, roll_delta: float = 0) List[Path3DCollection | Line3D][source][source]
Sets the frame data for visualization.
- Parameters:
points (Union[Sequence[Tuple[float, float, float]], NDArray]) – A collection of tuples or a 2D NDArray representing the (x, y, z) points.
scatter (Path3DCollection) – Object representing the scatter plot.
lines (Sequence[Line3D]) – A sequence of Line3D objects representing the lines to be plotted.
line_indexes (Sequence[Sequence[int]], optional) – The indexes of points to connect with lines. Defaults to ().
ax (Optional[Axes], optional) – An optional Axes object to update the view. Defaults to None.
- Returns:
A list containing the updated scatter plot and lines objects.
- Return type:
List[Union[Path3DCollection, Line3D]]
- class sign_language_translator.vision.landmarks.MediapipeConnections[source][source]
Bases:
BaseConnections
Represents the connections for the Mediapipe landmark model.
- sign_language_translator.vision.landmarks.get_connections(connections: str) BaseConnections[source][source]
Creates a connections object based on the given string.
- Parameters:
connections (str) – The name of the connections format to use.
- Returns:
The connections object.
- Return type:
BaseConnections
- Raises:
ValueError – If the connections format is not recognized.
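A name-based lookup of this kind is often written as a small registry. The sketch below shows the general pattern, including the ValueError for unknown names. Every name in it (`ConnectionsSketch`, `MediapipeSketch`, `get_connections_sketch`) is hypothetical, and the accepted string "mediapipe" is an assumption; the documentation above does not list the recognized names.

```python
from abc import ABC
from typing import Dict, Type


class ConnectionsSketch(ABC):
    """Stand-in for a base class describing how landmarks are connected."""


class MediapipeSketch(ConnectionsSketch):
    """Stand-in for the connections of one landmark model."""


# Assumed mapping from format name to class; the real recognized names may differ.
_REGISTRY: Dict[str, Type[ConnectionsSketch]] = {"mediapipe": MediapipeSketch}


def get_connections_sketch(name: str) -> ConnectionsSketch:
    """Instantiate the connections class registered under the given name."""
    try:
        connections_class = _REGISTRY[name.lower()]
    except KeyError:
        known = ", ".join(sorted(_REGISTRY))
        raise ValueError(f"unknown connections format {name!r}; known: {known}") from None
    return connections_class()


print(type(get_connections_sketch("mediapipe")).__name__)  # MediapipeSketch
```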