sign_language_translator.vision.video.video_iterators module

This module provides a flexible and unified interface for working with video frames.

This module defines an abstract base class and several implementations for efficient video frame retrieval. The implementations cover frame access, seeking, and caching for different kinds of sources.

Classes:
  • VideoFrames: An abstract base class for video frame retrieval.

  • VideoCaptureFrames: A class for efficient video frame retrieval from a video file using OpenCV.

  • SequenceFrames: A class for representing a sequence of video frames.

  • IterableFrames: Represents an iterable video frame source, allowing random access to frames by index or timestamp.

class sign_language_translator.vision.video.video_iterators.IterableFrames(frames: Iterable[ndarray[Any, dtype[uint8]]], total_frames: int, fps: float = 30.0, use_cache=True)

Bases: VideoFrames

Represents a read-once iterable video frame source, allowing random access to frames by index or timestamp.

This class extends the VideoFrames abstract class and is specifically designed to work with iterable sources of video frames such as generators. It maintains an internal cache of frames to efficiently access frames by index or timestamp.

Parameters:
  • frames (Iterable[NDArray[np.uint8]]) – An iterable source of video frames.

  • total_frames (int) – The total number of frames in the video.

  • fps (float, optional) – The frames per second of the video. Defaults to 30.0.

  • use_cache (bool, optional) – Whether to store the frames when they have been read from the iterable. Defaults to True.
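
The read-once behaviour described above can be sketched roughly as follows. This is an illustrative stand-in, not the library's implementation: the class name CachedIterable and its get method are hypothetical, and frames are represented by plain strings so the snippet runs without NumPy.

```python
from typing import Dict, Iterable, Iterator


class CachedIterable:
    """Hypothetical sketch: index-based access over a read-once iterable."""

    def __init__(self, items: Iterable[str], total: int, use_cache: bool = True):
        self._iterator: Iterator[str] = iter(items)
        self._cache: Dict[int, str] = {}
        self._next_index = 0  # index of the next item the iterator will yield
        self.total = total
        self.use_cache = use_cache

    def get(self, index: int) -> str:
        if not 0 <= index < self.total:
            raise RuntimeError(f"index {index} is out of range")
        if index in self._cache:
            return self._cache[index]
        if index < self._next_index:
            # Already consumed and not cached: a read-once source cannot rewind.
            raise RuntimeError(f"item {index} was consumed and is not cached")
        item = ""
        # Read forward until the requested index has been produced.
        while self._next_index <= index:
            try:
                item = next(self._iterator)
            except StopIteration as exc:
                raise RuntimeError("the iterable ended early") from exc
            if self.use_cache:
                self._cache[self._next_index] = item
            self._next_index += 1
        return item


source = CachedIterable((f"frame-{i}" for i in range(5)), total=5)
later = source.get(3)    # consumes items 0..3 from the generator
earlier = source.get(1)  # served from the cache, no rewinding needed
```

With use_cache=False the second call in this sketch would raise, because a generator cannot be rewound; that is the trade-off the use_cache flag controls here.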

frames_iterable

An iterator over the provided frames.

Type:

iter

frames_cache

A cache to store frames for efficient retrieval.

Type:

dict

fps

The frames per second of the video.

Type:

float

total_frames

The total number of frames in the video.

Type:

int

get_frame(timestamp: float = None, index: int = None) -> NDArray[np.uint8]

Retrieve a video frame by specifying either a timestamp or an index.

close()

Close the video frame source, resetting the frames_iterable and clearing the frames_cache.

property duration: float

Total time that the frames would take to play in sequence. Depends on fps.

get_frame(timestamp: float | None = None, index: int | None = None) -> ndarray[Any, dtype[uint8]]

Retrieve a video frame by specifying either a timestamp or an index.

Parameters:
  • timestamp (float | None, optional) – The timestamp (in seconds) of the desired frame. If provided, it will be used to locate the frame in the video. Defaults to None.

  • index (int | None, optional) – The index of the desired frame. If provided, it will be used to locate the frame in the video. Defaults to None.

Returns:

The video frame as a NumPy array of unsigned 8-bit integers.

Return type:

NDArray[np.uint8]

Raises:

RuntimeError – If the specified timestamp or index is out of range or if there is an error reading the frame from the frames_iterable.

Note

You can retrieve frames by either timestamp or index. The timestamp seeks to a specific point in time, while the index selects a frame by its position in the sequence.
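
As a rough illustration of how the two addressing modes relate, a timestamp can be mapped to a frame index through fps. The helper timestamp_to_index below is hypothetical, and its rounding rule (truncating to the frame that is on screen at that time) is an assumption; the library may round differently.

```python
def timestamp_to_index(timestamp: float, fps: float, total_frames: int) -> int:
    """Hypothetical helper: map a timestamp in seconds to a frame index."""
    duration = total_frames / fps  # playback time of the whole sequence in seconds
    if not 0 <= timestamp < duration:
        raise RuntimeError(f"timestamp {timestamp} is outside [0, {duration})")
    # Assumption: pick the frame that is on screen at `timestamp`.
    return int(timestamp * fps)


index = timestamp_to_index(2.5, fps=30.0, total_frames=300)
```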

property height: int

Number of pixels vertically present in the video frame.

property n_channels: int

Number of color channels in the video frames.

property width: int

Number of pixels horizontally present in the video frame.

class sign_language_translator.vision.video.video_iterators.SequenceFrames(frames: Sequence[ndarray[Any, dtype[uint8]]], fps: float = 30.0)

Bases: VideoFrames

A class for representing a sequence of video frames.

This class extends the VideoFrames abstract class to work with a predefined sequence of frames, allowing easy access to individual frames within the sequence.

Parameters:
  • frames (Sequence[NDArray[np.uint8]]) – A sequence of video frames, where each frame is represented as a NumPy array with data type np.uint8.

  • fps (float, optional) – The frames per second (FPS) of the video. Defaults to 30.0.

frames

The sequence of video frames.

Type:

Sequence[NDArray[np.uint8]]

fps

The frames per second (FPS) of the video. Defaults to 30.0.

Type:

float

Note

The SequenceFrames class inherits from the VideoFrames class.

close()

Close the SequenceFrames instance by clearing the frames.

This method releases resources associated with the frames by clearing the frames list. After calling this method, the frames will no longer be available.

property duration: float

Total time that the frames would take to play in sequence. Depends on fps.

get_frame(timestamp: float | None = None, index: int | None = None) -> ndarray[Any, dtype[uint8]]

Retrieve a video frame based on the specified timestamp or index.

Parameters:
  • timestamp (float | None, optional) – The timestamp in seconds at which to retrieve the frame. If not provided, index is used.

  • index (int | None, optional) – The index of the frame to retrieve. If not provided, timestamp is used.

Returns:

The video frame as a NumPy array with data type np.uint8.

Return type:

NDArray[np.uint8]

property height: int

Number of pixels vertically present in the video frame.

property n_channels: int

Number of color channels in the video frames.

property total_frames: int

Total number of frames in the sequence (its length along dimension 0).

property width: int

Number of pixels horizontally present in the video frame.
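
For orientation, the size properties correspond to the axes of the stored frames. The sketch below assumes the usual OpenCV/NumPy layout of (height, width, channels) and uses nested lists instead of arrays so it runs without dependencies; the helper frame_shape is hypothetical.

```python
from typing import List, Tuple

Frame = List[List[List[int]]]  # rows -> pixels -> channel values (0..255)


def frame_shape(frame: Frame) -> Tuple[int, int, int]:
    """Return (height, width, n_channels) of a nested-list frame."""
    height = len(frame)            # number of rows
    width = len(frame[0])          # pixels per row
    n_channels = len(frame[0][0])  # values per pixel
    return height, width, n_channels


# A black 2x3 frame with 3 color channels, repeated to form a 4-frame sequence.
frame = [[[0, 0, 0] for _ in range(3)] for _ in range(2)]
frames = [frame] * 4

height, width, n_channels = frame_shape(frames[0])
total_frames = len(frames)  # length along dimension 0
```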

class sign_language_translator.vision.video.video_iterators.VideoCaptureFrames(path: str, use_cache=False, cache_len=256, **kwargs)

Bases: VideoFrames

A class for efficient video frame retrieval from a video file using OpenCV.

This class extends the functionality of the VideoFrames abstract class to provide features for video frame access, seeking, and caching.

Parameters:
  • path (str) – The path to the video file.

  • use_cache (bool, optional) – Enable or disable frame caching. Default is False.

  • cache_len (int, optional) – Maximum number of frames to cache if use_cache is enabled. Default is 256.

  • **kwargs – Additional keyword arguments to pass to the base VideoFrames class.

path

The path to the video file.

Type:

str

fps

Frames per second of the video.

Type:

float

total_frames

Total number of frames in the video.

Type:

int

_width

Width of video frames.

Type:

int

_height

Height of video frames.

Type:

int

fourcc

FourCC code representing the video codec.

Type:

int

duration

Duration of the video in seconds.

Type:

float

_frames_cache

A dictionary for frame caching.

Type:

dict

use_cache

True if frame caching is enabled, False otherwise.

Type:

bool

_max_cache_len

Maximum number of frames to cache.

Type:

int

_n_channels

Number of color channels in the video frames.

Type:

int

get_frame(timestamp: float = None, index: int = None) -> NDArray[np.uint8]

Retrieve a video frame based on either a timestamp or an index.

current_index: int

The index of the video frame currently being read. This is a property, not a method.

seek(timestamp: float = None, index: int = None)

Move the video frame position to the specified timestamp or index.

read_frame() -> NDArray[np.uint8] | None

Read and return the next frame from the video.

close()

Close the video capture and release associated resources.

Notes

  • Frame caching can improve performance by storing previously accessed frames in memory.

  • The seek method employs efficient seeking techniques based on time and frame index.

  • When finished, remember to call the close method to release video resources.
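
The caching note above can be illustrated with a size-bounded frame cache. This sketch is an assumption about how such a cache might behave, not the library's code: the class BoundedFrameCache is hypothetical, and evicting the oldest entry first is only one possible policy.

```python
from collections import OrderedDict
from typing import Optional


class BoundedFrameCache:
    """Hypothetical cache holding at most `max_len` frames, oldest evicted first."""

    def __init__(self, max_len: int = 256):
        self.max_len = max_len
        self._frames: "OrderedDict[int, bytes]" = OrderedDict()

    def put(self, index: int, frame: bytes) -> None:
        self._frames[index] = frame
        # Drop the oldest entries once the limit is exceeded.
        while len(self._frames) > self.max_len:
            self._frames.popitem(last=False)

    def get(self, index: int) -> Optional[bytes]:
        # Returns None on a cache miss, so the caller knows to decode the frame.
        return self._frames.get(index)

    def clear(self) -> None:
        self._frames.clear()


cache = BoundedFrameCache(max_len=2)
for i in range(3):
    cache.put(i, bytes([i]))  # stand-in for real pixel data
```

After the loop only frames 1 and 2 remain in this sketch, which shows how a cache_len-style limit keeps memory use bounded at the cost of re-reading evicted frames.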

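Example:

from sign_language_translator.vision.video.video_iterators import VideoCaptureFrames

# Open the file with frame caching enabled.
video = VideoCaptureFrames("video.mp4", use_cache=True)
# Random access: fetch the frame at the 10-second mark.
frame = video.get_frame(timestamp=10.0)
# Move the read position to frame 100, then read the next frame from there.
video.seek(index=100)
frame = video.read_frame()
# Release the underlying video capture.
video.close()

close()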

Release the video capture resource and clear the frame cache.
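
Because close must be called to release the capture, it can help to tie it to a with block. The sketch below uses the standard library's contextlib.closing, which calls close() when the block exits. FakeVideo is a hypothetical stand-in so the snippet runs without a video file; whether VideoCaptureFrames supports the with statement directly is not stated in this reference.

```python
from contextlib import closing


class FakeVideo:
    """Hypothetical stand-in for an object that exposes close()."""

    def __init__(self) -> None:
        self.closed = False

    def get_frame(self, index: int) -> str:
        if self.closed:
            raise RuntimeError("video is closed")
        return f"frame-{index}"

    def close(self) -> None:
        self.closed = True


video = FakeVideo()
with closing(video) as v:
    frame = v.get_frame(index=0)
# close() has been called here, even if the body had raised an exception.
```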

property current_index: int

Where the VideoCapture is currently pointing to.

property duration: float

Duration of the video in seconds.

get_frame(timestamp: float | None = None, index: int | None = None) -> ndarray[Any, dtype[uint8]]

Retrieve a video frame at a specified timestamp or index.

Parameters:
  • timestamp (float | None) – The timestamp in seconds.

  • index (int | None) – The frame index.

Returns:

The video frame as a NumPy array.

Return type:

NDArray[np.uint8]

Raises:

RuntimeError – If frame retrieval fails.

property height: int

Number of pixels vertically present in the video frame.

property n_channels: int

Number of color channels in the video frames.

read_frame() -> ndarray[Any, dtype[uint8]] | None

Read the next frame from the video.

Returns:

The next video frame as a NumPy array, or None if no more frames are available.

Return type:

NDArray[np.uint8] | None

seek(timestamp: float | None = None, index: int | None = None)

Seek to a specified timestamp or frame index.

Parameters:
  • timestamp (float | None) – The timestamp in seconds.

  • index (int | None) – The frame index.

Returns:

None
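
Taken together, seek and read_frame suggest a common pattern: position the reader, then read sequentially until None is returned. The following sketch shows that loop with a hypothetical FakeReader in place of a real video. It assumes that the first read_frame call after seek(index=n) returns frame n, which this reference does not spell out.

```python
from typing import List, Optional


class FakeReader:
    """Hypothetical stand-in that mimics the documented seek/read_frame contract."""

    def __init__(self, frames: List[str]) -> None:
        self._frames = frames
        self._position = 0

    def seek(self, index: int) -> None:
        # Assumption: seeking sets the index of the next frame to be read.
        self._position = index

    def read_frame(self) -> Optional[str]:
        if self._position >= len(self._frames):
            return None  # no more frames
        frame = self._frames[self._position]
        self._position += 1
        return frame


reader = FakeReader(["a", "b", "c", "d"])
reader.seek(index=1)

collected = []
while True:
    frame = reader.read_frame()
    if frame is None:
        break
    collected.append(frame)
```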

property width: int

Number of pixels horizontally present in the video frame.

class sign_language_translator.vision.video.video_iterators.VideoFrames

Bases: ABC

Abstract Base Class for Video Frames

VideoFrames is an abstract base class that defines a common interface for video frame retrieval. Subclasses of VideoFrames are expected to implement methods for accessing video frames, releasing resources, and providing information about the video.

Methods:
  • get_frame(timestamp: float = None, index: int = None) -> NDArray[np.uint8]:

    Get a frame at a given timestamp or index from the video object.

  • close():

    Release the resources occupied by the object.

  • __len__() -> int:

    Return the number of frames in the video object.

Properties:
  • height: int

    Number of pixels vertically present in the video frame.

  • width: int

    Number of pixels horizontally present in the video frame.

  • n_channels: int

    Number of color channels in the video frames.

abstract close()

Release the resources occupied by the object.

abstract get_frame(timestamp: float | None = None, index: int | None = None) -> ndarray[Any, dtype[uint8]]

Get a frame at a given timestamp or index from the video object.

abstract property height: int

Number of pixels vertically present in the video frame.

abstract property n_channels: int

Number of color channels in the video frames.

abstract property width: int

Number of pixels horizontally present in the video frame.
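
To make the interface concrete, here is a minimal sketch of the pattern: an abstract base class with the members listed above and one in-memory subclass. Both classes (FramesInterface and ListFrames) are hypothetical re-creations for illustration, not the library's own definitions, and frames are nested lists rather than NumPy arrays.

```python
from abc import ABC, abstractmethod
from typing import List, Optional

Frame = List[List[List[int]]]  # (height, width, channels) as nested lists


class FramesInterface(ABC):
    """Hypothetical mirror of the abstract interface described above."""

    @abstractmethod
    def get_frame(self, timestamp: Optional[float] = None, index: Optional[int] = None) -> Frame: ...

    @abstractmethod
    def close(self) -> None: ...

    @property
    @abstractmethod
    def height(self) -> int: ...

    @property
    @abstractmethod
    def width(self) -> int: ...

    @property
    @abstractmethod
    def n_channels(self) -> int: ...


class ListFrames(FramesInterface):
    """Concrete subclass backed by an in-memory list of frames."""

    def __init__(self, frames: List[Frame], fps: float = 30.0) -> None:
        self.frames, self.fps = frames, fps

    def get_frame(self, timestamp=None, index=None) -> Frame:
        if index is None:
            if timestamp is None:
                raise ValueError("provide a timestamp or an index")
            index = int(timestamp * self.fps)  # assumed rounding rule
        return self.frames[index]

    def close(self) -> None:
        self.frames = []

    def __len__(self) -> int:
        return len(self.frames)

    @property
    def height(self) -> int:
        return len(self.frames[0])

    @property
    def width(self) -> int:
        return len(self.frames[0][0])

    @property
    def n_channels(self) -> int:
        return len(self.frames[0][0][0])


# Three 2x4 frames with 3 channels; frame i is filled with the value i.
video = ListFrames([[[[i, i, i]] * 4] * 2 for i in range(3)])
```

Instantiating FramesInterface directly raises TypeError, which is how an abstract base class forces every subclass to supply the full set of members.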

class sign_language_translator.vision.video.video_iterators.VideoSource(source: VideoFrames, start_index: int = 0, end_index: int | None = None, step_size: int = 1, transformations: List[Callable] | None = None)

Bases: VideoFrames

base_index(relative_index: int) -> int

close()

Release the resources occupied by the object.

get_frame(timestamp: float | None = None, index: int | None = None) -> ndarray[Any, dtype[uint8]]

Get a frame at a given timestamp or index from the video object.

property height: int

Number of pixels vertically present in the video frame.

property n_channels: int

Number of color channels in the video frames.

property step_size: int

property width: int

Number of pixels horizontally present in the video frame.
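
The constructor arguments suggest that VideoSource exposes a strided slice of another VideoFrames object. Under that assumption, which this reference does not confirm, base_index would translate an index within the slice into an index of the underlying source. The function below is a hypothetical sketch of such a mapping, not the library's code.

```python
def base_index_sketch(relative_index: int, start_index: int = 0, step_size: int = 1) -> int:
    """Assumed mapping from a position in the slice to the underlying source."""
    return start_index + relative_index * step_size


# With start_index=10 and step_size=2 the slice covers frames 10, 12, 14, ...
mapped = [base_index_sketch(i, start_index=10, step_size=2) for i in range(3)]
```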