sign_language_translator.utils.archive module

Module for working with archives.

Classes:

Archive: A utility class with static methods for creating, listing, and extracting files from ZIP archives.

class sign_language_translator.utils.archive.Archive[source]

Bases: object

This utility class provides static methods for creating, listing, and extracting files from ZIP archives.

Methods:

  • create(filename_or_patterns: str | List[str], archive_path: str, compression=zipfile.ZIP_DEFLATED, progress_bar=True, overwrite=False)

    Create a ZIP archive from the files that match the given file names or patterns.

  • list(archive_path: str, pattern="*", regex: str | re.Pattern = r".*") -> List[str]

    List the files in a ZIP archive, optionally filtered by a glob pattern or regex.

  • extract(archive_path: str, pattern: str = "*", regex: str | re.Pattern = r".*", output_dir: str = ".", overwrite=False, progress_bar=True, leave=True, password: bytes | None = None, verbose=True) -> List[str]

    Extract files from a ZIP archive to the specified output directory, keeping only the files that match both the glob pattern and the regex.

Example:

from sign_language_translator.utils import Archive

# Create a ZIP archive with files matching a pattern
Archive.create("*.txt", "output_archive.zip", overwrite=True)

# List files in a ZIP archive using a pattern and a regular expression
files = Archive.list("input_archive.zip", pattern="file_*.txt", regex=r"file_\d\.txt")
print(files)

# Extract files from a ZIP archive to a specified directory
extracted_files = Archive.extract("input_archive.zip", pattern="*.txt", output_dir="output_dir", overwrite=True)
print(extracted_files)

Note

  • For file patterns, this class uses glob-style patterns e.g. “*.mp4”.

  • When extracting files, warnings are issued for skipped files with the same base name.

static create(filename_or_patterns: str | List[str], archive_path: str, compression=8, progress_bar=True, overwrite=False)[source]

Create a zip archive from the files that match the given file names or patterns.

Parameters:
  • filename_or_patterns (str | List[str]) – Files or Unix shell-style patterns matching the files to include in the archive.

  • archive_path (str) – Path to the output zip archive.

  • compression (int, optional) – Compression method (default is zipfile.ZIP_DEFLATED).

  • progress_bar (bool, optional) – Show a progress bar during creation (default is True).

  • overwrite (bool, optional) – Overwrite existing archive (default is False).

Raises:

FileExistsError – If the archive_path already exists and overwrite is False.
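The overwrite guard and the pattern expansion described above can be approximated with the standard library alone. The sketch below is not the library's implementation: `create_zip` is a hypothetical stand-in, and storing each file under its base name is an assumption made for the demo. It also shows that the default `compression=8` in the signature is the value of `zipfile.ZIP_DEFLATED`.

```python
import glob
import os
import tempfile
import zipfile


def create_zip(patterns, archive_path, compression=zipfile.ZIP_DEFLATED, overwrite=False):
    """Hypothetical stand-in for Archive.create, built only on the standard library."""
    if isinstance(patterns, str):
        patterns = [patterns]
    # Refuse to clobber an existing archive unless explicitly asked to.
    if os.path.exists(archive_path) and not overwrite:
        raise FileExistsError(f"{archive_path} already exists; pass overwrite=True to replace it.")
    # Expand every pattern, then de-duplicate and sort for a deterministic order.
    files = sorted({path for pattern in patterns for path in glob.glob(pattern)})
    with zipfile.ZipFile(archive_path, "w", compression=compression) as archive:
        for path in files:
            # Assumption: members are stored under their base name.
            archive.write(path, arcname=os.path.basename(path))
    return files


# Demo in a throwaway directory.
workdir = tempfile.mkdtemp()
for name in ("a.txt", "b.txt", "c.csv"):
    with open(os.path.join(workdir, name), "w") as f:
        f.write(name)

archive_path = os.path.join(workdir, "out.zip")
create_zip(os.path.join(workdir, "*.txt"), archive_path)

with zipfile.ZipFile(archive_path) as archive:
    created_names = sorted(archive.namelist())

try:
    # Second call with overwrite=False should be rejected.
    create_zip(os.path.join(workdir, "*.txt"), archive_path)
    second_call_failed = False
except FileExistsError:
    second_call_failed = True

print(created_names, second_call_failed, zipfile.ZIP_DEFLATED)
```

Checking for the existing archive before opening it in write mode matters, because `zipfile.ZipFile(path, "w")` truncates the target as soon as it is opened.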

static extract(archive_path: str, pattern: str = '*', regex: str | Pattern = '.*', output_dir: str = '.', overwrite=False, progress_bar=True, leave=True, password: bytes | None = None, verbose=True) → List[str][source]

Extract specified files from a zip archive. Only those files are extracted that match the regex AND the pattern.

Parameters:
  • archive_path (str) – Path to the zip archive.

  • pattern (str) – Unix shell-style wildcard pattern that specifies the files to extract (default is “*”).

  • regex (str | re.Pattern) – Regular expression pattern that specifies the files to extract (default is “.*”).

  • output_dir (str, optional) – Directory to extract files into (default is “.”).

  • overwrite (bool, optional) – Overwrite existing files during extraction (default is False).

  • progress_bar (bool, optional) – Show a progress bar during extraction (default is True).

  • leave (bool, optional) – Leave progress bar displayed upon completion (default is True).

  • password (bytes, optional) – Password for encrypted archives (default is None).

  • verbose (bool, optional) – Raise warnings for skipped existing files (default is True).

Returns:

List of paths to the matching files: both the newly extracted files and those that already existed in the output directory.

Return type:

List[str]
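A minimal sketch of the documented selection rule (a member must match the glob pattern AND the regex) and of the return value that also reports files skipped because they already exist. `extract_matching` is a hypothetical helper written against the standard library, not the library's code. Anchoring the regex with `re.match` and using case-sensitive glob matching are assumptions, and the progress bar, password, and warning behavior are left out.

```python
import fnmatch
import os
import re
import tempfile
import zipfile


def extract_matching(archive_path, pattern="*", regex=r".*", output_dir=".", overwrite=False):
    """Hypothetical stand-in for Archive.extract (no progress bar, password, or warnings)."""
    compiled = re.compile(regex)
    paths = []
    with zipfile.ZipFile(archive_path) as archive:
        for name in archive.namelist():
            # A member must satisfy BOTH filters to be selected.
            if not (fnmatch.fnmatchcase(name, pattern) and compiled.match(name)):
                continue
            target = os.path.join(output_dir, name)
            if overwrite or not os.path.exists(target):
                archive.extract(name, path=output_dir)
            # Skipped (already existing) files are still reported.
            paths.append(target)
    return paths


# Build a small archive to extract from.
workdir = tempfile.mkdtemp()
archive_path = os.path.join(workdir, "input_archive.zip")
with zipfile.ZipFile(archive_path, "w") as archive:
    for name in ("file_1.txt", "file_22.txt", "notes.txt", "file_3.csv"):
        archive.writestr(name, "new")

output_dir = os.path.join(workdir, "output_dir")
os.makedirs(output_dir)
# A pre-existing file that must survive because overwrite=False.
with open(os.path.join(output_dir, "file_1.txt"), "w") as f:
    f.write("old")

extracted = extract_matching(
    archive_path, pattern="*.txt", regex=r"file_\d+\.txt", output_dir=output_dir
)
extracted_names = sorted(os.path.basename(p) for p in extracted)
with open(os.path.join(output_dir, "file_1.txt")) as f:
    kept_content = f.read()

print(extracted_names, kept_content)
```

Here `notes.txt` passes the glob but fails the regex, so it is not extracted, while `file_1.txt` is returned even though its existing copy on disk is left untouched.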

static list(archive_path: str, pattern: str = '*', regex: str | Pattern = '.*') → List[str][source]

List files in the zip archive filtered by the specified pattern or regex.

Parameters:
  • archive_path (str) – Path to the zip archive.

  • pattern (str) – Unix shell-style wildcard pattern to filter the contents (default is “*”).

  • regex (str | re.Pattern) – Regular expression pattern to filter the contents (default is “.*”).

Returns:

List of file names in the archive that match the criteria.

Return type:

List[str]
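To illustrate how the two filters differ, the sketch below lists members of an in-memory ZIP with a glob pattern alone and then with the glob plus a regex, mirroring the `pattern="file_*.txt"`, `regex=r"file_\d\.txt"` call in the class example. `list_matching` is a hypothetical helper. Applying both filters together and anchoring the regex with `re.match` are assumptions, since the description above does not spell out how the two are combined.

```python
import fnmatch
import io
import re
import zipfile


def list_matching(archive_file, pattern="*", regex=r".*"):
    """Hypothetical stand-in for Archive.list using only the standard library."""
    # re.compile accepts a str and also returns an already compiled pattern unchanged.
    compiled = re.compile(regex)
    with zipfile.ZipFile(archive_file) as archive:
        return [
            name
            for name in archive.namelist()
            if fnmatch.fnmatchcase(name, pattern) and compiled.match(name)
        ]


# Build a small archive in memory so the example leaves no files behind.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    for name in ("file_1.txt", "file_10.txt", "file_a.txt", "clip.mp4"):
        archive.writestr(name, "")

# The glob "*" matches any run of characters, so all three text files pass.
glob_only = list_matching(buffer, pattern="file_*.txt")
# The regex narrows the result to names with exactly one digit.
glob_and_regex = list_matching(buffer, pattern="file_*.txt", regex=re.compile(r"file_\d\.txt"))

print(glob_only)
print(glob_and_regex)
```

A glob is convenient for coarse selection by extension or prefix, and the regex is the tool for constraints a glob cannot express, such as "exactly one digit".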