sign_language_translator.utils.archive module
Module for working with archives.
- Classes:
Archive: A utility class with static methods for creating, listing, and extracting files from ZIP archives.
- class sign_language_translator.utils.archive.Archive
Bases: object
This utility class provides static methods for creating, listing, and extracting files from ZIP archives.
Methods:
- create(filename_or_patterns: str | List[str], archive_path: str, compression=zipfile.ZIP_DEFLATED, progress_bar=True, overwrite=False)
Create a ZIP archive from files matching the specified pattern.
- list(archive_path: str, pattern="*", regex: str = r".*") -> List[str]
List the files in a ZIP archive, optionally filtered by a glob pattern or regex.
- extract(archive_path: str, pattern: str = "*", regex: str | re.Pattern = r".*", output_dir: str = ".", overwrite=False, progress_bar=True, leave=True, password: bytes = None, verbose=True) -> List[str]
Extract files from a ZIP archive to the specified output directory, optionally filtered by file names, patterns, or regex.
Example:
from sign_language_translator.utils import Archive

# Create a ZIP archive with files matching a pattern
Archive.create("*.txt", "output_archive.zip", overwrite=True)

# List files in a ZIP archive using a pattern and a regular expression
files = Archive.list("input_archive.zip", pattern="file_*.txt", regex=r"file_\d\.txt")
print(files)

# Extract files from a ZIP archive to a specified directory
extracted_files = Archive.extract("input_archive.zip", pattern="*.txt", output_dir="output_dir", overwrite=True)
print(extracted_files)
Note
File patterns are glob-style, e.g. "*.mp4".
When extracting, a warning is issued for each file that is skipped because a file with the same base name already exists.
- static create(filename_or_patterns: str | List[str], archive_path: str, compression=8, progress_bar=True, overwrite=False)
Create a zip archive from files matching the given pattern.
- Parameters:
filename_or_patterns (str | List[str]) – Files or Unix shell-style patterns matching the files to include in the archive.
archive_path (str) – Path to the output zip archive.
compression (int, optional) – Compression method (default is zipfile.ZIP_DEFLATED).
progress_bar (bool, optional) – Show a progress bar during creation (default is True).
overwrite (bool, optional) – Overwrite existing archive (default is False).
- Raises:
FileExistsError – If the archive_path already exists and overwrite is False.
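The behaviour described for create can be approximated with the standard library alone. The following is a minimal sketch, not the library's implementation: `create_zip` is a hypothetical name, the progress bar is omitted, and storing each file under its base name is an assumption that this page does not confirm.

```python
import glob
import os
import zipfile
from typing import List, Union


def create_zip(
    filename_or_patterns: Union[str, List[str]],
    archive_path: str,
    compression: int = zipfile.ZIP_DEFLATED,
    overwrite: bool = False,
) -> List[str]:
    """Hypothetical stand-in for Archive.create (standard library only)."""
    # Refuse to clobber an existing archive unless explicitly allowed.
    if os.path.exists(archive_path) and not overwrite:
        raise FileExistsError(f"'{archive_path}' exists; pass overwrite=True to replace it.")

    patterns = (
        [filename_or_patterns]
        if isinstance(filename_or_patterns, str)
        else list(filename_or_patterns)
    )

    # Expand every glob pattern; a plain file name simply matches itself.
    files: List[str] = []
    for pattern in patterns:
        for path in sorted(glob.glob(pattern)):
            is_archive = os.path.abspath(path) == os.path.abspath(archive_path)
            if os.path.isfile(path) and not is_archive and path not in files:
                files.append(path)

    with zipfile.ZipFile(archive_path, "w", compression=compression) as zf:
        for path in files:
            # Assumption: members are stored under their base name.
            zf.write(path, arcname=os.path.basename(path))
    return files
```

Calling `create_zip("*.txt", "output_archive.zip")` a second time without `overwrite=True` raises `FileExistsError`, which mirrors the documented contract.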
- static extract(archive_path: str, pattern: str = '*', regex: str | Pattern = '.*', output_dir: str = '.', overwrite=False, progress_bar=True, leave=True, password: bytes | None = None, verbose=True) -> List[str]
Extract specified files from a zip archive. Only files that match both the regex and the pattern are extracted.
- Parameters:
archive_path (str) – Path to the zip archive.
pattern (str) – Unix shell-style wildcard pattern that specifies the files to extract (default is “*”).
regex (str | re.Pattern) – Regular expression pattern that specifies the files to extract (default is “.*”).
output_dir (str, optional) – Directory to extract files into (default is ".").
overwrite (bool, optional) – Overwrite existing files during extraction (default is False).
progress_bar (bool, optional) – Show a progress bar during extraction (default is True).
leave (bool, optional) – Leave progress bar displayed upon completion (default is True).
password (bytes, optional) – Password for encrypted archives (default is None).
verbose (bool, optional) – Raise warnings for skipped existing files (default is True).
- Returns:
List of paths to the newly extracted files, plus the matching files that had already been extracted.
- Return type:
List[str]
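The skip-or-overwrite behaviour and the return value of extract can be illustrated with a small standard-library sketch. `extract_matching` is a hypothetical name, the progress bar options are omitted, and two details are assumptions rather than facts from this page: the regex is applied with `re.match` (anchored at the start of the member name), and the glob pattern is tested against the full member name.

```python
import fnmatch
import os
import re
import warnings
import zipfile
from typing import List, Optional, Union


def extract_matching(
    archive_path: str,
    pattern: str = "*",
    regex: Union[str, re.Pattern] = r".*",
    output_dir: str = ".",
    overwrite: bool = False,
    password: Optional[bytes] = None,
    verbose: bool = True,
) -> List[str]:
    """Hypothetical stand-in for Archive.extract (standard library only)."""
    compiled = re.compile(regex)  # also accepts an already compiled pattern
    paths: List[str] = []
    with zipfile.ZipFile(archive_path) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            name = info.filename
            # A member must satisfy the glob pattern AND the regex.
            if not (fnmatch.fnmatchcase(name, pattern) and compiled.match(name)):
                continue
            target = os.path.join(output_dir, name)
            if os.path.exists(target) and not overwrite:
                # Existing files are kept; optionally warn about the skip.
                if verbose:
                    warnings.warn(f"Skipping existing file: {target}")
            else:
                zf.extract(info, path=output_dir, pwd=password)
            # Skipped files are still reported, matching the documented return value.
            paths.append(target)
    return paths
```

Under these assumptions, running the same call twice returns the same list both times; the second run extracts nothing and emits one warning per skipped file unless `verbose=False`.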
- static list(archive_path: str, pattern: str = '*', regex: str | Pattern = '.*') -> List[str]
List files in the zip archive filtered by the specified pattern or regex.
- Parameters:
archive_path (str) – Path to the zip archive.
pattern (str) – Unix shell-style wildcard pattern to filter the contents (default is “*”).
regex (str | re.Pattern) – Regular expression pattern to filter the contents (default is “.*”).
- Returns:
List of file names in the archive that match the criteria.
- Return type:
List[str]
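To show how the two filters differ, here is a minimal sketch of a list-style helper built on `zipfile`, `fnmatch`, and `re`. `list_matching` is a hypothetical name. The sketch assumes that a name must pass both filters (as documented for extract) and that the regex is anchored at the start of the name; the library may behave differently on either point.

```python
import fnmatch
import re
import zipfile
from typing import List, Union


def list_matching(
    archive_path: str,
    pattern: str = "*",
    regex: Union[str, re.Pattern] = r".*",
) -> List[str]:
    """Hypothetical stand-in for Archive.list (standard library only)."""
    compiled = re.compile(regex)
    with zipfile.ZipFile(archive_path) as zf:
        # Keep names in archive order; the defaults ("*" and ".*") match everything.
        return [
            name
            for name in zf.namelist()
            if fnmatch.fnmatchcase(name, pattern) and compiled.match(name)
        ]
```

With the defaults, either filter can be used on its own: a glob such as `"file_*.txt"` selects by shape, while a regex such as `r"file_\d\.txt"` can additionally constrain the wildcard part to a single digit.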