
ziptools — The Bits That Python's zipfile Forgot

Version:  1.2, April 11, 2020 (changes)
License:  Provided freely, but with no warranties of any kind
Author:  © M. Lutz (learning-python.com) 2017-2020
Run with:  Python 3.X or 2.X on Mac OS, Windows, Linux, Android, etc.
Install:  Unzip the download, no third-party libraries are required
Web page:  learning-python.com/ziptools.html

This source-code package wraps Python's zipfile module to simplify common usage, and extends it with crucial extra utility—including support for adding entire folders, propagating modtimes and permissions, archiving symlinks on Unix and Windows, omitting cruft files, using long paths on Windows, and neutralizing the DST and timezone problems of zip modtimes.

In addition, ziptools is dual-mode software: it can be run as command-line scripts, and used as a program library called from Python code.

Both foster flexible and portable content management using zipfiles, above and beyond Python's standard library.

This file is ziptools' first-level user guide; its source-code files here and here provide additional and lower-level details. If you have used ziptools in the past, see Versions at the end of this file for recent updates. Contents of this guide:

Quickstart

This section presents basic examples, as a quick introduction and reference. In all of this guide's examples:

General Command Formats

In the following command-line formats, items enclosed in [] are optional, and ... means zero or more:

[py] zip-create.py [zipfile source [source...] [-skipcruft] [-atlinks] [-zip@path]]

[py] zip-extract.py [zipfile [unzipto] [-nofixlinks] [-permissions]]

[py] zip-list.py [zipfile]

Where:

In program-library mode, most function arguments correspond to command-line switches:

createzipfile(zipname, [addnames],
              storedirs=True, cruftpatts={}, atlinks=False, trace=print, zipat=None)
                     
extractzipfile(zipname, pathto='.',
               nofixlinks=False, trace=print, permissions=False)

Command-Line Examples

Run the following commands in a console, script, or supporting IDE:

$ py $Z/zip-create.py myarchive.zip mycontent -skipcruft
      # Store all of folder mycontent in new zipfile myarchive.zip

$ py $Z/zip-extract.py myarchive.zip myunzipdir
      # Extract the contents of myarchive.zip to folder myunzipdir

$ py $Z/zip-extract.py myarchive.zip myunzipdir -permissions
      # Ditto, but also propagate Unix-style permissions to all

$ py $Z/zip-create.py ../myarchive.zip * -skipcruft
      # Store folder contents as top-level items, on Unix or Windows 

$ py $Z/zip-create.py mysourcecode.zip page.js gui.pyw *.py *.c
      # Store individual files as top-level items, on Unix or Windows 

$ py $Z/zip-create.py mysourcecode.zip src/latest/*.py -zip@.
      # Store files from another folder as top-level items, on Unix or Windows 

$ py $Z/zip-list.py myarchive.zip
      # List the contents of zipfile myarchive.zip

$ py $Z/zip-create.py
      # Interactive mode: input parameters at prompts (see ahead)

Program-Library Examples

Run the following Python code samples in a script or at the >>> interactive prompt; they correspond to command lines above:

$ export PYTHONPATH=$Z:$PYTHONPATH
$ py
>>> import ziptools, glob
>>> cruftdflt = ziptools.cruft_skip_keep

>>> ziptools.createzipfile('myarchive.zip', ['mycontent'], cruftpatts=cruftdflt)

>>> ziptools.extractzipfile('myarchive.zip', pathto='myunzipdir')

>>> ziptools.extractzipfile('myarchive.zip', pathto='myunzipdir', permissions=True)

>>> ziptools.createzipfile('../myarchive.zip', glob.glob('*'), cruftpatts=cruftdflt)

Related-Tool Examples

In the following example command lines, $C is your source-code install folder; these Mergeall scripts are available here:

$ py $C/mergeall/mergeall.py mycontent myunzipdir/mycontent -report -skipcruft
      # Verify result modtimes using Mergeall

$ py $C/mergeall/diffall.py  mycontent myunzipdir/mycontent -skipcruft
      # Verify result content using Mergeall

Overview

This section explains why you may want to use ziptools, and describes its features. In brief, Python's standard-library zipfile module does great low-level work, but the ziptools package adds both much-needed features and higher-level access points, and documents some largely undocumented dark corners of Python's zip support along the way. Among its augmentations, ziptools:

  1. Adds entire folder trees to zipfiles automatically
  2. Propagates original modtimes for files, folders, and links
  3. Can either include or skip system "cruft" files on request
  4. Supports symlinks to files and folders on Unix and Windows
  5. Supports long paths to items on Windows beyond its normal limits
  6. Propagates Unix file-access permissions for all items on request
  7. Fixes DST and timezone skew in modtimes with UTC timestamps

Although Python's shutil module has basic zipfile wrappers that add folders (make_archive) and extract all items (unpack_archive), they don't do anything about modtimes, permissions, symlinks, system cruft, long Windows paths supported here, or the limits of zip's local times. The next sections describe how ziptools does better on all these fronts.

Folder Trees

With ziptools, folder trees are added to zipfiles as a whole automatically with extra code—a sorely missed feature of Python's standard modules. Folder trees are also automatically extracted from zipfiles in full, and no user steps are required to enable folder-tree zips and unzips.

For folder-tree implementation details, see the create functions in ziptools.py.
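
As a rough illustration of the idea only (not ziptools' own code, which lives in ziptools.py), folder-tree zipping can be sketched with Python's standard library alone. The name add_folder_tree is hypothetical, and this sketch records paths relative to the folder's parent and ignores symlinks, cruft, and modtime handling:

```python
import os
import zipfile

def add_folder_tree(zipname, folder):
    """Zip a folder and everything nested in it (hypothetical sketch)."""
    root = os.path.dirname(os.path.abspath(folder))   # zip paths are relative to this
    with zipfile.ZipFile(zipname, 'w', zipfile.ZIP_DEFLATED, allowZip64=True) as zf:
        for dirpath, dirnames, filenames in os.walk(folder):
            # Record the folder itself, so that empty folders survive too
            zf.write(dirpath, os.path.relpath(dirpath, root))
            for name in filenames:
                path = os.path.join(dirpath, name)
                zf.write(path, os.path.relpath(path, root))
```

A call like add_folder_tree('myarchive.zip', 'mycontent') would then store mycontent and all its nested items, which is the manual walk that ziptools makes unnecessary.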

Modtime Propagation

With ziptools, latest-modification times (a.k.a. modtimes) are automatically propagated to and from zipfiles for all items: files, folders, and symlinks. This is another glaring omission in Python's standard module, which stores modtimes in zips but ignores them on unzips. Modtimes are crucial when unzipped results are used with tools that rely on file-modification timestamps, including the Mergeall program's incremental backups, and most source-control systems' change detection.

No user actions are necessary to enable modtime propagation. As of 1.2, this feature is even more useful, because the UTC timestamps described ahead neutralize the limits inherent in the zip standard's local time. For the implementation of basic modtime propagation, see the extract function in ziptools.py.
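
To make the basic idea concrete, here is a minimal sketch of local-time modtime propagation on unzips, using only the standard library. It is an illustration under assumptions rather than ziptools' implementation: the name extract_with_modtimes is hypothetical, and it uses the zip entry's main local-time field, not the 1.2 UTC scheme:

```python
import os
import time
import zipfile

def extract_with_modtimes(zipname, pathto='.'):
    """Extract all items, then reset each one's modtime from the zip's record."""
    with zipfile.ZipFile(zipname) as zf:
        extracted = [(zf.extract(info, pathto), info) for info in zf.infolist()]
    # Set times only after all extracts, so later writes can't disturb folders
    for dest, info in extracted:
        # date_time is a local-time (Y, M, D, h, m, s) tuple; the trailing -1
        # asks mktime to decide whether DST applied at that time
        mtime = time.mktime(info.date_time + (0, 0, -1))
        os.utime(dest, (mtime, mtime))          # (access time, modtime)
    return [dest for dest, info in extracted]
```

Without the final os.utime step, unzipped items simply get the time of the unzip, which is what defeats timestamp-based tools.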

Cruft-File Skipping

With ziptools, platform-specific metadata items can be automatically omitted from cross-platform zipfile archives. When enabled in creates, this ensures that these items won't wind up as clutter in contexts where they serve no purpose.

This feature is optional and configurable, and works for files, folders, and symlinks. To omit cruft items in zips, use the -skipcruft command-line switch or corresponding function argument. As shipped, this avoids propagating .DS_Store Finder files and ._* Apple-double resource forks on Mac OS; Desktop.ini files on Windows; .Trash* and other recycle bins; and most other .* files hidden per Unix convention which don't belong in zipped content.

For more control, you can also define what to skip in your use case. Cruft is identified with skip and keep filename patterns—either a custom set, or a provided general-purpose default used automatically by the command-line scripts. See zip-create.py and ziptools.py for more details, and zipcruft.py for the default-but-changeable cruft patterns shipped.
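
The skip-and-keep scheme can be sketched as follows. This is a hedged illustration, not the matching code in ziptools.py: the function name is_cruft is hypothetical, the use of case-sensitive fnmatch matching on base names is an assumption, and the patterns are copied from the interactive-mode trace shown later in this guide, so they may differ from the defaults in your copy of zipcruft.py:

```python
import fnmatch

# Pattern dictionary in the same shape as the cruftpatts function argument
example_patts = {
    'skip': ['.*', '[dD]esktop.ini', 'Thumbs.db', '~*', '$*', '*.py[co]'],
    'keep': ['.htaccess'],
}

def is_cruft(name, patts):
    """True if a base filename matches any skip pattern and no keep pattern."""
    skipped = any(fnmatch.fnmatchcase(name, p) for p in patts.get('skip', []))
    kept = any(fnmatch.fnmatchcase(name, p) for p in patts.get('keep', []))
    return skipped and not kept
```

With these patterns, .DS_Store and module.pyc count as cruft, while .htaccess is rescued by the keep list even though it matches the .* skip pattern. An empty dictionary matches nothing, which is consistent with {} disabling skipping.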

Symlinks Propagation

With ziptools, symbolic links (a.k.a. symlinks) to both files and folders are supported and propagated on both Unix and Windows. By default, symlinks are always copied verbatim to and from zipfiles, and their path separators are made portable along the way. No user action is required to enable this behavior.

Optionally, clients may also elect to instead copy the items which links reference with -atlinks and can opt out of link portability transforms with -nofixlinks (or their corresponding function arguments). In more detail:

Scope: symlinks do survive round trips between Unix and Windows, but their support is uneven across platforms and Pythons. Windows symlinks, for example, do not retain their original modtimes on unzips in 3.X, and their permissions are largely moot. Moreover, unzips require admin permissions to create symlinks on Windows; Android may not support symlink creation at all, and leave stub files on unzips; and Python's symlink support varies by version. See Portability ahead for more fine-grained details on symlink interoperability.

For a demo of ziptools' symlinks support, see the console log here. For symlink implementation details, see zipsymlinks.py. See also zip-create.py for more on -atlinks; zip-extract.py for more on -nofixlinks; and ziptools.py for more on both usage options.
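
For readers curious about how a link can live in a zipfile at all, the following sketch shows the convention used by Unix zip tools: the entry's data is the link's target path, and the high 16 bits of its external attributes carry a Unix file mode with the symlink type bit set. The function names here are hypothetical and the code is an assumption-laden illustration, not a copy of zipsymlinks.py; recreating the link on unzips (e.g., with os.symlink) is omitted:

```python
import stat
import zipfile

def add_symlink_entry(zf, arcname, target, perms=0o755):
    """Record a symlink in an open ZipFile as a link entry, not a copy."""
    info = zipfile.ZipInfo(arcname)
    info.create_system = 3                          # 3 means Unix in the zip format
    info.external_attr = (stat.S_IFLNK | perms) << 16
    # Store the link's target path as the entry's data, with portable separators
    zf.writestr(info, target.replace('\\', '/'))

def is_symlink_entry(info):
    """True if a ZipInfo's recorded Unix mode marks it as a symlink."""
    return stat.S_ISLNK(info.external_attr >> 16)
```

An unzip tool that does not know this convention will typically extract such an entry as a small text file containing the target path.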

Long Paths on Windows

With ziptools, filesystem pathnames of all items on Windows are allowed to exceed their normal 260/248-character length limit on that platform. This is accomplished by automatically prefixing paths with \\?\ whenever they are passed to tools in the underlying Python zipfile module, and other file-related interfaces.

No user action is required for this feature. On all versions of Windows, it supports files, folders, and symlinks at long Windows paths both when adding to and extracting from zip archives. Among other things, this is useful for unzipping and rezipping long-path items originally zipped on more-flexible Unix hosts, where this feature is unneeded and unused.

For implementation details, see ziplongpaths.py.
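
The prefixing technique itself is simple enough to sketch. The function below is a hypothetical illustration rather than the code in ziplongpaths.py; it assumes its argument is already an absolute Windows path, and uses ntpath so that the string logic can be tried on any platform:

```python
import ntpath

LONG_PREFIX = '\\\\?\\'              # the four characters \\?\

def add_long_path_prefix(path):
    """Return an absolute Windows path in extended-length form."""
    if path.startswith(LONG_PREFIX):
        return path                                  # already prefixed
    path = ntpath.normpath(path)                     # / becomes \, dots resolved
    if path.startswith('\\\\'):
        # Network (UNC) paths use a special form: \\?\UNC\server\share\...
        return LONG_PREFIX + 'UNC\\' + path[2:]
    return LONG_PREFIX + path
```

The normalization step matters because prefixed paths are passed to Windows nearly verbatim: forward slashes and .. parts are not interpreted after the prefix is added.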

Windows path-limit install option in Python 3.6+: on recent Windows 10 systems, the standard python.org installers for Python 3.6 and later include an option to automatically remove the Windows platform's former path-length limit. ziptools, and its Mergeall cousin, instead use the manual but more-inclusive technique described here to lift the limit for users of all Pythons and all Windows. While useful, the Python 3.6+ enhancement is optional and easy to miss; won't apply to frozen executables; and doesn't help those using Windows 7 and 8, or Pythons 2.X through 3.5—still-substantial audiences all. For more details, see this online usage note.

Permissions Propagation (New in 1.1)

With ziptools, Unix-style permissions for all items—files, folders, and symlinks—can be propagated both to and from zipfiles, and hence survive a zip and unzip. Specifically:

Due to interoperability issues, this new extract option should generally be used only when unzipping from zipfiles known to have originated on Unix, and when unzipping back to Unix. Most use cases that require permissions to survive trips to/from zips probably satisfy this rule, but propagation isn't enabled by default because origin cannot reliably be determined. Propagation is generally harmless where Unix-style permissions are not supported: it doesn't abort ziptools zips or unzips, though results may lose permissions per the following note.

Scope: not all filesystems and platforms support Unix-style permissions. On exFAT, for instance, permission updates silently change nothing, and the extract option has no effect. It's okay to copy a zipfile to and from an exFAT drive as a whole, but don't unzip on exFAT if you care about retaining Unix permissions. Unix-style permissions also don't fully translate to Windows, and may not work at all on Android. See Portability ahead for full details, and don't send permissions on round trips to platforms that don't support them.

For a demo of 1.1 permissions propagation, see the screenshot here and the console logs here and here. For implementation details, see ziptools.py and zipsymlinks.py.
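
As background, Unix-style permissions travel in the high 16 bits of each zip entry's external attributes, and Python's zipfile records them on zips but does not reapply them on extracts. The sketch below shows the missing step under stated assumptions: both function names are hypothetical, and this is an illustration rather than the code in ziptools.py:

```python
import os
import stat

def stored_permissions(info):
    """Return the Unix permission bits recorded for a zip entry, or None."""
    mode = info.external_attr >> 16          # Unix mode lives in the high bits
    return stat.S_IMODE(mode) if mode else None

def extract_with_permissions(zf, info, pathto='.'):
    """Extract one entry from an open ZipFile, then reapply its permissions."""
    dest = zf.extract(info, pathto)
    perms = stored_permissions(info)
    if perms is not None:
        os.chmod(dest, perms)                # largely a no-op where unsupported
    return dest
```

A zero mode is treated as "no permissions recorded" here, on the assumption that such zipfiles did not originate on Unix; that mirrors the reason this extract option is off by default.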

UTC Timestamps (New in 1.2)

With ziptools, the modtimes of files, folders, and symlinks are automatically made immune to changes in both timezone and DST (Daylight Saving Time). This is achieved by storing UTC (a.k.a. GMT) timestamps in one of the "extra fields" defined by the zipfile standard. This feature is always applied, and no user action is required to enable it. In more detail:

This is a full fix to zip's local-time issues: because UTC timestamps are relative to a fixed point, they are agnostic to both timezone and DST changes. By contrast, local time is scaled for both, and may be better used for display. The zip standard's use of local time for file storage, however, makes it difficult to adjust for DST, and impossible to adjust for timezones—especially given zip's lack of timezone information (if you fly three timezones away, zip local times won't change with you). UTC is the only way to keep zipped modtimes comparable in all contexts with the original data.

Prior to 1.2, ziptools used the zipfile's main local time, and deferred to Python's library tools here and here to translate UTC time to and from local time and accommodate DST changes. That scheme's modtime results could vary from those of some other zip tools after DST changes, however, and did nothing about timezone changes. The new UTC timestamp scheme in 1.2 eliminates both DST and timezone modtime skew in a single step.

Interoperability: not all zip tools use or record the extra field added by ziptools 1.2, but the absence of support is harmless. When processing zipfiles with the new field, tools that don't recognize it will simply skip it, per the zip standard. When processing zipfiles without the new field, ziptools will fall back on its original and subpar local-time scheme. More positively, ziptools will use UTC-timestamp modtimes recorded by other tools in the same fashion as its own.

For a demo of 1.2 UTC timestamps in action, see the console log here, and the screenshot here. For implementation details, see zipmodtimeutc.py, and its ziptools.py and zipsymlinks.py clients.
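
To show what a UTC timestamp in an extra field can look like, the sketch below packs and parses the "extended timestamp" field (header ID 0x5455) that Info-ZIP's zip tools also use. Treat the details as assumptions: this guide does not name the field ziptools uses, the function names are hypothetical, and zipmodtimeutc.py remains the authority on the actual format:

```python
import struct

UT_HEADER_ID = 0x5455        # Info-ZIP "extended timestamp" extra field

def pack_utc_mtime(mtime):
    """Build extra-field bytes holding a modtime as UTC seconds since 1970."""
    # Layout: header ID, data size (5), flags (bit 0 = modtime), 32-bit seconds
    return struct.pack('<HHBl', UT_HEADER_ID, 5, 0x01, int(mtime))

def unpack_utc_mtime(extra):
    """Scan a ZipInfo.extra bytes string; return UTC modtime seconds or None."""
    pos = 0
    while pos + 4 <= len(extra):
        header_id, size = struct.unpack_from('<HH', extra, pos)
        data = extra[pos + 4: pos + 4 + size]
        if header_id == UT_HEADER_ID and size >= 5 and data[0] & 0x01:
            return struct.unpack_from('<l', data, 1)[0]
        pos += 4 + size                      # skip fields we don't recognize
    return None
```

On zips, the packed bytes would be assigned to a ZipInfo's extra attribute before writing; on unzips, a None result signals that the field is absent and a local-time fallback is needed. The skip step in the loop is also why tools that don't know a field can ignore it harmlessly.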

Etcetera

Beyond the preceding sections' features, the ziptools package also provides command-line zip and unzip programs that work portably on Mac OS, Windows, Linux, Android, and more; runs all its code on either Python 3.X or 2.X; and comes with complete and open-source Python code that you can use, audit, and adapt.

Again, see the main zip-create.py and zip-extract.py command-line scripts for more usage details omitted here, and the ziptools.py module they use for lower-level implementation details.

Usage

This section provides expanded usage examples to demo typical ziptools roles, and documents additional usage details along the way. For more examples external to this guide, the test-case folders:

all capture usage and runs, and each script and module in this package includes in-depth documentation strings with usage details omitted here for space. In addition, the documentation folders:

demonstrate the new features in versions 1.1 and 1.2; the screenshots here and here capture basic usage; and Mergeall's test folder illustrates its related symlink support.

General Concepts

This section presents ziptools usage pointers that span modes. In general:

Portability
All code in this package works under both Python 3.X and 2.X, and on both Unix and Windows, though some advanced tools (e.g., symlinks and permissions) are platform or Python-version dependent. ziptools' zipfiles are generally interoperable with other zip tools, and vice versa, though some features may be supported unevenly elsewhere. When in doubt, use ziptools for both zips and unzips, and see Portability ahead for complete interoperability details.
Pathnames
Items added to zip archives are recorded with their relative or absolute paths given, less any Windows drive and UNC parts, and most .. relative-path syntax. Items are later unzipped to these recorded paths relative to an unzip target folder. Leading ..s may be recorded by zips, but are stripped by unzips. Tip: to reduce the length of paths saved and restored, cd to source folders before zipping, or see the next item on this list for a 1.2 alternative.
Alternate zip paths
As of 1.2, the create -zip@path command-line switch (and corresponding function argument) allows zipped items to be recorded, and hence unzipped, under an alternate path, or under no path at all if path is a single dot (.). This is useful to shorten long pathnames in zip sources and thus avoid nesting on unzips, and can make a pre-zip cd unnecessary.

This feature is best described by example. Normally, a source item A/B/C/D is zipped at path A/B/C/D, and later unzipped at the same path relative to (i.e., nested in) the unzip target folder. Given -zip@C at zip time, the item is instead zipped and unzipped at C/D, and using -zip@. makes D a top-level, unnested item in both zip and unzip. When used, this feature also yields two output lines per item to show zip paths used (zip-list.py shows results too).

Two fine points: first, note that the alternate path is applied to every source item zipped, and so may be more useful when zipping either a single item or many items in the same folder (else same-named items from different folders will collide). Second, the path replacement applies only to the top-level paths given; items nested in zipped folders are still nested in the zip (though perhaps at cropped root paths).

For a demo of this feature at work, see the console log here. Per this demo, -zip@. can also be used to remove folder nesting for all items that match * wildcards—as described further in the next section.

Windows PowerShell users: you must quote the new switch in command lines as either "-zip@path" or '-zip@path'. PowerShell uses a proprietary scripting language which treats @ specially, and unfortunately clashes with ziptools' argument. You do not need to quote this argument on Windows in any of Command Prompt; the standard bash shell in Windows 10's Linux subsystem (WSL); or the Unix shells available in the Cygwin package. Users familiar with Unix may prefer the last two of these options, captured here and here. Windows, being Windows, has long spun a convoluted and fragmented command-line tale; such is life in battling-towers development.

Source globbing
When using the create script (only), source items may use filename expansion (a.k.a. globbing) on both Unix and Windows: * matches any number of characters; ? matches any single character; [xxx] matches any character in xxx; [!xxx] matches any character not in xxx; and brackets escape literals (e.g., [?] matches ?). All items with matching names are zipped. This feature also works when arguments are prompted in interactive mode (see ahead).

For instance, a source item *.py on any platform zips all Python source files in the current folder as top-level, unnested items. Similarly, a source item folderpath/*.py zips all Python files located in another folder; use the -zip@path feature of the preceding note to shorten, change, or remove folderpath for each matching file zipped (e.g., -zip@. zips all the files matched by folderpath/*.py as top-level, unnested items). For a demo of this feature, see the console log here.

Source .
Using . as a zip source item zips all items in the current directory as unnested items, and is functionally equivalent to a lone * to match all items (except that * omits any .* items hidden per Unix convention). This may be more useful for command lines than for function calls in program-library mode, given that callers might be running in a program folder.
Target .
Using . as the unzip target folder unzips each item in the zipfile to the current directory, with no extra folder nesting apart from that recorded in the zipfile itself (unnested items in the zip are unnested in .). Caution: this silently overwrites any same-named items in .; name another folder instead of . if you wish to avoid this.
Existence
In all modes, an error is reported for zip source items that do not exist, but unzip target folders are automatically created if they do not exist. In command-line mode, nonexistent sources are detected before a zip starts, and abort the request with an error message. In program-library mode, nonexistent sources are simply skipped with a message, because callers are responsible for input correctness.
Compression
Items in archives created by ziptools itself are compressed using the Python zipfile module's ZIP_DEFLATED setting; this isn't configurable, but it's the usual method, and generally does a reasonable job. Archives unzipped by ziptools support any compression method supported by the installed zipfile (see its docs).
File size
ziptools always uses ZIP64 extensions when needed to support files much larger than zip's former limits, via this option in Python's zipfile module. While file size is practically unlimited with ZIP64, some other tools may fail or refuse to extract zipfiles larger than 2G (e.g., Unix unzip), and others may balk at 4G. If you run into limits, your best option may be to find or install a Python 2.X or 3.X on the unzip host, and run ziptools' zip-extract.py.
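
The path mapping described under "Alternate zip paths" above can be summarized in a few lines. This is a hypothetical sketch of the rule for top-level source items only (items nested in zipped folders keep their nesting below the mapped root); the name zip_path_for is invented here, and the real logic is in ziptools.py:

```python
import posixpath

def zip_path_for(source, zipat=None):
    """Map a top-level source path to the path it is recorded at in the zip."""
    source = source.replace('\\', '/').rstrip('/')
    if zipat is None:
        return source                        # default: zipped at the path given
    name = posixpath.basename(source)
    if zipat == '.':
        return name                          # top-level, unnested item
    return posixpath.join(zipat, name)       # nested in the alternate path
```

For the guide's example item A/B/C/D, this yields A/B/C/D by default, C/D for an alternate path of C, and D for an alternate path of the dot.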

See the create and extract scripts for more on pathnames and other details in ziptools. The following sections move on to present examples by usage mode.

Program-Library Mode

ziptools can be used from both command lines shown ahead, and direct Python program calls like those demonstrated here.

In program mode only, callers are responsible for expanding any wildcard operators like * in source filenames (a.k.a. globbing), before calling ziptools' create function; use Python's glob.glob. Program mode can also leverage other Python tools to construct source lists, and is responsible for source existence per above.
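
One way a caller might build such a source list is sketched below. The helper name expand_sources is hypothetical and not part of ziptools; it expands each pattern with glob.glob, which also drops names that do not exist, and removes duplicates matched by more than one pattern:

```python
import glob

def expand_sources(patterns):
    """Expand wildcard patterns into a list of existing source paths."""
    sources = []
    for patt in patterns:
        for match in sorted(glob.glob(patt)):    # no match: contributes nothing
            if match not in sources:
                sources.append(match)
    return sources
```

The result can be passed as the sources list to the create function, as in createzipfile('allsourcecode.zip', expand_sources(['*.py', '*.c'])). Note that glob's * skips .* names hidden per Unix convention, just as a shell's * does.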

Cruft skipping is enabled in program mode by passing a dictionary of skip and keep patterns to the create function's cruftpatts. This argument defaults to {} which disables skipping. To enable, pass either a custom definition or the default imported from zipcruft.py.

As usual in Python for simple source-code zips, ziptools' install (unzip) folder must be on your module search path. An export command on Unix or similar set on Windows in your shell start-up files generally suffices, though other options abound; your platform's documentation and other resources can provide more details.

See ziptools.py for more on program usage, and the arguments demonstrated here:

# Setup (Z is ziptools' install folder)
$ export PYTHONPATH=$Z:$PYTHONPATH    # Unix
$ set PYTHONPATH=%Z%;%PYTHONPATH%     # Windows

# Basic usage
import ziptools
ziptools.createzipfile(zipto, sources)
ziptools.extractzipfile(zipfrom, unzipto)

# Test folders
ziptools.createzipfile('test-1-2.zip', ['test1', 'test2'])
ziptools.extractzipfile('test-1-2.zip', '.')

# Websites, with cruft skips
from ziptools.zipcruft import cruft_skip_keep
ziptools.createzipfile('website.zip', ['website'], cruftpatts=cruft_skip_keep)
ziptools.extractzipfile('website.zip', '~/public_html', permissions=True)

# Development
ziptools.createzipfile('devtree.zip', ['dev'])
ziptools.extractzipfile('devtree.zip', '.', permissions=True)

# Symlink options
ziptools.createzipfile('filledintree.zip', ['skeleton'], atlinks=True)
ziptools.extractzipfile('nonportable_devtree.zip', '.', nofixlinks=True)

# Manual globs
from glob import glob
ziptools.createzipfile('allsourcecode.zip', glob('*.py') + glob('*.c'))

# Use 1.2 alternate path to minimize or eliminate folder nesting in zips
ziptools.createzipfile('folder.zip', ['long/path/to/folder'], zipat='to')
ziptools.createzipfile('pycode.zip', glob('some/other/folder/*.py'), zipat='.')

Command-Line Mode

The following sorts of ziptools command lines may be run from your system's command shell (e.g., Terminal on Unix, Command Prompt on Windows, or Termux on Android); another program (e.g., using Python's os.popen); or IDEs that support system command lines (e.g., PyEdit).

In all cases and on all platforms, ziptools' create script expands filename wildcards (e.g., * and ?) not already expanded by a command shell, and . in creates and extracts refers to the current directory. See the prior section for more on expansion and dots.

For brevity and convenience, most of these examples use a Unix shell-variable reference $Z and assume that this variable has been set by an export or similar prior to script runs (e.g., $Z/zip-create.py). In Windows Command Prompt, a %Z% after running a set command may be used instead (e.g., %Z%\zip-create.py).

See zip-create.py and zip-extract.py for more on the command lines demonstrated here:

# Setup (optional)
$ export Z=ziptoolspath    # Unix
$ set Z=ziptoolspath       # Windows

# Test folders
c:\...\ziptools> zip-create.py cmdtest\ziptest.zip selftest\test1 selftest\test2
c:\...\ziptools> zip-list.py cmdtest\ziptest.zip
c:\...\ziptools> zip-extract.py cmdtest\ziptest.zip cmdtest\unzipped

# Websites
...local$  python3 $Z/zip-create.py ~/website.zip . -skipcruft
...remote$ python2 $Z/zip-extract.py ~/website.zip public_html -permissions

# Distributions
...devdir$ python3 $Z/zip-create.py program.zip programdir -skipcruft
...usedir$ python3 $Z/zip-extract.py program.zip .

# Development
...dir1$ python $Z/zip-create.py devtree.zip dev -skipcruft
...dir2$ python $Z/zip-extract.py devtree.zip . -permissions

# Special cases: populating from links, retaining link separators
...dir1$ python $Z/zip-create.py devtree.zip dev -skipcruft -atlinks
...dir2$ python $Z/zip-extract.py devtree.zip . -nofixlinks

# Individual items
...here$ python $Z/zip-create.py allcode.zip a.py b.py c.py d.py
...here$ python $Z/zip-create.py allcode.zip a.py b.py folder -skipcruft

# Shell pattern expansion: supported on all platforms in [1.1]
...here$ python $Z/zip-create.py allcode.zip *.py
...here$ python $Z/zip-create.py allcode.zip *.py test[12].txt doc?.html

# Use items in a folder as top-level items, not nested in their folder
--cd source dir
...src$ python $Z/zip-create.py ../allcode.zip * -skipcruft
--cd dest dir, copy allcode.zip to .
...dst$ python $Z/zip-extract.py allcode.zip . -permissions

# Use 1.2 alternate path to minimize or eliminate folder nesting in zips
...here$ python $Z/zip-create.py folder.zip long/path/to/folder -zip@to -skipcruft
...here$ python $Z/zip-create.py pycode.zip some/other/folder/*.py -zip@.

Interactive Mode

When ziptools is run from a command line with no arguments, it falls back on asking for inputs interactively (e.g., from a user or piped-in replies file), as in the examples that follow. In these listings, substitute \ for all / when working on Windows.

In this mode, expansion and dots work the same as they do in command lines. For extracts, interactive mode (only) also asks if you wish to clean (i.e., delete) the contents of the target folder first, if the folder already exists. If you opt to not clean, the unzip will add to the folder's current content.

To extract an existing zipfile to ., the current directory:

.../test-symlinks$ $Z/zip-extract.py
Zip file to extract? save-test1-test2.zip
Folder to extract in (use . for here) ? .
Do not localize symlinks (y=yes)? 
Retain access permissions (y=yes)? 
About to UNZIP
      save-test1-test2.zip,
      to .,
      localizing any links,
      not retaining permissions
Confirm with 'y'? y
Clean target folder first (yes=y)? n
Unzipping from save-test1-test2.zip to .
Extracted test1/
             => test1
...etc...

To create a new zipfile in a folder, from items in two other folders (run in ziptools' own folder for variety):

/Code/ziptools$ zip-create.py
Zip file to create? cmdtest/ziptest
Items to zip (comma separated)? selftest/test1, selftest/test2                   
Skip cruft items (y=yes)? y
Follow links to targets (y=yes)? n
Alternate zip path (.=unnested, enter=none) ? 
About to ZIP
      ['selftest/test1', 'selftest/test2'],
      to cmdtest/ziptest.zip,
      skipping cruft,
      not following links
      zip@ path (unused)
Confirm with 'y'? y
Zipping ['selftest/test1', 'selftest/test2'] to cmdtest/ziptest.zip
Cruft patterns: {'skip': ['.*', '[dD]esktop.ini', 'Thumbs.db', '~*', '$*', '*.py[co]'], 'keep': ['.htaccess']}
Adding folder selftest/test1
--Skipped cruft file selftest/test1/.DS_Store
...etc...

To extract the zipfile just created, to another folder:

/Code/ziptools$ py3 zip-extract.py 
Zip file to extract? cmdtest/ziptest
Folder to extract in (use . for here) ? cmdtest/target
Do not localize symlinks (y=yes)? 
Retain access permissions (y=yes)? y
About to UNZIP
      cmdtest/ziptest.zip,
      to cmdtest/target,
      localizing any links,
      retaining permissions
Confirm with 'y'? y
Clean target folder first (yes=y)? y
Removing cmdtest/target/selftest
Unzipping from cmdtest/ziptest.zip to cmdtest/target
Extracted selftest/test1/
             => cmdtest/target/selftest/test1
...etc...

To list the created zipfile's contents:

/Code/ziptools> zip-list.py
Zipfile to list? cmdtest/ziptest.zip
File Name                                             Modified             Size
selftest/test1/                                2016-10-02 09:01:58            0
selftest/test1/d1/                             2016-09-30 16:41:12            0
selftest/test1/d1/fa1.txt                      2014-02-07 16:38:58            0
selftest/test1/d3/                             2016-10-02 09:05:02            0
selftest/test1/d3/.htaccess                    2015-03-31 16:55:44          271
...etc...

To extract using absolute paths, on Unix and Windows:

/...$ py3 /Code/ziptools/zip-extract.py 
Zip file to extract? /Users/blue/Desktop/website.zip
Folder to extract in (use . for here) ? /Users/blue/Desktop/temp/website
Do not localize symlinks (y=yes)? n
Retain access permissions (y=yes)? n
About to UNZIP
      /Users/blue/Desktop/website.zip,
      to /Users/blue/Desktop/temp/website,
      localizing any links,
      not retaining permissions
Confirm with 'y'? y
...etc...

c:\...> py -3 C:\Code\ziptools\zip-extract.py 
Zip file to extract? C:\Users\me\Desktop\website.zip
Folder to extract in (use . for here) ? C:\Users\me\Desktop\temp\website
Do not localize symlinks (y=yes)? n
Retain access permissions (y=yes)? n
About to UNZIP
      C:\Users\me\Desktop\website.zip,
      to C:\Users\me\Desktop\temp\website,
      localizing any links,
      not retaining permissions
Confirm with 'y'? y
...etc...

Portability

This section provides the full story on running ziptools in different contexts. In short:

The net result is that ziptools can serve as your go-to tool for managing zip archives on all your devices. As examples, check out ziptools running on Mac OS, Windows, Linux, and Android (if you haven't by now).

That said, interoperability is rarely perfect, and the following sections enumerate the footnotes that apply. Some of these can be mitigated by choosing an appropriate Python (as source code, ziptools is especially vulnerable to version skew). Most, however, reflect immutable constraints of other tools, platforms, or Pythons.

Other Tools

ziptools is broadly interoperable with other zip tools: its zipfiles can generally be unzipped by other tools, and it can usually unzip zipfiles created by other tools. As an example, zips created by ziptools running on Python 2.X are correctly unzipped by ziptools under 3.X, and vice versa. Thanks to the zip standard, ziptools has also been verified to play well with Unix command-line zip and unzip, Finder zips on Mac OS, Explorer zips on Windows, and an assortment of other tools.

The chief caveat here is that some tools may not support ziptools' features as fully as ziptools. In particular, modtimes, symlinks and permissions may not propagate as well; large files may not be supported by some unzips, per above; and ziptools' UTC-timestamp modtimes may be ignored, yielding DST skew. For best results, use ziptools on both the zip and unzip ends of your archive transfers.

Platforms

As suggested by some of the examples earlier in this guide, ziptools works equally well on Unix and Windows. It's been recently verified on Windows 7 and 10, Mac OS El Capitan and later, Ubuntu Linux, and Android 7 through 10 (including Nougat and Pie). Nevertheless, a handful of well-known platform idiosyncrasies can impact zip results:

On Unix
ziptools works well on Unix—and, by extension, Linux—with no notable caveats. Many of ziptools' extensions deal with Unix-oriented tools (e.g., symlinks and permissions), whose support is naturally best on their home base. In fact, because the fit is so good, this guide's examples are all run on Mac OS Unix, unless noted otherwise.

Still, Unix inherits a legacy Windows constraint: by spec, zip archives mimic the "local time" modtime scheme of MS-DOS and Windows FAT, instead of using Unix UTC time. This skews times across timezone changes, and can yield different time results from different unzip tools across DST changes. ziptools 1.1 deferred to Python libraries for DST changes and did not address timezones, but 1.2 now accommodates both by saving UTC timestamps to zipfile extra fields. See the earlier coverage of the 1.2 change, and search for "DST" in ziptools.py for more on this topic.
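To make the extra-field idea concrete, here is a minimal sketch that stores a UTC timestamp alongside the usual local-time tuple and reads it back. It assumes the common Info-ZIP "extended timestamp" field (header ID 0x5455), which may or may not match the exact layout ziptools writes; the helper names utc_extra_field and read_utc_extra are hypothetical, and zipmodtimeutc.py remains the authority on what ziptools really does.

```python
import io
import struct
import time
import zipfile

UT_HEADER_ID = 0x5455   # Info-ZIP extended timestamp (assumed here, not confirmed for ziptools)

def utc_extra_field(mtime):
    """Pack a UTC modtime: header id, data size 5, flags=1 (modtime present), 4-byte seconds."""
    return struct.pack('<HHBl', UT_HEADER_ID, 5, 1, int(mtime))

def read_utc_extra(extra):
    """Scan an entry's extra bytes; return the UTC modtime if present, else None."""
    pos = 0
    while pos + 4 <= len(extra):
        header_id, size = struct.unpack_from('<HH', extra, pos)
        data = extra[pos + 4: pos + 4 + size]
        if header_id == UT_HEADER_ID and size >= 5:
            flags, = struct.unpack_from('<B', data, 0)
            if flags & 1:
                return struct.unpack_from('<l', data, 1)[0]
        pos += 4 + size
    return None

# Round trip through an in-memory zipfile
mtime = 1586600000                        # seconds since the epoch, UTC
buf = io.BytesIO()
info = zipfile.ZipInfo('demo.txt', date_time=time.localtime(mtime)[:6])  # legacy local-time tuple
info.extra = utc_extra_field(mtime)       # timezone-neutral copy
with zipfile.ZipFile(buf, 'w') as zf:
    zf.writestr(info, b'hello')
with zipfile.ZipFile(buf) as zf:
    recovered = read_utc_extra(zf.getinfo('demo.txt').extra)
```

An unzip tool that knows this field can pass the recovered value straight to os.utime, regardless of the timezone or DST phase in effect; a tool that ignores it falls back on the local-time tuple, and hence the skew described above.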

A footnote on Linux: although it can largely be lumped in with Unix in all ziptools regards, it differs in one arguably trivial way. As tested, Linux silently fails to set symlink permissions, leaving them at a fixed 0o777. The same happens when a link's permissions are changed with a shell chmod, and it means ziptools can't propagate permissions for symlinks (only) on this platform. Linux also doesn't recognize symlinks created on USB drives by Mac OS, but the latter may be "cheating" with custom formats. For more on the Linux ziptools story, see its Ubuntu demo session.

On Windows
ziptools works well on Windows—and even has Windows-specific support for long paths and command-line argument globbing. Some utility is limited, however, and some tools require extra steps. Unix-style permissions propagation, for example, translates only to Windows read-only flags and is otherwise irrelevant. An item with permissions 0o444 (read-only) on Unix will unzip on Windows as read-only, and return to unzip on Unix as 0o444, but all other Unix permissions will be lost in transit. See this demo for a Windows round trip's before and after.
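For context, zip entries conventionally carry Unix permissions in the upper 16 bits of their external-attributes field. The sketch below uses only the standard zipfile module to store and recover a 0o444 mode; it illustrates the convention and is not ziptools' own code. The helper name stored_mode is hypothetical.

```python
import io
import stat
import zipfile

def stored_mode(info):
    """Return the Unix permission bits recorded for a zip entry."""
    return (info.external_attr >> 16) & 0o7777

buf = io.BytesIO()
info = zipfile.ZipInfo('readonly.txt')
info.external_attr = (stat.S_IFREG | 0o444) << 16    # regular file, read-only for all
with zipfile.ZipFile(buf, 'w') as zf:
    zf.writestr(info, b'data')

with zipfile.ZipFile(buf) as zf:
    mode = stored_mode(zf.getinfo('readonly.txt'))

is_readonly = not (mode & 0o222)     # no write bits set anywhere
```

On Windows, Python's os.chmod can set only the read-only flag, which is why a test like is_readonly is about all that can survive there.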

Moreover, symlinks on Windows work only on NTFS drives, and even then unzipping (and hence creating) symlinks at all requires Windows admin permission. To enable this permission, right-click Command Prompt and select "Run as administrator," or enable Developer Mode in recent Windows 10s. For more background, see section "Symlinks—Copied, not Followed" in Mergeall's User Guide here. In ziptools, symlinks survive zips from and unzips to NTFS drives, but they are lost elsewhere, and symlink permission failures on unzips generate a message and stub file but do not end the run (see the demo).

Some Unix-focused tools also have uneven support on Windows. Symlinks to files and folders work on Windows in ziptools, for instance, but under Python 3.X only, and even then do not propagate modtimes or permissions. Python 2.X doesn't support symlinks on Windows at all, so ziptools does what it can: on unzips, symlinks are replaced with stub files so that the rest of the archive can be extracted; on zips, symlinks are followed because Python 2.X's library does not detect them, resulting in unavoidable data copies. The only remedy for these is Python 3.X.
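As background for the symlink discussion, Unix-style zip tools generally record a symlink as an entry whose content is the link's target path and whose file-type bits mark it as a link. The following sketch, written against the standard zipfile module only, builds and recognizes such an entry; the helper name is_symlink_entry is hypothetical, and ziptools' actual logic lives in its source files.

```python
import io
import stat
import zipfile

def is_symlink_entry(info):
    """True if the entry's stored Unix mode carries the symlink file type."""
    return stat.S_ISLNK(info.external_attr >> 16)

buf = io.BytesIO()

link = zipfile.ZipInfo('docs/latest')
link.create_system = 3                                # 3 means Unix in the zip format
link.external_attr = (stat.S_IFLNK | 0o755) << 16     # symlink type plus permissions

plain = zipfile.ZipInfo('docs/v1.txt')
plain.external_attr = (stat.S_IFREG | 0o644) << 16    # ordinary file

with zipfile.ZipFile(buf, 'w') as zf:
    zf.writestr(link, 'v1.txt')                       # content is the link's target path
    zf.writestr(plain, 'text')

with zipfile.ZipFile(buf) as zf:
    flags = {i.filename: is_symlink_entry(i) for i in zf.infolist()}
    target = zf.read('docs/latest').decode('utf8')
```

An unzip that ignores these bits simply writes a small file holding the target path, which is one way links get lost in other tools.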

On Android
Android is essentially Unix (really, Linux) with proprietary—and even extreme—access constraints: ziptools works on this platform too (e.g., in the Termux and Pydroid 3 apps' command lines), but you must zip and unzip to and from folders accessible to your Python app only. Unlike in Unix and Windows, file storage in Android is a fenced-in resource on unrooted devices. In particular, file writes required for ziptools may be available only in app-specific folders, but the rules have varied across storage mediums and Android releases, are scheduled to change again soon, and are too complex to cover here; visit this page for more on which folders are open to zips.

In terms of specific features, symlink creation normally fails on Android today with an "Operation not permitted" permissions error, due to this platform's emulated filesystem stack. This happens both in ziptools and otherwise; here's the error on unrooted Pie and Nougat. See the searches here and here and the related note here for background on this topic. As on Windows, ziptools on Android replaces such failing symlinks with dummy stub files, so unzips can proceed to extract an archive's non-link items.
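The stub-file fallback can be pictured with a short sketch like the one below. It is a simplified illustration under stated assumptions: the function name link_or_stub and the stub's text are hypothetical, and ziptools' real stubs and messages may differ.

```python
import os
import tempfile

def link_or_stub(target, linkpath, trace=print):
    """Try to create a symlink; on failure, leave a placeholder file and carry on."""
    try:
        os.symlink(target, linkpath)
        return 'link'
    except (OSError, NotImplementedError, AttributeError) as why:
        # e.g., "Operation not permitted" on Android, or missing privilege on Windows
        trace('--symlink not created, writing stub:', linkpath, why)
        with open(linkpath, 'w') as stub:
            stub.write('symlink to: ' + target + '\n')    # hypothetical stub content
        return 'stub'

workdir = tempfile.mkdtemp()
linkpath = os.path.join(workdir, 'alias.txt')
outcome = link_or_stub('real.txt', linkpath)
```

Either way, something exists at the link's path afterward, so the rest of the archive can be extracted.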

For similar filesystem-emulation reasons, Android also silently refuses to support permissions propagation in ziptools or otherwise on unrooted Pie, and throws errors when propagating modtimes and permissions prior to Oreo. Per the same note, the pre-Oreo modtime constraint has been lifted in recent Androids by replacing a FUSE filesystem implementation, but undoubtedly still impacts many users of older devices. In ziptools, errors don't stop unzips on Android, but permissions won't propagate, and modtimes won't propagate on 2016's Nougat and earlier. Study the Nougat and Pie demo sessions to learn more.

Other factors not called out in the foregoing list may also place hurdles in the way of cross-platform content. On unzips, for instance, illegal (non-portable) characters in most filenames are munged to _ on Windows, and cause symlinks to be skipped with error messages in general. In addition, unzipping to drives using some filesystems, including exFAT, may strip archives of their Unix permissions. See the earlier notes for more details on the latter, and always test ziptools in your use case before adopting it broadly.
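As a rough sketch of the filename munging mentioned above, the snippet below replaces characters that Windows disallows in filenames with an underscore. The character set and the function name munge_for_windows are assumptions for illustration; the exact set applied on your platform depends on Python's zipfile and on ziptools' code.

```python
import re

# Characters Windows rejects in file and folder names (assumed set for this sketch);
# the '/' path separator is deliberately left alone.
ILLEGAL_ON_WINDOWS = re.compile(r'[<>:"|?*]')

def munge_for_windows(name):
    """Replace each non-portable character in an archive path with '_'."""
    return ILLEGAL_ON_WINDOWS.sub('_', name)

munged = munge_for_windows('notes: draft?.txt')     # 'notes_ draft_.txt'
```

Because two different names can munge to the same result, it is safest to avoid such characters in content bound for Windows.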

As a policy: both zips and unzips in ziptools skip propagation failures that involve metadata, and continue to process the rest of the archive with a message and minor metadata loss. This includes failures for permissions, modtimes, and symlinks; in the latter case, a stub file or forged link may replace a failed symlink. By contrast, ziptools throws an error and stops the run when a file or folder cannot be zipped or unzipped, because this is a core data loss. In this case, you must address the cause of the failure (e.g., permissions).
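In code, this policy amounts to catching exceptions around metadata calls but not around content writes. The sketch below shows the pattern with hypothetical helpers (apply_metadata and write_item); it is not ziptools' implementation, and its messages are invented.

```python
import os

def apply_metadata(path, mtime=None, mode=None, trace=print):
    """Best-effort metadata propagation: report failures, but never raise."""
    failures = []
    if mtime is not None:
        try:
            os.utime(path, (mtime, mtime))
        except OSError as why:
            failures.append('modtime')
            trace('--modtime not set:', path, why)
    if mode is not None:
        try:
            os.chmod(path, mode)
        except OSError as why:
            failures.append('permissions')
            trace('--permissions not set:', path, why)
    return failures

def write_item(path, data):
    """Content writes are not guarded: any exception here ends the run."""
    with open(path, 'wb') as item:
        item.write(data)
```

A caller would run write_item first and let its errors propagate, then call apply_metadata and continue with the next item whatever it returns.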

Lest this all sound too dire, keep in mind that archives which stick to basic files and folders are immune to most of the interoperability concerns noted here. Where that's impossible, archiving is usually better when Unix-oriented tools like symlinks and permissions stay on Unix (and the nod to Unix compatibility on Windows and Android is recognized as half-hearted at best).

What about iOS?: ziptools probably can also be used on iOS devices from a Python app (e.g., Pythonista, with either its Python 3.X or 2.X), but this hasn't been tested; is subject to iOS's extreme access constraints; and may require either program-call mode, or a shim wrapper to spawn a ziptools command line. iOS storage has also historically been less open on unrooted phones than Android; iOS 13 adds some filesystem functionality in its Files app (including Mac OS-like quickviews and support for zips), but the scope of content is still limited. That said, Android seems bent on imposing similar restrictions, so your mobile mileage may vary.

Pythons

Despite the prior section's critiques, platforms are not the only source of portability limits for ziptools. Because Python's support for some tools can vary by its version number even on the same platform, this section summarizes portability rules from the perspective of Python itself. Being coded in Python and shipped as source, ziptools naturally inherits these same constraints.

Broadly speaking, ziptools' most recent release has been verified to work on Pythons 2.7 through 3.7 and the zipfile modules these Pythons include; later Pythons are expected to run ziptools correctly too unless Python or its zipfile change incompatibly. ziptools also works almost identically on Python 3.X and 2.X, and the two Pythons perform equally well in almost all regards—including support for non-ASCII filenames in 2.X as of 1.1.

Still, Python 3.X holds a slight advantage for archives containing symlinks on Unix, and Python 3.X is fully required to retain symlinks on Windows. In complete detail, Python's current symlink support across platforms is both convoluted and mixed:

On Unix
On Unix (and its kind), Python 3.X can read and write symlinks, and can propagate their permissions and modtimes. Python 2.X can read and write symlinks, and can propagate their permissions but not their modtimes.
On Windows
As noted earlier, Python 3.X can read and write symlinks on Windows, but is unable to propagate either their modtimes or their permissions. Python 2.X cannot read, write, or even detect symlinks on Windows, and hence does not support them there at all. Additionally, only Python 3.2 and later can detect folder-link cycles on Windows, though very few Python 3.0 or 3.1 installs likely see action today.
On Android
Per the prior section, platform prohibitions derail symlinks in Python 3.X; though untested, the same fate likely applies to 2.X.

In sum, if you do not zip symlinks, Python 2.X is as good a choice as 3.X; if you do, 3.X is a better option everywhere. As noted in the prior section, it's also generally advised to maintain your symlinks on Unix, though Python 3.X makes them as portable as possible if they cross platform boundaries. By contrast, permissions may be lost altogether if round tripped between some platforms—in any Python.

It's important to note that, despite all the platform and Python limitations outlined in the last two sections, zipfiles are still both practical and widely useful, and serve as the basis for countless content archives and media packages. In computing, the line between unusable and workable is always up to practice to define.

See Also

For more information on portability:

ziptools.py...
has more info on Python's symlink constraints, including a version/platform support table for developer reference; see its section titled PYTHON SYMLINKS SUPPORT.
py-2.X-3.X-zipoff.txt...
demos ziptools' results under Python 2.X on Unix, and compares them to Python 3.X (spoiler: they're identical, sans symlink modtimes).
py-2.X-fixes.txt...
chronicles ziptools 1.1's fixes for non-ASCII filenames under Python 2.X. Prior to 1.1, zipfiles made under Python 2.X could yield munged non-ASCII filenames in some unzip contexts on both Unix and Windows. 1.1 solved this by forcing filenames to Unicode in creates, to invoke encoding in 2.X's zipfile that is more interoperable (e.g., with 3.X unzips). 1.1 also worked around a related 2.X exception when printing non-ASCII to a pipe (only!).

Versions

This section briefly summarizes the highlights of ziptools versions. Search for "[N.M]" in code files to see changes applied by recent releases (e.g., 1.1 changes are tagged with "[1.1]").

Package dates versus release dates: if the date on your ziptools download package is later than the latest release date listed here, it just means that trivial non-code changes were applied. This includes the inevitable doc-typo fixes, as well as minor doc updates like the sidebars recently inserted here, here, and here. Actual code changes will always trigger a new release number and date.

Future Ideas

There currently are no functionality upgrades scheduled. Given ziptools' recent features growth, though, it might be useful to package its main scripts as frozen executables for major platforms.

By bundling a specific Python 3.X in such an executable, this would remove an entire category of portability issues (see the preceding section). It would also simplify installs, and make ziptools immune to future changes in Python and its zipfile module. The latter is especially a concern, given the UTC-timestamp implementation's tight dependence on the current coding structure of zipfile (see zipmodtimeutc.py).

On the other hand, source code must still be shipped for program-library-mode use; this may not be seamless or possible for command-line scripts on some platforms (e.g., Android); and the results may be inferior in some contexts (e.g., Windows single-file executables may require an SSD to offset slow start-up). For now, install a Python separately as needed, and watch this space for new developments. Of course, a portable GUI may be nice too; scope creep happens...

Version 1.2: April 11, 2020

This release was a quick follow-up to 1.1. It added two major functionality upgrades that came online faster than anticipated, and extended 1.1's platform-specific reach. See folder docetc/1.2-upgrades/ for demos and screenshots of these 1.2 enhancements:

UTC timestamps
Neutralize DST and timezones in modtimes with UTC timestamps (above)
Alternate zip paths
Record items at an alternative path given a -zip@path argument (above)
More Unicode prints
Use 1.1's text munging to avoid aborts in interactive mode on Windows (code)
Android modtime skips
Modtimes can't be changed till Oreo: ignore errors on earlier versions (above)

Version 1.1: April 2, 2020

This release introduced functionality upgrades and fixes for both core and platform-specific support. See folder docetc/1.1-upgrades/ for demos and screenshots of these 1.1 enhancements:

Permissions
Extracts propagate Unix-style permissions for all items on request (above)
Symlinks
Creates save per-link permissions for symlinks, instead of a constant (code)
Auto-globs
The create script expands source-item wildcard patterns on all platforms (above)
Interface
The create and extract scripts have better error detection and reporting (demo)
Statistics
Extracts and creates both return item counts, printed by command scripts (demo)
Symlinks fix
Extracts fix an obscure abort: allow extracting unnested symlinks to . (code)
Python 2.X fixes
2.X zips non-ASCII filenames interoperably, and avoids a print exception (above)
Mac OS exFAT fix
Extracts work around a Mac OS exFAT bug: force folder modtime updates (shot)
Windows fixes
ziptools handles absent symlink support in 2.X, and non-ASCII prints (code)
Android fixes
ziptools handles symlink permission errors in emulated filesystems (code)
HTML docs
This README is HTML instead of plain text, for usability and readability (olde)

Version 1.0: June 12, 2017

The initial version, developed for and released with the Mergeall 3.0 package, but now available and developed separately (a software spin-off of sorts). See folders selftest/, cmdtest/, and moretests/ for demos of 1.0 (and later) usage.


