ziptools — The Bits That Python's zipfile Forgot

Summary:  Portable and powerful tools for zipping and unzipping zipfiles
Version:  1.3, October 28, 2021 (changes)
License:  Provided freely, but with no warranties of any kind
Author:  © M. Lutz (learning-python.com) 2017-2021
Run with:  Python 3.X or 2.X on macOS, Windows, Linux, Android, etc.
Install:  Unzip the download, no third-party libraries are required
Web page:  learning-python.com/ziptools.html

This source-code package wraps Python's zipfile module to simplify common usage, and extends it with crucial extra utility—including support for adding entire folders, propagating modtimes and permissions, archiving symlinks on Unix and Windows, omitting cruft files, using long paths on Windows, and neutralizing the DST and timezone problems of zip modtimes.

In addition, ziptools is dual-mode software, providing:

Both foster flexible and cross-platform content management using zipfiles, above and beyond Python's standard library.

This file is ziptools' first-level user guide; its source-code files here and here provide additional and lower-level details. If you have used ziptools in the past, see Versions at the end of this file for recent updates. Contents of this guide:

Quickstart

This section presents command formats and examples, as a quick introduction and reference. Throughout this guide:

General Command Formats

This package provides both scripts and library calls that create (zip) and extract (unzip) zipfiles. Its scripts are run by Python command lines, and its library calls are run by other programs using a Python module. This section summarizes both usage modes.

In the following command-line formats, items enclosed in [] are optional, and ... means zero or more:

[py] zip-create.py [zipfile source [source...] [-skipcruft] [-atlinks] [-zip@path] [-nocompress]]

[py] zip-extract.py [zipfile [unzipto] [-nofixlinks] [-permissions] [-nomangle]]

[py] zip-list.py [zipfile]

Where:

In program-library mode, most function arguments correspond to command-line switches:

createzipfile(zipname, [addnames],
              storedirs=True, cruftpatts={}, atlinks=False, trace=print, zipat=None, nocompress=False)
                     
extractzipfile(zipname, pathto='.',
               nofixlinks=False, trace=print, permissions=False, nomangle=False)

Command-Line Examples

Run the following commands in a console, script, or supporting IDE:

$ py $Z/zip-create.py myarchive.zip mycontent -skipcruft
      # Store all of folder mycontent in new zipfile myarchive.zip

$ py $Z/zip-extract.py myarchive.zip myunzipdir
      # Extract the contents of myarchive.zip to folder myunzipdir

$ py $Z/zip-extract.py myarchive.zip . -permissions
      # Ditto, but extract to '.' and propagate Unix-style permissions 

$ py $Z/zip-create.py ../myarchive.zip * -skipcruft
      # Store folder contents as top-level items, on Unix or Windows 

$ py $Z/zip-create.py mysourcecode.zip page.js gui.pyw *.py *.c
      # Store individual files as top-level items, on Unix or Windows 

$ py $Z/zip-create.py mysourcecode.zip src/latest/*.py -zip@.
      # Store files from a nested folder as top-level items, using -zip@path

$ py $Z/zip-extract.py mysourcecode.zip code -nomangle
      # Extract to a folder, no "_" replacements (run fixer script) 

$ py $Z/zip-list.py myarchive.zip
      # List the contents of zipfile myarchive.zip

$ py $Z/zip-create.py
$ py $Z/zip-extract.py
      # Interactive mode: input parameters at prompts (see ahead)

Program-Library Examples

Run the following Python code samples in a script or at the >>> interactive prompt; they correspond to command lines above:

$ export PYTHONPATH=$Z:$PYTHONPATH
$ py
>>> import ziptools, glob
>>> cruftdflt = ziptools.cruft_skip_keep

>>> ziptools.createzipfile('myarchive.zip', ['mycontent'], cruftpatts=cruftdflt)

>>> ziptools.extractzipfile('myarchive.zip', pathto='myunzipdir')

>>> ziptools.extractzipfile('myarchive.zip', pathto='.', permissions=True)

>>> ziptools.createzipfile('../myarchive.zip', glob.glob('*'), cruftpatts=cruftdflt)

Related-Tool Examples

You may find some Mergeall scripts, available here, useful for verifying this package's results. In the following example command lines, $C is your source-code install folder:

$ py $C/mergeall/mergeall.py mycontent myunzipdir/mycontent -report -skipcruft
      # Verify result modtimes using Mergeall

$ py $C/mergeall/diffall.py  mycontent myunzipdir/mycontent -skipcruft
      # Verify result content using Mergeall

Features

This section explains why you may want to use ziptools, and describes its features. In brief, Python's standard-library zipfile module does great low-level work, but the ziptools package adds both much-needed features and higher-level access points, and documents some largely undocumented dark corners of Python's zip support along the way. Among its augmentations, ziptools:

  1. Adds entire folder trees to zipfiles automatically
  2. Propagates original modtimes for files, folders, and links
  3. Can either include or skip system "cruft" files on request
  4. Propagates symlinks to files and folders on Unix and Windows
  5. Supports long paths to items on all Windows in all Pythons
  6. Propagates Unix file-access permissions for all items on request
  7. Fixes DST and timezone skew in modtimes with UTC timestamps
  8. Makes nonportable-filename mods explicit, optional, and avoidable

Although Python's shutil module has basic zipfile wrappers that add folders (make_archive) and extract all items (unpack_archive), they don't do anything about modtimes, permissions, symlinks, system cruft, or long Windows paths supported here, and don't address the limits of zip's local times or call out the perils of name mangling in Python's zipfile module. The next sections describe how ziptools does better on all these fronts.

Folder Trees

With ziptools, folder trees are added to zipfiles as a whole automatically by extra code in the package, a sorely missed feature of Python's standard modules. Folder trees are also automatically extracted from zipfiles in full, and no user steps are required to enable folder-tree zips and unzips.

For folder-tree implementation details, see the create functions in ziptools.py.
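As a rough illustration of the idea, and not ziptools' own code, a folder tree can be added with the standard library by walking it and writing each folder and file. The function name add_folder_tree is hypothetical, and storing names relative to the folder's parent is a simplifying assumption made here; ziptools records paths as given on the command line.

```python
import os
import zipfile

def add_folder_tree(zipobj, folder):
    """Add folder and everything below it to an open ZipFile (a sketch)."""
    # Assumption: store names relative to the folder's parent directory
    parent = os.path.dirname(os.path.abspath(folder))
    for dirpath, dirnames, filenames in os.walk(folder):
        # Write the folder itself, so that empty folders are recorded too
        zipobj.write(dirpath, os.path.relpath(dirpath, parent))
        for name in filenames:
            filepath = os.path.join(dirpath, name)
            zipobj.write(filepath, os.path.relpath(filepath, parent))
```

This sketch ignores symlinks, cruft, and long Windows paths, all of which ziptools handles as described in the following sections.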

Modtime Propagation

With ziptools, latest-modification times (a.k.a. modtimes) are automatically propagated to and from zipfiles for all items: files, folders, and symlinks. This is another glaring omission in Python's standard module, which stores modtimes in zips but ignores them on unzips. Modtimes are crucial when unzipped results are used with tools that rely on file-modification timestamps, including the Mergeall program's incremental backups and most source-control systems' change detection.

No user actions are necessary to enable modtime propagation. As of 1.2, this feature is more useful still: the UTC timestamps described ahead neutralize the limits inherent in the zip standard's local time. For the implementation of basic modtime propagation, see the extract function in ziptools.py.
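The basic local-time form of this idea can be sketched with the standard library alone: after a normal extract, each item's stored date_time tuple is converted back to a timestamp and applied with os.utime. This is an illustration under stated assumptions, not the ziptools extract function; the name extract_with_modtimes is hypothetical, and the sketch assumes extracted paths match the names stored in the zipfile.

```python
import os
import time
import zipfile

def extract_with_modtimes(zippath, todir):
    """Extract all items, then reapply the modtimes stored in the zipfile."""
    with zipfile.ZipFile(zippath) as z:
        z.extractall(todir)
        for info in z.infolist():
            path = os.path.join(todir, info.filename)
            # date_time is a local-time tuple; -1 lets mktime decide on DST
            modtime = time.mktime(info.date_time + (0, 0, -1))
            os.utime(path, (modtime, modtime))
```

Because zip's main timestamp field has a two-second resolution, restored modtimes may differ from originals by one second in this scheme.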

Cruft-File Skipping

With ziptools, platform-specific metadata items (a.k.a. cruft) can be automatically omitted from cross-platform zipfile archives. When enabled in creates, this ensures that these items won't wind up as clutter in contexts where they serve no purpose.

This feature is optional and configurable, and works for files, folders, and symlinks. To omit cruft items in zips, use the -skipcruft command-line switch or corresponding function argument. As shipped, this avoids propagating .DS_Store Finder files and ._* Apple-double resource forks on macOS; Desktop.ini files on Windows; .Trash* and other recycle bins; and most other .* files hidden per Unix convention which don't belong in zipped content.

For more control, you can also define what to skip in your use case. Cruft is identified with skip and keep filename patterns—either a custom set, or a provided general-purpose default used automatically by the command-line scripts. See zip-create.py and ziptools.py for more details, and zipcruft.py for the default-but-changeable cruft patterns shipped.
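To make the skip-and-keep idea concrete, here is a minimal sketch using the standard fnmatch module. The dictionary layout, the pattern lists, and the function name is_cruft are all assumptions for illustration; consult zipcruft.py for the patterns and structure actually shipped.

```python
import fnmatch

# Hypothetical patterns for illustration only (not the shipped defaults)
patterns = {'skip': ['.*', 'Desktop.ini'],
            'keep': ['.htaccess']}

def is_cruft(filename, patts):
    """True if filename matches any skip pattern and no keep pattern."""
    skip = any(fnmatch.fnmatchcase(filename, p) for p in patts['skip'])
    keep = any(fnmatch.fnmatchcase(filename, p) for p in patts['keep'])
    return skip and not keep

print(is_cruft('.DS_Store', patterns))    # True
print(is_cruft('.htaccess', patterns))    # False
```

Keep patterns act as exceptions to skip patterns, so a broad rule like .* can still spare hidden files that belong in your content.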

Symlinks Propagation

With ziptools, symbolic links (a.k.a. symlinks) to both files and folders are supported and propagated on both Unix and Windows. By default, symlinks are always copied verbatim to and from zipfiles, and their path separators are made portable along the way. No user action is required to enable this behavior.

Optionally, clients may also elect to instead copy the items which links reference with -atlinks and can opt out of link portability transforms with -nofixlinks (or their corresponding function arguments). In more detail:

Subtly, /-to-\ adjustment is required when unzipping symlinks on Windows, even though most Windows APIs allow the two to be used interchangeably. On Windows 11 in Python 3.X:

>>> import os
>>> os.symlink(r'temp\afile.txt', 'alink-n-w')     # Windows path, like mklink
>>> os.symlink(r'temp/afile.txt', 'alink-n-u')     # Unix path (e.g., zipped)

>>> os.readlink('alink-n-w')                       # read link path, \\ is \
'temp\\afile.txt'
>>> os.readlink('alink-n-u')                       # Unix path created okay
'temp/afile.txt'

>>> open('alink-n-w').read()                       # follow link path
'hello nested'
>>> open('alink-n-u').read()                       # Unix path fails (Explorer too)
OSError: [Errno 22] Invalid argument: 'alink-n-u'

Hence, all path-separators in symlink paths are by default converted to the unzip platform's flavor for portability. This generally makes symlinks portable, though link paths must also exist on the unzip host (e.g., absolute paths may not work), and platform-specific syntax must be recognized on the unzip target (e.g., Windows drive letters will fail on Unix).
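The separator conversion just described can be sketched in a few lines. The function name fix_link_separators and its sep parameter are hypothetical, used here so the behavior can be tried on any platform; see zipsymlinks.py for what ziptools really does.

```python
import os

def fix_link_separators(linkpath, sep=os.sep):
    """Convert both / and \\ in a link path to the given separator."""
    return linkpath.replace('/', sep).replace('\\', sep)

print(fix_link_separators('temp/afile.txt', sep='\\'))   # temp\afile.txt
print(fix_link_separators('temp\\afile.txt', sep='/'))   # temp/afile.txt
```

With the default sep, a link path zipped on Unix becomes a backslash path on Windows, which avoids the open failure shown in the session above.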

Scope: symlinks that mind these rules do survive round trips between Unix and Windows, but their support is uneven across platforms and Pythons. Windows symlinks, for example, do not retain their original modtimes on unzips in 3.X, and their permissions are largely moot. Moreover, unzips require admin permissions to create symlinks on Windows (until Windows 10/11); Android may not support symlink creation at all, and leave stub files on unzips; and Python's symlink support varies by version. See Portability ahead for more fine-grained details on symlink interoperability.

For a demo of ziptools' symlinks support, see the console log here. For symlink implementation details, see zipsymlinks.py. See also zip-create.py for more on -atlinks; zip-extract.py for more on -nofixlinks; and ziptools.py for more on both usage options.

Long Paths on Windows

With ziptools, filesystem pathnames of all items on Windows are allowed to exceed their normal 260/248-character length limit on that platform. This is accomplished by automatically prefixing paths with \\?\ whenever they are passed to tools in the underlying Python zipfile module, and other file-related interfaces.

No user action is required for this feature. On all versions of Windows, it supports files, folders, and symlinks at long Windows paths both when adding to and extracting from zip archives. Among other things, this is useful for unzipping and rezipping long-path items originally zipped on more-flexible Unix hosts, where this feature is unneeded and unused.

For implementation details, see ziplongpaths.py.
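For readers curious about the prefixing technique, the following is a minimal sketch, not the code in ziplongpaths.py. It assumes its input is already an absolute Windows path, and uses ntpath so it can be run on any platform; the function name add_long_prefix is hypothetical. UNC paths take the special \\?\UNC\ form of the prefix.

```python
import ntpath

def add_long_prefix(abspath):
    """Return an absolute Windows path with the extended-length prefix."""
    path = ntpath.normpath(abspath)          # the prefix disables / handling
    if path.startswith('\\\\?\\'):
        return path                          # already prefixed
    if path.startswith('\\\\'):
        return '\\\\?\\UNC\\' + path[2:]     # \\server\share\... network path
    return '\\\\?\\' + path                  # C:\... drive path

print(add_long_prefix(r'C:\temp\a\..\b.txt'))   # \\?\C:\temp\b.txt
```

Paths are normalized first because Windows skips its usual handling of forward slashes and relative parts once the prefix is present.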

Windows path-limit install option in Python 3.6+: on recent Windows 10 systems, the standard python.org installers for Python 3.6 and later include an option to automatically remove the Windows platform's former path-length limit. ziptools, and its Mergeall cousin, instead use the manual but more-inclusive technique described here to lift the limit for users of all Pythons and all Windows. While useful, the Python 3.6+ enhancement is optional and easy to miss; won't apply to frozen executables; and doesn't help those using Windows 7 and 8, or Pythons 2.X through 3.5—still-substantial audiences all. For more details, see this online usage note.

Permissions Propagation (New in 1.1)

With ziptools, Unix-style permissions for all items—files, folders, and symlinks—can be propagated both to and from zipfiles, and hence survive a zip and unzip. Specifically:

Due to interoperability issues, this new extract option should generally be used only when unzipping from zipfiles known to have originated on Unix, and when unzipping back to Unix. Most use cases that require permissions to survive trips to/from zips probably satisfy this rule, but propagation isn't enabled by default because origin cannot reliably be determined. Propagation is generally harmless where Unix-style permissions are not supported: it doesn't abort ziptools zips or unzips, though results may lose permissions per the following note.

Scope: not all filesystems and platforms support Unix-style permissions. On exFAT, for instance, permission updates silently change nothing, and the extract option has no effect. It's okay to copy a zipfile to and from an exFAT drive as a whole, but don't unzip on exFAT if you care about retaining Unix permissions. Unix-style permissions also don't fully translate to Windows, and may not work at all on Android. See Portability ahead for full details, and don't send permissions on round trips to platforms that don't support them.

For a demo of 1.1 permissions propagation, see the screenshot here and the console logs here and here. For implementation details, see ziptools.py and zipsymlinks.py.
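As background on the general technique, Python's zipfile records a file's Unix mode in the upper 16 bits of each entry's external_attr when zipping, but does not reapply it on extracts. The sketch below shows one way to read and reapply those bits after an extract. It is an illustration only, with hypothetical function names, and is not the code in ziptools.py.

```python
import os
import stat
import zipfile

def stored_permissions(info):
    """Return the Unix permission bits recorded for a ZipInfo, or None."""
    mode = info.external_attr >> 16       # Unix st_mode is in the high 16 bits
    return stat.S_IMODE(mode) if mode else None

def apply_permissions(zippath, todir):
    """After an extract to todir, reapply permissions stored in the zipfile."""
    with zipfile.ZipFile(zippath) as z:
        for info in z.infolist():
            perms = stored_permissions(info)
            if perms is not None:
                os.chmod(os.path.join(todir, info.filename), perms)
```

Entries written by tools that do not record Unix modes have zero in these bits, which is one reason origin matters when deciding whether to propagate.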

UTC Timestamps (New in 1.2)

With ziptools, the modtimes of files, folders, and symlinks are automatically made immune to changes in both timezone and DST (Daylight Saving Time). This is achieved by storing UTC (a.k.a. GMT) timestamps in one of the "extra fields" defined by the zipfile standard. This feature is always applied, and no user action is required to enable it. In more detail:

This is a full fix to zip's local-time issues: because UTC timestamps are relative to a fixed point, they are agnostic to both timezone and DST changes. By contrast, local time is scaled for both, and may be better used for display. The zip standard's use of local time for file storage, however, makes it difficult to adjust for DST, and impossible to adjust for timezones—especially given zip's lack of timezone information (if you fly three timezones away, zip local times won't change with you). UTC is the only way to keep zipped modtimes comparable in all contexts with the original data.
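The timezone problem can be worked through with fixed-offset zones from the standard datetime module. The two zones and the date here are arbitrary assumptions chosen for the arithmetic; the point is only that wall-clock fields stored without a zone are reinterpreted wherever they are unzipped, while a UTC timestamp is not.

```python
from datetime import datetime, timedelta, timezone

zip_zone = timezone(timedelta(hours=-5))      # where the file was zipped
unzip_zone = timezone(timedelta(hours=-8))    # three timezones to the west

# The file's true modtime, as a fixed point in time
true_modtime = datetime(2021, 10, 28, 17, 0, 0, tzinfo=timezone.utc)

# Local-time scheme: only wall-clock fields are stored, with no zone attached
stored_local = true_modtime.astimezone(zip_zone).replace(tzinfo=None)
restored_local = stored_local.replace(tzinfo=unzip_zone)
print(restored_local - true_modtime)          # 3:00:00

# UTC scheme: epoch seconds are stored, and mean the same thing everywhere
stored_utc = true_modtime.timestamp()
restored_utc = datetime.fromtimestamp(stored_utc, unzip_zone)
print(restored_utc - true_modtime)            # 0:00:00
```

The three-hour skew in the local-time scheme is enough to make every unzipped file look changed to a tool that compares modtimes.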

Prior to 1.2, ziptools used the zipfile's main local time, and deferred to Python's library tools here and here to translate UTC time to and from local time and accommodate DST changes. That scheme's modtime results could vary from those of some other zip tools after DST changes, however, and did nothing about timezone changes. The new UTC timestamp scheme in 1.2 eliminates both DST and timezone modtime skew in a single step.

Interoperability: not all zip tools use or record the extra field added by ziptools 1.2, but the absence of support is harmless. When processing zipfiles with the new field, tools that don't recognize it will simply skip it, per zip standard. When processing zipfiles without the new field, ziptools will fall back on its original and subpar local-time scheme. More positively, ziptools will use UTC-timestamp modtimes recorded by other tools in the same fashion as its own.

For a demo of 1.2 UTC timestamps in action, see the console log here, and the screenshot here. For implementation details, see zipmodtimeutc.py, and its ziptools.py and zipsymlinks.py clients.
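This guide does not name the extra field used, so the following sketch assumes the widely used Info-ZIP "extended timestamp" field (header ID 0x5455) as an example of how a UTC modtime can ride along in a zip entry's extra data. The function names are hypothetical, and only the modtime part of the field is handled; see zipmodtimeutc.py for ziptools' real code.

```python
import struct

EXT_TIMESTAMP_ID = 0x5455     # assumed: Info-ZIP extended timestamp ("UT")
MODTIME_PRESENT = 0x01        # flags bit: a modification time follows

def pack_utc_modtime(modtime):
    """Build an extra-field record holding modtime as UTC epoch seconds."""
    body = struct.pack('<Bi', MODTIME_PRESENT, int(modtime))
    return struct.pack('<HH', EXT_TIMESTAMP_ID, len(body)) + body

def unpack_utc_modtime(extra):
    """Return the UTC modtime stored in extra-field bytes, or None if absent."""
    pos = 0
    while pos + 4 <= len(extra):
        fieldid, size = struct.unpack_from('<HH', extra, pos)
        data = extra[pos + 4: pos + 4 + size]
        if (fieldid == EXT_TIMESTAMP_ID and len(data) >= 5
                and data[0] & MODTIME_PRESENT):
            return struct.unpack_from('<i', data, 1)[0]
        pos += 4 + size       # skip unrecognized fields, per the zip standard
    return None
```

In Python's zipfile, such bytes can be assigned to a ZipInfo object's extra attribute before writing, and read from it after opening; a None result from the unpacker corresponds to the local-time fallback described above.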

Nonportable Filenames (New in 1.3)

Filenames containing any of the characters / \ | < > ? * : " are nonportable, and cannot be saved on some filesystems. In ziptools, unzips (extracts) by default automatically change names of files, folders, and symlinks as needed so they can be saved. These changes, known as mangling, replace all nonportable filename characters with _ on unzip failures, and retry the failed save. As of version 1.3, ziptools mangling is explicit and optional: mangles are reported, and -nomangle disables its changes altogether.
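A minimal sketch of the replacement itself, using the character set listed above, follows. The name mangle is hypothetical, and the function is meant for a single filename rather than a full path, since / and \ are replaced too.

```python
import re

# The nonportable characters listed above:  / \ | < > ? * : "
NONPORTABLE = re.compile(r'[/\\|<>?*:"]')

def mangle(filename):
    """Replace each nonportable character in one filename with '_'."""
    return NONPORTABLE.sub('_', filename)

print(mangle('what?.txt'))                  # what_.txt
print(mangle('a?b.txt') == mangle('a*b.txt'))   # True: two names collide
```

The second result hints at why mangling is risky: distinct names can map to the same mangled name, so one item may overwrite another.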

Automatic filename mangling is done only on Windows, and is run for all filesystems on Windows because these rules are imposed across the platform. Mangling is not attempted for save failures on Android, or external drives written on other platforms. Android 11 shared storage emulates Windows' rules, but has a bug discussed ahead which precludes mangling. FAT32 and exFAT drives impose the same rules too, but Linux writes to such drives are allowed to fail with messages and skips to avoid perpetual diffs, and macOS silently munges nonportables to/from Unicode privates on such drives as noted ahead.

While auto-mangling may be helpful in some contexts, it can also create multiple problems later in content-management life cycles. Users are advised to instead run a new provided fixer script, fix-nonportable-filenames.py, to analyze and replace nonportable filename characters, before propagating content from Unix (e.g., macOS and Linux) to Windows, Android 11 shared storage, and external drives using filesystems which restrict filenames.

This fixer script sidesteps multiple interoperability traps. By fixing filenames at their source, it avoids potential overwrites and back-sync errors inherent in mangling; prevents recurring diffs on Linux and later network-sync errors on macOS; and can be useful for content transferred by ziptools unzips, Mergeall syncs, and other techniques (e.g., explorer copies to FAT32, exFAT, and BDR drives).

Run this script from a command line on Unix to display or repair all nonportable names in an entire folder tree:

$ python3 fix-nonportable-filenames.py folderrootpath 1    # list items only
$ python3 fix-nonportable-filenames.py folderrootpath      # list and rename items

See the script's in-file docs for more details. If you choose to let auto-mangling happen anyhow, the next two sections provide more details about ziptools' policies for it, and explain why it's not performed on the similarly constrained Android.

Mangling Transparency

Prior to version 1.3, filename mangling was performed in ziptools by Python's underlying zipfile module for all filenames, and on Windows alone. The filename changes made by that module are silent, and have the potential to damage content in two ways:

Given these perils, filename mangling should never be mandatory and silent. To do better, ziptools 1.3 wrests control of mangling away from the Python module, and reimplements it to provide better transparency. It:

The first two of these require forcibly disabling zipfile mangling, catching save errors, manually mangling, and rerunning extracts, but this allows both user disables and reporting. Check the output after an unzip for details about any filename changes made; mangles are labeled as "--Name mangled" and skip messages as "**SKIP". Tallies for mangles and skips are also displayed at the end for runs in which they are nonzero.
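The catch, mangle, and retry flow can be sketched as follows. This is an illustration of the control flow only: the function name, the saver callable, and the message text after each label are assumptions, and ziptools' real logic lives in ziptools.py.

```python
def save_with_mangling(name, saver, nomangle=False, trace=print):
    """
    Try saver(name); on failure, retry once with a mangled name unless
    mangling is disabled. Returns 'saved', 'mangled', or 'skipped'.
    """
    try:
        saver(name)
        return 'saved'
    except OSError:
        if not nomangle:
            mangled = ''.join('_' if c in '/\\|<>?*:"' else c for c in name)
            if mangled != name:
                try:
                    saver(mangled)
                    trace('--Name mangled: %s => %s' % (name, mangled))
                    return 'mangled'
                except OSError:
                    pass                     # fall through to the skip report
        trace('**SKIP: %s' % name)
        return 'skipped'
```

Counting the returned strings for a run, for example with collections.Counter, yields the sort of mangle and skip tallies shown at the end of an unzip.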

The fixer script is generally recommended over ziptools' auto-mangling as noted, but auto-mangling is enabled by default because its issues are rare. Indeed, nonportable filenames themselves are generally atypical; they may crop up most for trivial web-page saves on Unix, and are best avoided as a rule. ziptools policies provide a safer alternative when these files creep into your content anyhow.

It's worth adding that the native unzip extractor run by Windows 10's own file explorer simply discards files with nonportable names silently. This is roughly the same as ziptools' -nomangle option, though ziptools also reports and tallies the skips, instead of quietly dropping your content.

Android Shared Storage

Beyond its lack of transparency, Python's underlying zipfile module also mangles on Windows only. This is insufficient: shared storage on some versions of Android, for example, is implemented with a filesystem driver that emulates FAT32, and disallows nonportable characters similarly. In principle, mangling should be applied on this platform too, to enable unzip saves.

In fact, Android, being Android, implements filename rules which vary by storage type, version, and perhaps even vendor. Two Android 11 devices tested disallow Windows illegal characters in shared storage, because it uses a FUSE driver; but an Android 10 device does not, because its shared storage uses SDCardFS. Moreover, no tested device disallows Windows illegals in app-specific or app-private storage, which generally use Linux's ext4 filesystem. On Android 11 in Termux:

$ cd /sdcard                                # shared storage (fails on 11, not 10)
$ echo xxx > test\?.txt
bash: test?.txt: Operation not permitted
$ echo xxx > test\|.txt
bash: test|.txt: Operation not permitted

$ cd /sdcard/Android/data/com.termux        # app-specific storage
$ echo xxx > test\?.txt
$ echo xxx > test\|.txt
 
$ cd /data/data/com.termux                  # app-private storage
$ echo xxx > test\?.txt
$ echo xxx > test\|.txt

To handle such convolution and be universal, mangling might be attempted on all platforms when extracts fail. Supporting Android shared storage this way was explored as part of this update, until it encountered a showstopper bug which rules out mangling on the platform altogether.

In short, there's nothing that ziptools can do for this use case because Android 11 is broken: its FUSE-based shared storage disallows nonportable characters in filenames as it should, but, unlike Windows, silently allows them in folder names, leaving crippled folders. A nonportable folder path will be silently created, but no files can be stored inside it afterwards, even if they are fully legal. Because no error is raised for such folders' creation, there is nothing for ziptools to catch and respond to. In Termux on Android 11 again:

$ cd /sdcard
$ echo spam > file\|name.txt                     # illegal files fail in shared
bash: file|name.txt: Operation not permitted
$
$ mkdir dir\|name                                # but illegal folders work
$ cd dir\|name
$ pwd
/sdcard/dir|name
$
$ echo spam > file\|name.txt                     # but they don't accept files
bash: file|name.txt: Operation not permitted
$
$ echo spam > filename.txt                       # even if they are legal: bug!
bash: filename.txt: Operation not permitted
$
$ mkdir sub\|name                                # but broken subfolders are okay...
$ ls
'sub|name'

This isn't a problem on Windows, because invalid pathnames cannot be created, and raise errors as expected. Because Android 11 silently creates invalid pathnames, however, mangling must be limited to Windows alone. Else, mangling an item's path on Android would create a doppelgänger path, which parallels the nonportable and useless path made by Android. Moreover, it would be dangerous to simply remove or rename the bogus path built by Android, because it may have been created outside the scope of an unzip, and hence used arbitrarily.

To do better, ziptools would have to know if filename mangling is required before any folders are created. The third-party psutil library offers filesystem queries that might help on this front. Unfortunately, it does not support Android, per both documentation and testing: on unrooted Androids 10 and 11, the latest psutil 5.8 installed with pip in Termux yielded a permission-error exception on first import, for a failed access to /proc/stat. Later calls for disk-partitions info similarly failed on /proc/filesystems. This library is clearly unusable on Android today.

Even if psutil worked, however, it's a long shot: identifying Android version and filesystem combinations that require mangling seems nearly nondeterministic, and doing so ahead of time for Android would require duplicating zipfile's extract method in full, because it hardcodes a test for Windows alone. This is a mod too far—especially given that name mangling is on thin data-integrity ice in the first place.

Hence, failing nonportable filenames on Android are simply skipped with messages and tallies. For ziptools users hoping to unzip Unix content on Android, this leaves three work-arounds:

Change filenames in content before it is propagated
You can still use Android 11 shared storage, as long as your filenames are portable. The included fixer script, borrowed from the Mergeall system, both reports and changes nonportable characters, and can be run before transfers by users concerned about data-loss potential. Use this to both analyze your content, and weed out any filename interoperability issues before they can cause cross-platform nightmares.
Use app-specific or app-private storage on Android instead
Android's app-specific and app-private storage use Linux filesystems, and so support most filename characters without mangling. They are also generally much faster than shared storage in Android 11. Regrettably, these storage categories also have restricted access, and content in both is by default automatically deleted on app uninstall; using either for general storage is limited and perilous.
Accept a minor data loss of nonportably named content
Nonportable filename characters often arise in web-page clippings saved by Unix browsers. Especially if Android devices are used in read-only mode, loss of such content on transfers may be an acceptable sacrifice. To see what may be lost, run the fixer script's list-only mode on folders in your content tree before copies; some intended uses of nonportable characters may require explicit handling, not automatic mangling.

Of these, the first is again recommended. Before transferring content from Unix to Windows, Android shared storage, or other similarly limited contexts, run this script to sanitize it of characters that pose an interoperability hurdle. By doing this at content's point of origin, it both avoids ziptools skips on Android, and sidesteps potential overwrites and later back-sync issues inherent in ziptools' auto-mangling on Windows.

Bonus: running the fixer script also avoids similar issues in other contexts. Linux, for example, reports errors and refuses to copy nonportable filenames to Windows-filesystem drives (e.g., exFAT) in both file explorers and command lines. Worse, macOS breaks syncs by silently mangling nonportable filename characters to and from Unicode private characters (instead of "_") on FAT32 and exFAT drives, which fails when files are used outside macOS's scope; read the sordid details here. By fixing filenames at their source, you remove such variables—and unpleasant surprises—from your content copies.

Mangling Summary

To summarize ziptools' policies for nonportable filenames by platform:

When unzipping on Windows
ziptools mangles unless the -nomangle option is used, so files can be saved. If mangling is used, mangles are reported and tallied in program output. If mangling is not used, nonportable names will fail to extract, and be skipped with a message and tally in program output. Despite auto-mangling support, running the fixer script is still recommended before unzipping on Windows, to avoid the potential perils of automatic name changes.
When unzipping on Android
ziptools does not mangle, even for shared storage, for the reasons outlined in the preceding section. Because of this, running the fixer script is recommended before unzipping on Android shared storage, to avoid skips. If the fixer script is not used, nonportable names will fail to extract in some versions' shared storage, and be skipped with a message and tally in program output.
When unzipping on Unix
ziptools does not mangle, because filenames are much more liberal. Content lucky enough to live only on Unix generally doesn't need to care about any of this interoperability drama. If, however, content on Linux or macOS must ever be copied to Windows, Android, or most external drives, running the fixer script first can avoid later headaches along the content-maintenance road.

In short, ziptools reports filename changes, and performs them only on Windows. Thus, its -nomangle is relevant only on Windows, and is irrelevant even there if you run the fixer script, because there will be no names to mangle. Takeaway: except when unzipping on Unix or safe Android storage, using the script always maximizes content integrity, and should be the opening act of most cross-platform plays.

For More Mangling Details

There's more to the mangling story elsewhere:

And for demos of the sorts of filenames disallowed and mangled on Windows and Android, see also the results here and here. As these show, filename rules vary slightly between Windows and Android; Android's shared storage limits names but its app-specific storage does not; Windows limits folder names but Android shared storage does not; and exceptions raised on errors differ between the two platforms. Alas, filename interoperability seems a pipe dream, and perhaps not accidentally so.

Etcetera

Beyond the preceding sections' features, the ziptools package also provides command-line zip and unzip programs that work portably on macOS, Windows, Linux, Android, and more; runs all its code on either Python 3.X or 2.X; and comes with complete and open-source Python code that you can use, audit, and adapt.

Again, see the main zip-create.py and zip-extract.py command-line scripts for more usage details omitted here, and the ziptools.py module they use for lower-level implementation details. For an example client of ziptools command lines, see the README and scripts of this package.

Usage

This section provides expanded usage examples to demo typical ziptools roles, and documents additional usage details along the way. For more examples external to this guide, the test-case folders:

all capture usage and runs, and each script and module in this package includes in-depth documentation strings with usage details omitted here for space. In addition, the documentation folders:

demonstrate the new features in versions 1.1 and 1.2; the screenshots here and here capture basic usage; and Mergeall's test folder illustrates its related symlink support.

General Concepts

Before we get into command details, this section presents ziptools usage pointers that span modes:

Output
Both ziptools creates and extracts print one or two lines per item to standard output. The first line gives the item's source name: the file for creates, and the zipfile entry for extracts. The second line gives the destination following a =>, and appears only if the item's destination differs from its source. In creates, a second line is printed if the -zip@path switch described ahead is used to rename source folders in the zipfile. In extracts, a second line is displayed if the item's target path differs from its source in the zipfile, after mapping Windows \ to zip's /; this occurs when extracting to a folder other than '.' (the current directory). In all cases, run output may be routed with shell | and >, and function-call trace arguments.

ziptools 1.3 changed extracts to print one or two lines, as described above. Extracts formerly printed two lines for every item for clarity, but 1.3 drops the second line when the source and target paths are the same—a common case when extracting to the current directory. In this case, the former target line is redundant; omitting it reduces extract output enormously; and the resulting behavior mimics creates. In all other cases, extracts still print two lines as before, to make source and target explicit.
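The one-or-two-line rule can be sketched as a small function that takes a trace callable like the one accepted by the library calls. The function name report_item and the exact text around => are assumptions; only the rule itself comes from the description above.

```python
def report_item(source, dest, trace=print):
    """Trace an item's source, plus its destination only when it differs."""
    trace(source)
    if dest.replace('\\', '/') != source:    # compare after mapping \ to /
        trace('=> ' + dest)

report_item('docs/a.txt', 'docs\\a.txt')     # one line: same path
report_item('docs/a.txt', 'out/docs/a.txt')  # two lines: different target
```

Passing a different trace, such as a list's append method, collects the lines instead of printing them.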

Portability
All code in this package works under both Python 3.X and 2.X, and on both Unix and Windows, though some advanced tools (e.g., symlinks and permissions) are platform or Python-version dependent. ziptools' zipfiles are generally interoperable with other zip tools, and vice versa, though some features may be supported unevenly elsewhere. When in doubt, use ziptools for both zips and unzips, and see Portability ahead for complete interoperability details.
Pathnames
Items added to zip archives are recorded with their relative or absolute paths given, less any Windows drive and UNC parts, and most .. relative-path syntax. Items are later unzipped to these recorded paths relative to an unzip target folder. Leading ..s may be recorded by zips, but are stripped by unzips. Tip: to reduce the length of paths saved and restored, cd to source folders before zipping, or see the next item on this list for a 1.2 alternative.

Windows backslashes: per the zip standard, ziptools creates (zips) always use Unix / for path separators in zipfiles on all platforms, including Windows. This is ensured by the underlying zipfile module, which also replaces / with \ in paths and illegal characters with _ in filenames when extracting on Windows (the latter by default; see ziptools 1.3's mods ahead). While some Windows zip tools may erroneously record path separators as \, there is no way for ziptools extracts (unzips) to recognize these as such on Unix, because \ is also a valid filename character on Unix.

Hence, in ziptools:

  • When extracting on Unix, any Windows \ in zipfile paths are taken as part of a filename, and will not generate folders—even if they were meant to do so on Windows
  • Conversely, when extracting on Windows, all \ are taken as path separators and will generate folders—even if they were not meant to do so on Unix
The Windows extract behavior is a result of code in both ziptools itself and the underlying Python zipfile module. The Python module's behavior is a known issue, but ziptools' own code must be consistent with it because it can occur even when ziptools' code is not run (in brief, when ziptools' mangling is not required).

For finer details, search for "More on Backslashes" in ziptools.py, or try the web. If backslashes crop up in your content's filenames on Unix, your best option is to sidestep issues in full by running the fixer script before propagating to Windows with ziptools; this script changes backslashes to underscores before they can trigger unintended folders. And if backslashes show up in paths zipped on Windows, consider using another Windows zip tool that isn't an affront to both standards and interoperability.
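
The platform difference described above can be seen with Python's standard path modules, both of which import on any platform. This is only a sketch of the underlying rules, not ziptools' own code; the underscore_backslashes helper is a hypothetical stand-in for the fixer script's renaming rule.

```python
import ntpath       # Windows path rules, importable on any platform
import posixpath    # Unix path rules, importable on any platform

name = 'docs\\notes.txt'      # one zipfile entry containing a backslash

# Unix rules: the backslash is an ordinary filename character, so the
# whole string is a single filename and no folder is implied.
unix_parts = posixpath.split(name)

# Windows rules: the backslash is a path separator, so a 'docs' folder
# is implied and would be created on extract.
windows_parts = ntpath.split(name)


def underscore_backslashes(filename):
    """Hypothetical stand-in for the fixer script's renaming rule."""
    return filename.replace('\\', '_')
```

Here unix_parts has an empty folder part, while windows_parts splits the same name into a folder and a file.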

Alternate zip paths
As of 1.2, the create -zip@path command-line switch (and corresponding function argument) allows zipped items to be recorded, and hence unzipped, under an alternate path, or none if path is .. This is useful to shorten long pathnames in zip sources and thus avoid nesting on unzips, and can make a pre-zip cd unnecessary. It can also be used as a renaming tool.

This feature is best described by example. Normally, a source item A/B/C/D is zipped at path A/B/C/D, and later unzipped at the same path, relative to (i.e., nested in) the unzip target folder. Given -zip@C at zip time, the item is instead zipped and unzipped at C/D, and using -zip@. makes D a top-level, unnested item in both zip and unzip, stripping its A/B/C path of origin.

Really, this switch merely replaces the top-level provided path with another (or none, for .) in the zipfile. A -zip@X, for example, records A/B/C/D as X/D in the zipfile and propagates it as such on unzips, thereby renaming the path of origin. When used, this feature also yields two output lines per item to show zip paths used (zip-list.py shows results too).

Two fine points: first, note that the alternate path is applied to every source item zipped, and so may be more useful when zipping either a single item or many items in the same folder (else same-named items from different folders will collide). Second, the path replacement applies only to the top-level paths given; items nested in zipped folders are still nested the same in the zip—though at a root path which has been shortened, expanded, or arbitrarily renamed by -zip@path.

For full fidelity, see zipatmunge() in module ziptools.py. For a demo of this feature at work, see the console log here. Per this demo, -zip@. can also be used to remove folder nesting for all items that match * wildcards—as described further in the next section.
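
The path replacement described in this note can be sketched as follows. This is not the code of zipatmunge(); the alternate_zip_path function and its arguments are hypothetical, and it assumes / separators and relative source paths, as in the examples above.

```python
def alternate_zip_path(source, item, zipat):
    """Sketch of the -zip@path rule described in this guide.

    source: a top-level source path as given (e.g., 'A/B/C/D')
    item:   the source itself, or an item nested inside it
    zipat:  the alternate path, or '.' for none
    """
    source = source.rstrip('/')
    base = source.split('/')[-1]                 # last component: 'D'
    nested = item[len(source):].lstrip('/')      # '' if item is the source
    parts = [] if zipat == '.' else [zipat]
    parts.append(base)
    if nested:
        parts.append(nested)                     # nesting below source is kept
    return '/'.join(parts)
```

For the source item A/B/C/D, this yields C/D for -zip@C, D for -zip@., and X/D for -zip@X, while an item nested at A/B/C/D/e/f.txt keeps its e/f.txt nesting below the renamed root.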

Windows PowerShell users: you must quote the new switch in command lines as either "-zip@path" or '-zip@path'. PowerShell uses a proprietary scripting language which treats @ specially, and unfortunately clashes with ziptools' argument. You do not need to quote this argument on Windows in any of Command Prompt, the standard bash shell in Windows 10's Linux subsystem (WSL), or the Unix shells available in the Cygwin package. Users familiar with Unix may prefer the last two of these options, captured here and here. Windows, being Windows, has long spun a convoluted and fragmented command-line tale; such is life in battling-towers development.

Source globbing
When using the create script (only), source items may use filename expansion (a.k.a. globbing) on both Unix and Windows: * matches any number of characters; ? matches any single character; [xxx] matches any character in xxx; [!xxx] matches any character not in xxx; and brackets escape literals (e.g., [?] matches ?). All items with matching names are zipped. This feature also works when arguments are prompted in interactive mode (see ahead).

As examples, a source item *.py on any platform zips all Python source files in the current folder as top-level, unnested items. Similarly, a source item folderpath/*.py zips all Python files located in another folder; use the -zip@path feature of the preceding note to shorten, change, or remove folderpath for each matching file zipped. For instance, given source item folderpath/*, -zip@. zips all the matching items as top-level, unnested items, and -zip@newfolder zips all the matching items in a different folder name (this is a folder rename). For a demo of globbing and its interplay with -zip@, see the console logs here and here.
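
The pattern operators listed in this note can be tried out with Python's standard fnmatch module. This sketch assumes that ziptools' matching follows the same rules as fnmatch, which underlies Python's glob.glob; the filenames are invented for illustration.

```python
import fnmatch

names = ['a.py', 'b.py', 'test1.txt', 'test2.txt', 'test3.txt',
         'doc1.html', 'doc10.html', 'what?.txt']


def matching(pattern):
    """Return the sample names that match a glob-style pattern."""
    return [name for name in names if fnmatch.fnmatchcase(name, pattern)]


py_files = matching('*.py')            # * matches any number of characters
tests_12 = matching('test[12].txt')    # [xxx] matches any character in xxx
not_12 = matching('test[!12].txt')     # [!xxx] matches any character not in xxx
one_char = matching('doc?.html')       # ? matches exactly one character
literal_q = matching('what[?].txt')    # brackets escape a literal ?
```

Note that doc?.html selects doc1.html but not doc10.html, because ? stands for a single character.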

Source .
Using . as a zip source item zips all items in the current directory as unnested items, and is functionally equivalent to a lone * to match all items (except that * omits any .* items hidden per Unix convention). This may be more useful for command lines than for function calls in program-library mode, given that callers might be running in a program folder.
Target .
Using . as the unzip target folder unzips each item in the zipfile to the current directory, with no extra folder nesting apart from that recorded in the zipfile itself (unnested items in the zip are unnested in .). Caution: this silently overwrites any same-named items in .; name another folder instead of . if you wish to avoid this.
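
Because extracts to . overwrite silently, it may help to check for name clashes first. The sketch below uses only Python's standard zipfile and os modules; the would_overwrite function is hypothetical, and it compares recorded names as is, so it does not account for any path stripping or filename mangling that an extract may apply.

```python
import os
import zipfile


def would_overwrite(zippath, target='.'):
    """List zipfile entries whose recorded paths already exist in target."""
    with zipfile.ZipFile(zippath) as archive:
        names = archive.namelist()
    return [name for name in names
            if os.path.exists(os.path.join(target, name))]


# Hypothetical use before an extract to the current directory:
# clashes = would_overwrite('website.zip', '.')
# if clashes: print('Would replace:', clashes)
```
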
Existence
In all modes, an error is reported for zip source items that do not exist, but unzip target folders are automatically created if they do not exist. In command-line mode, nonexistent sources are detected before a zip starts, and abort the request with an error message. In program-library mode, nonexistent sources are simply skipped with a message, because callers are responsible for input correctness.
Errors
Both zips and unzips in ziptools ignore propagation failures that involve metadata, and continue to process the rest of the archive with a message and minor metadata loss. This includes failures for permissions, modtimes, and symlinks; in the last case, a stub file or forged link may replace a failed symlink.

More generally, unzips (extracts) continue with a message when a single item fails, so the rest of the archive is made available (search output for "**SKIP"). By contrast, zips (creates) throw an uncaught exception and stop when an item cannot be zipped, because this is a core data loss. On zip aborts, you must address the cause of the failure (e.g., permissions).
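
Searching saved output for skipped items can be automated with a few lines of Python. This sketch relies only on the "**SKIP" marker named above; the skipped_items function and the log filename are hypothetical, and the rest of each message's format is not assumed.

```python
def skipped_items(output_text, marker='**SKIP'):
    """Return the lines of captured ziptools output that contain marker."""
    return [line for line in output_text.splitlines() if marker in line]


# Hypothetical use, after saving a run's output with shell > redirection:
# with open('unzip-log.txt') as logfile:
#     for line in skipped_items(logfile.read()):
#         print(line)
```
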

Compression
By default, items in archives created by ziptools itself are compressed using the Python zipfile module's ZIP_DEFLATED setting. This was not configurable before 1.3 (see the following update), but it's the usual method for zips, and generally does a reasonable job. Archives unzipped by ziptools support any compression method supported by the installed zipfile module, including ZIP_DEFLATED; see this module's docs for more details.

New uncompressed option: In ziptools 1.3, the create (zip) script adds a -nocompress command-line flag; if used, this disables the standard compression normally applied to zipped items, and stores them in the zipfile uncompressed (technically, by using the ZIP_STORED format of zipfile and the zip standard). The create function call also gains a parallel nocompress argument. In ziptools extracts, uncompressed zipfiles work automatically and require no extra flags or arguments.

While not generally required for common and smaller archives, the new create uncompressed options radically speed both zips and unzips of very large archives, at a modest cost in extra zipfile size, and no loss of data or metadata. On macOS, for example:

  • With compression, a 208.4G archive of 178.9K items took 1.5 hours to zip and 20 minutes to unzip, producing a 195.8G zipfile.
  • With compression disabled using the new switch or argument, this same archive zipped and unzipped in just 22 minutes and 5 minutes, respectively, and yielded a 208.5G zipfile.

This translates to roughly 4X faster zips and unzips for archives like that tested—and may move the needle from impractical to practical. Copying large content collections to smartphones, for example, may fail unless zipped for transit.
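
The size side of this tradeoff can be illustrated with Python's zipfile module directly. This sketch does not call ziptools; it writes the same bytes to two in-memory zipfiles using the ZIP_DEFLATED and ZIP_STORED methods named above, and compares the results. The zipped_size helper and the sample data are invented for illustration.

```python
import io
import zipfile


def zipped_size(data, method):
    """Return the byte size of an in-memory zipfile holding data."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, 'w', compression=method) as archive:
        archive.writestr('item.txt', data)
    return len(buffer.getvalue())


data = b'ziptools ' * 10000     # highly repetitive, so it compresses well
deflated = zipped_size(data, zipfile.ZIP_DEFLATED)
stored = zipped_size(data, zipfile.ZIP_STORED)
```

The stored zipfile is slightly larger than its content because of zip headers, and the deflated one is far smaller for data this repetitive. Real content collections often compress much less, as the 195.8G versus 208.5G figures above suggest.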

File size
ziptools always uses ZIP64 extensions when needed to support files much larger than zip's former limits, via this option in Python's zipfile module. While file size is practically unlimited with ZIP64, some other tools may fail or refuse to extract zipfiles larger than 2G (e.g., Unix unzip), and others may balk at 4G. If you run into limits, your best option may be to find or install a Python 2.X or 3.X on the unzip host, and run ziptools' zip-extract.py.
Performance
Short story: ziptools' zip and unzip performance is on a par with the Unix command-line zip and unzip programs. Per a recent stress test on macOS, zipping a 208G content collection of 163K files and 16K folders took 1:24 (hours:minutes) with ziptools' zip-create.py, using Python 3.8. Doing the same with the Unix zip -r command took 1:18 and yielded a zipfile 3G larger at 198G. Unzipping this archive on the same platform was faster, at 0:24 for ziptools' zip-extract.py, and 0:20 for the Unix unzip.

None of these times qualify as zippy (pun intended), and may or may not be reduced by disabling compression (see above). On the other hand, zipfiles avoid data-loss perils of filesystems, platforms, and explorers; both ziptools' and Unix's results for this test were proven correct (per the bytewise diffall.py and timestamp mergeall.py in Mergeall); and runtimes are naturally much quicker for content of smaller and more reasonable dimensions.

Update: see also the new -nocompress option described above; where speed matters more than space, this can make ziptools zips and unzips shockingly faster (4X as tested, to be precise) than the foregoing implies.

See the create and extract scripts for more on pathnames and other details in ziptools. The following sections move on to present examples by usage mode.

Program-Library Mode

ziptools can be used from both command lines shown ahead, and direct Python program calls like those demonstrated here.

In program mode only, callers are responsible for expanding any wildcard operators like * in source filenames (a.k.a. globbing), before calling ziptools' create function; use Python's glob.glob. Program mode can also leverage other Python tools to construct source lists, and is responsible for source existence per above.
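
One way a caller might meet both responsibilities is sketched below: it expands each pattern with glob.glob and drops names that do not exist, before the list is passed on. The expand_sources function is hypothetical and is not part of ziptools.

```python
import glob
import os


def expand_sources(patterns):
    """Expand wildcard patterns, keeping only items that exist."""
    sources = []
    for pattern in patterns:
        matches = sorted(glob.glob(pattern))
        if matches:
            sources.extend(matches)
        elif os.path.exists(pattern):
            sources.append(pattern)      # a literal name with glob characters
        else:
            print('No match, skipped: ' + pattern)
    return sources


# Hypothetical use, assuming ziptools is importable:
# ziptools.createzipfile('code.zip', expand_sources(['*.py', 'docs']))
```
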

Cruft skipping is enabled in program mode by passing a dictionary of skip and keep patterns to the create function's cruftpatts. This argument defaults to {} which disables skipping. To enable, pass either a custom definition or the default imported from zipcruft.py.

As usual in Python for simple source-code zips, ziptools' install (unzip) folder must be on your module search path. An export command on Unix or similar set on Windows in your shell start-up files generally suffices, though other options abound; your platform's documentation and other resources can provide more details.
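
Among the other options, a program can extend its own module search path at run time instead of relying on shell settings. This is a minimal sketch; the folder name is a placeholder for wherever ziptools was unzipped on your machine.

```python
import os
import sys

# Placeholder: the folder that contains the unzipped ziptools package
ZIPTOOLS_DIR = os.path.expanduser('~/ziptools')

if ZIPTOOLS_DIR not in sys.path:
    sys.path.insert(0, ZIPTOOLS_DIR)     # searched before other locations

# import ziptools     # now found, if the folder above is correct
```

This change lasts only for the running program, which can be an advantage when several versions are installed.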

See ziptools.py for more on program usage and the arguments demonstrated here, as well as additional arguments omitted here for space:

# Setup (Z is ziptools' install folder)
$ export PYTHONPATH=$Z:$PYTHONPATH    # Unix
$ set PYTHONPATH=%Z%;%PYTHONPATH%     # Windows

# Basic usage
import ziptools
ziptools.createzipfile(zipto, sources)
ziptools.extractzipfile(zipfrom, unzipto)

# Test folders
ziptools.createzipfile('test-1-2.zip', ['test1', 'test2'])
ziptools.extractzipfile('test-1-2.zip', '.')

# Websites, with cruft skips
from ziptools.zipcruft import cruft_skip_keep
ziptools.createzipfile('website.zip', ['website'], cruftpatts=cruft_skip_keep)
ziptools.extractzipfile('website.zip', '~/public_html', permissions=True)

# Development
ziptools.createzipfile('devtree.zip', ['dev'], trace=(lambda *p, **k: None))
ziptools.extractzipfile('devtree.zip', '.', permissions=True)

# Symlink options
ziptools.createzipfile('filledintree.zip', ['skeleton'], atlinks=True)
ziptools.extractzipfile('nonportable_devtree.zip', '.', nofixlinks=True)

# Manual globs
from glob import glob
ziptools.createzipfile('allsourcecode.zip', glob('*.py') + glob('*.c'))

# Use 1.2 alternate path to minimize or eliminate folder nesting in zips
ziptools.createzipfile('folder.zip', ['long/path/to/folder'], zipat='to')
ziptools.createzipfile('pycode.zip', glob('some/other/folder/*.py'), zipat='.')

# Use 1.3 compression disable to speed zip and unzip of large archives
ziptools.createzipfile('folder.zip', ['folder'], cruftpatts=cruft_skip_keep, nocompress=True)

Command-Line Mode

The following sorts of ziptools command lines may be run from your system's command shell (e.g., Terminal on Unix, Command Prompt on Windows, or Termux on Android); another program (e.g., using Python's os.popen); or IDEs that support system command lines (e.g., PyEdit).

In all cases and on all platforms, ziptools' create script expands filename wildcards (e.g., * and ?) not already expanded by a command shell, and . in creates and extracts refers to the current directory. See the prior section for more on expansion and dots.

For brevity and convenience, most of these examples use a Unix shell-variable reference $Z and assume that this variable has been set by an export or similar prior to script runs (e.g., $Z/zip-create.py). In Windows Command Prompt, a %Z% after running a set command may be used instead (e.g., %Z%\zip-create.py).

See zip-create.py and zip-extract.py for more on the command lines and switches demonstrated here:

# Setup (optional)
$ export Z=ziptoolspath    # Unix
$ set Z=ziptoolspath       # Windows

# Test folders
c:\...\ziptools> zip-create.py cmdtest\ziptest.zip selftest\test1 selftest\test2
c:\...\ziptools> zip-list.py cmdtest\ziptest.zip
c:\...\ziptools> zip-extract.py cmdtest\ziptest.zip cmdtest\unzipped

# Websites
...local$  python3 $Z/zip-create.py ~/website.zip . -skipcruft
...remote$ python2 $Z/zip-extract.py ~/website.zip public_html -permissions

# Distributions
...devdir$ python3 $Z/zip-create.py program.zip programdir -skipcruft
...usedir$ python3 $Z/zip-extract.py program.zip .

# Development
...dir1$ python $Z/zip-create.py devtree.zip dev -skipcruft > /dev/null
...dir2$ python $Z/zip-extract.py devtree.zip . -permissions

# Special cases: populating from links, retaining link separators
...dir1$ python $Z/zip-create.py devtree.zip dev -skipcruft -atlinks
...dir2$ python $Z/zip-extract.py devtree.zip . -nofixlinks

# Individual items
...here$ python $Z/zip-create.py allcode.zip a.py b.py c.py d.py
...here$ python $Z/zip-create.py allcode.zip a.py b.py folder -skipcruft

# Shell pattern expansion: supported on all platforms in [1.1]
...here$ python $Z/zip-create.py allcode.zip *.py
...here$ python $Z/zip-create.py allcode.zip *.py test[12].txt doc?.html

# Use items in a folder as top-level items, not nested in their folder
--cd source dir
...src$ python $Z/zip-create.py ../allcode.zip * -skipcruft
--cd dest dir, copy allcode.zip to .
...dst$ python $Z/zip-extract.py allcode.zip . -permissions

# Use 1.2 alternate path to minimize or eliminate folder nesting in zips
...here$ python $Z/zip-create.py folder.zip long/path/to/folder -zip@to -skipcruft
...here$ python $Z/zip-create.py pycode.zip some/other/folder/*.py -zip@.

# Use 1.3 compression disable to speed zip and unzip of large archives
...here$ python $Z/zip-create.py folder.zip folder -skipcruft -nocompress
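
When another program runs these command lines, it may be simpler to build the argument list in code. The sketch below assembles a zip-create.py command from the switches in the command format given earlier; the create_command function is hypothetical, and the script path is whatever applies on your machine.

```python
import subprocess
import sys


def create_command(script, zipto, sources, skipcruft=False, atlinks=False,
                   zipat=None, nocompress=False):
    """Build an argument list for running zip-create.py."""
    command = [sys.executable, script, zipto] + list(sources)
    if skipcruft:
        command.append('-skipcruft')
    if atlinks:
        command.append('-atlinks')
    if zipat is not None:
        command.append('-zip@' + zipat)
    if nocompress:
        command.append('-nocompress')
    return command


command = create_command('zip-create.py', 'folder.zip', ['folder'],
                         skipcruft=True, nocompress=True)

# subprocess.call(command)    # uncomment to run the script for real
```

Passing a list rather than a single string bypasses the shell, so no quoting is needed for switches like -zip@path.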

Interactive Mode

When ziptools is run from a command line with no arguments, it falls back on asking for inputs interactively, as in the examples that follow. Inputs normally come from a user at the console, but may also be taken from a piped-in replies file or shell 'here' document (though command-line arguments work too; for background, try this). In these listings, substitute \ for all / when working on Windows. In this mode, expansion and dots work the same as they do in command lines.

For both creates and extracts, interactive mode also asks you to verify the run before starting it, and a control+c keypress combination at any console prompt terminates the session without starting a zip or unzip. For extracts, interactive mode (only) also asks if you wish to clean (i.e., delete) the contents of the target folder first, if the folder already exists. If you opt to not clean, the unzip will add to the folder's current content.
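
A replies file for the extract script can also be generated by a program. The sketch below assumes the prompts appear in the order shown in this section's extract listings: zipfile, target folder, three option prompts, the confirmation, and the clean prompt. The extract_replies function is hypothetical; since the clean prompt is asked only when the target folder exists, its reply may go unused.

```python
def extract_replies(zipfrom, unzipto, nofixlinks=False, permissions=False,
                    nomangle=False, clean=False):
    """Build piped-in replies text for zip-extract.py's interactive prompts."""
    def yes_no(flag):
        return 'y' if flag else 'n'

    answers = [
        zipfrom,                 # Zip file to extract?
        unzipto,                 # Folder to extract in?
        yes_no(nofixlinks),      # Do not localize symlinks?
        yes_no(permissions),     # Retain access permissions?
        yes_no(nomangle),        # Do not mangle filenames?
        'y',                     # Confirm with 'y'?
        yes_no(clean),           # Clean target folder first?
    ]
    return '\n'.join(answers) + '\n'


replies = extract_replies('save-test1-test2.zip', '.', permissions=True)

# Hypothetical use: save replies to a file and pipe it in from a shell,
# or pass it as the input of a child process started with subprocess.
```
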

To extract an existing zipfile to ., the current directory:

.../test-symlinks$ $Z/zip-extract.py
Running ziptools 1.3
Zip file to extract? save-test1-test2.zip
Folder to extract in (use . for here) ? .
Do not localize symlinks (y=yes)? 
Retain access permissions (y=yes)? 
Do not mangle filenames (y=yes)? 
About to UNZIP
      save-test1-test2.zip,
      to .,
      localizing any links,
      not retaining permissions,
      mangling filenames
Confirm with 'y'? y
Clean target folder first (yes=y)? n
Unzipping from save-test1-test2.zip to .
Extracted test1/
...etc...

To create a new zipfile in a folder, from items in two other folders (run in ziptools' own folder for variety):

/Code/ziptools$ zip-create.py
Running ziptools 1.3
Zip file to create? cmdtest/ziptest
Items to zip (comma separated)? selftest/test1, selftest/test2                   
Skip cruft items (y=yes)? y
Follow links to targets (y=yes)? n
Alternate zip path (.=unnested, enter=none) ? 
Disable item compression? (y=yes)? 
About to ZIP
      ['selftest/test1', 'selftest/test2'],
      to cmdtest/ziptest.zip,
      skipping cruft,
      not following links,
      zip@ path (unused),
      compressing items
Confirm with 'y'? y
Zipping ['selftest/test1', 'selftest/test2'] to cmdtest/ziptest.zip
Cruft patterns: {'skip': ['.*', '[dD]esktop.ini', 'Thumbs.db', '~*', '$*', '*.py[co]'], 'keep': ['.htaccess']}
Adding folder selftest/test1
--Skipped cruft file selftest/test1/.DS_Store
...etc...

To extract the zipfile just created, to another folder:

/Code/ziptools$ py3 zip-extract.py
Running ziptools 1.3
Zip file to extract? cmdtest/ziptest
Folder to extract in (use . for here) ? cmdtest/target
Do not localize symlinks (y=yes)? 
Retain access permissions (y=yes)? y
Do not mangle filenames (y=yes)? y
About to UNZIP
      cmdtest/ziptest.zip,
      to cmdtest/target,
      localizing any links,
      retaining permissions
      not mangling filenames
Confirm with 'y'? y
Clean target folder first (yes=y)? y
Removing cmdtest/target/selftest
Unzipping from cmdtest/ziptest.zip to cmdtest/target
Extracted selftest/test1/
             => cmdtest/target/selftest/test1
...etc...

To list the created zipfile's contents:

/Code/ziptools> zip-list.py
Running ziptools 1.3
Zipfile to list? cmdtest/ziptest.zip
File Name                                             Modified             Size
selftest/test1/                                2016-10-02 09:01:58            0
selftest/test1/d1/                             2016-09-30 16:41:12            0
selftest/test1/d1/fa1.txt                      2014-02-07 16:38:58            0
selftest/test1/d3/                             2016-10-02 09:05:02            0
selftest/test1/d3/.htaccess                    2015-03-31 16:55:44          271
...etc...

To extract using absolute paths, on Unix and Windows:

/...$ py3 /Code/ziptools/zip-extract.py
Running ziptools 1.3
Zip file to extract? /Users/blue/Desktop/website.zip
Folder to extract in (use . for here) ? /Users/blue/Desktop/temp/website
Do not localize symlinks (y=yes)? n
Retain access permissions (y=yes)? n
Do not mangle filenames (y=yes)? n
About to UNZIP
      /Users/blue/Desktop/website.zip,
      to /Users/blue/Desktop/temp/website,
      localizing any links,
      not retaining permissions
      mangling filenames
Confirm with 'y'? y
...etc...

c:\...> py -3 C:\Code\ziptools\zip-extract.py
Running ziptools 1.3
Zip file to extract? C:\Users\me\Desktop\website.zip
Folder to extract in (use . for here) ? C:\Users\me\Desktop\temp\website
Do not localize symlinks (y=yes)? n
Retain access permissions (y=yes)? n
Do not mangle filenames (y=yes)? y
About to UNZIP
      C:\Users\me\Desktop\website.zip,
      to C:\Users\me\Desktop\temp\website,
      localizing any links,
      not retaining permissions
      not mangling filenames
Confirm with 'y'? y
...etc...

Portability

This section provides the full story on running ziptools in different contexts. In short:

The net result is that ziptools can serve as your go-to tool for managing zip archives on all your devices. As examples, check out ziptools running on macOS, Windows, Linux, and Android (if you haven't by now).

That said, interoperability is rarely perfect, and the following sections enumerate the footnotes that apply. Some of these can be mitigated by choosing an appropriate Python (as source code, ziptools is especially vulnerable to version skew). Most, however, reflect immutable constraints of other tools, platforms, or Pythons.

Other Tools

ziptools is broadly interoperable with other zip tools: its zipfiles can generally be unzipped by other tools, and it can usually unzip zipfiles created by other tools. As an example, zips created by ziptools running on Python 2.X are correctly unzipped by ziptools under 3.X, and vice versa. Thanks to the zip standard, ziptools has also been verified to play well with Unix command-line zip and unzip, Finder zips on macOS, explorer zips on Windows, and an assortment of other tools.

The chief caveat here is that some tools may not support ziptools' features as fully as ziptools. In particular, modtimes, symlinks and permissions may not propagate as well; large files may not be supported by some unzips, per above; and ziptools' UTC-timestamp modtimes may be ignored, yielding DST skew. For best results, use ziptools on both the zip and unzip ends of your archive transfers.

Platforms

As suggested by some of the examples earlier in this guide, ziptools works equally well on Unix and Windows. It's been recently verified on Windows 7 and 10, macOS El Capitan and later, Ubuntu Linux, and Android 7 through 11 (including Nougat and Pie). Nevertheless, a handful of well-known platform idiosyncrasies can impact zip results:

On Unix
ziptools works well on Unix—and, by extension, Linux—with no notable caveats. Many of ziptools' extensions deal with Unix-oriented tools (e.g., symlinks and permissions), whose support is naturally best on their home base. In fact, because the fit is so good, this guide's examples are all run on macOS Unix, unless noted otherwise.

Still, Unix inherits a legacy Windows constraint: by spec, zip archives mimic the "local time" modtime scheme of MS-DOS and Windows FAT, instead of using Unix UTC time. This skews times across timezone changes, and can yield different time results from different unzip tools across DST changes. ziptools 1.1 deferred to Python libraries for DST changes and did not address timezones, but 1.2 now accommodates both by saving UTC timestamps to zipfile extra fields. See the earlier coverage of the 1.2 change, and search for "DST" in ziptools.py for more on this topic.
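
The timezone skew of zip's local-time scheme can be worked through in a few lines. This sketch is not ziptools' code and requires Python 3.X; it uses two fixed UTC offsets, chosen for illustration, to show what happens when a local-time modtime is recorded on one host and reinterpreted on another.

```python
from datetime import datetime, timedelta, timezone


def to_zip_local(utc_seconds, tz):
    """Reduce a UTC timestamp to the zone-less local-time fields zips store."""
    local = datetime.fromtimestamp(utc_seconds, tz)
    return (local.year, local.month, local.day,
            local.hour, local.minute, local.second)


def from_zip_local(fields, tz):
    """Reinterpret stored local-time fields in the unzip host's timezone."""
    return datetime(*fields, tzinfo=tz).timestamp()


zip_host = timezone(timedelta(hours=-5))      # zipped on a UTC-5 host
unzip_host = timezone(timedelta(hours=-8))    # unzipped on a UTC-8 host

modtime = 1600000000                          # an arbitrary UTC timestamp
stored = to_zip_local(modtime, zip_host)
restored = from_zip_local(stored, unzip_host)
skew_hours = (restored - modtime) / 3600      # 3.0: the offset difference
```

The restored modtime is three hours later than the original, even though the file never changed. Saving the UTC timestamp alongside the local-time fields, as 1.2 does, gives unzips a zone-independent value to restore instead.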

A footnote on Linux: although it can largely be lumped in with Unix in all ziptools regards, it does differ in one arguably trivial way: as tested, Linux silently fails to set symlink permissions, leaving them a fixed 0o777. This also happens when changed from a shell chmod, and means ziptools can't propagate symlink permissions (only) on this platform. Linux also doesn't recognize symlinks created on USB drives by macOS, but the latter may be "cheating" with custom formats. For more on the Linux ziptools story, see its Ubuntu demo session.

On Windows
ziptools works well on Windows—and even has Windows-specific support for long paths and command-line argument globbing. Some utility is limited, however, and some tools require extra steps. Unix-style permissions propagation, for example, translates only to Windows read-only flags and is otherwise irrelevant. An item with permissions 0o444 (read-only) on Unix will unzip on Windows as read-only, and return to unzip on Unix as 0o444, but all other Unix permissions will be lost in transit. See this demo for a Windows round trip's before and after.

Moreover, symlinks on Windows work only on NTFS drives, and even then unzipping—and hence creating—symlinks at all has historically required special Windows permissions. To enable this permission, right-click Command Prompt and select "Run as administrator," or enable Developer Mode in recent versions of Windows 10 (and 11: see the update). For more details, see section "Symlinks—Copied, not Followed" in Mergeall's User Guide here. In ziptools, symlinks survive zips from and unzips to NTFS drives, but they are lost elsewhere, and symlink permission failures on unzips generate a message and stub file but do not end the run (see the demo).

Some Unix-focused tools also have uneven support on Windows. Symlinks to files and folders work on Windows in ziptools, for instance, but under Python 3.X only, and even then do not propagate modtimes or permissions. Python 2.X doesn't support symlinks on Windows at all, so ziptools does what it can: on unzips, symlinks are replaced with stub files so that the rest of the archive can be extracted; on zips, symlinks are followed because Python 2.X's library does not detect them, resulting in unavoidable data copies. The only remedy for these is Python 3.X.

Windows 11 update: per Mergeall's coverage, symlinks can be created on Windows 11 by just setting Developer mode in Settings => Privacy & Security; no special permission or "Run as administrator" mode is required. With both Windows Developer mode and this system's path-separator adjustments covered earlier, ziptools is generally able to recreate Unix symlinks on Windows NTFS drives. Moreover, because a zip is a single self-contained file, the filesystem used to transport it to or from Windows is irrelevant; exFAT drives, for example, will do as go-betweens, even though they don't support raw symlinks on Windows.

On Android
Android is essentially Unix (really, Linux) with proprietary—and even extreme—access constraints: ziptools works on this platform too (e.g., in the Termux and Pydroid 3 apps' command lines), but you must zip and unzip to and from folders accessible to your Python app only. Unlike in Unix and Windows, file storage in Android is a fenced-in resource on unrooted devices. In particular, file writes required for ziptools may be available only in app-specific folders, but the rules have varied across storage mediums and Android releases, are scheduled to change again soon, and are too complex to cover here; visit this page for more on which folders are open to zips.

In terms of specific features, symlink creation normally fails on Android today with an "Operation not permitted" permissions error, due to this platform's emulated filesystem stack. This happens both in ziptools and otherwise; here's the error on unrooted Pie and Nougat. See the searches here and here and the related note here for background on this topic. As on Windows, ziptools on Android replaces such failing symlinks with dummy stub files, so unzips can proceed to extract an archive's non-link items.

For similar filesystem-emulation reasons, Android also silently refuses to support permissions propagation in ziptools or otherwise on unrooted Pie, and throws errors when propagating modtimes and permissions prior to Oreo. Per the same note, the pre-Oreo modtime constraint has been lifted in recent Androids by replacing a FUSE filesystem implementation, but undoubtedly still impacts many users of older devices. In ziptools, errors don't stop unzips on Android, but permissions won't propagate, and modtimes won't propagate on 2016's Nougat and earlier. Study the Nougat and Pie demo sessions to learn more.

Android 11 update 1: a new demo was added to the docetc folder in version 1.3, to show the results of ziptools extracts (unzips) on Android 11. That platform now has three internal storage categories that differ in speed, accessibility, and longevity, and the demo tests all three on a Pixel 4a. In short, content and modtimes work well in all three storage types, but permissions and symlinks are supported only in app-private storage, which is not accessible to any but the sole owning app (apart from the non-POSIX access Termux provides with the Storage Access Framework, as covered here by Android Deltas Sync). For the complete Android 11 ziptools story, see the demo, especially its overview and summary file.

Android 11 update 2: on further testing, it turns out that Android 11's filesystem behavior varies by both storage type and device vendor. Specifically, a Galaxy Z Fold3 has nearly the same rules as a Pixel 4a, but the former supports Unix permissions in app-specific storage, while the latter does not. In other words, there is no single Android 11: its behavior can vary arbitrarily from device to device. The Android 11 demo has been updated for the new findings, but Android 11's fragmentation obviously makes it difficult to document, program, and use. Quality engineering will rarely thrive where it must build its house upon the sand.

Android 11 update 3: in late 2021, Mergeall 3.3 uncovered a likely bug in Android 11 shared storage which may preclude writing files with non-ASCII Unicode names in composed (e.g., NFC) format. Though rare, if unzips of such files fail on this platform, see the online work-around script, as well as its Android Deltas Sync coverage.

Android 12 and 13 update: as of late 2022, ziptools has been used regularly and successfully on Androids 12 and 12L, though 12 introduced a new child-process constraint that may impact some users. Android 13 remains to be tested, but is expected to change little at the systems level where ziptools runs; it may further lock down app-specific storage, but the impact of this remains to be seen.

Other factors not called out in the foregoing list may also place hurdles in the way of cross-platform content. On unzips, for instance, illegal (non-portable) characters in filenames may optionally be mangled to _ on Windows to allow saves, even though this may break later syncs back to the source platform and overwrite files (see the new 1.3 policy and options for this above). Moreover, pathnames botched by some Windows zip programs may cause problems on Unix, as described earlier. In addition, unzipping to drives using some filesystems, including exFAT, may strip archives of their Unix permissions. See the earlier notes for more details on the latter, and always test ziptools in your use case before adopting it broadly.

Lest this all sound too dire, keep in mind that archives which stick to basic files and folders are immune to most of the interoperability concerns noted here. Where that's impossible, archiving is usually better when Unix-oriented tools like symlinks and permissions stay on Unix (and the nod to Unix compatibility on Windows and Android is recognized as half-hearted at best).

What about iOS?: ziptools probably can also be used on iOS devices from a Python app (e.g., Pythonista, with either its Python 3.X or 2.X), but this hasn't been tested; is subject to iOS's extreme access constraints; and may require either program-call mode, or a shim wrapper to spawn a ziptools command line. iOS storage has also historically been less open on unrooted phones than Android; iOS 13 adds some filesystem functionality in its Files app (including macOS-like quickviews and support for zips), but the scope of content is still limited. That said, Android seems bent on imposing similar restrictions, so your mobile mileage may vary.

Pythons

Despite the prior section's critiques, platforms are not the only source of portability limits for ziptools. Because Python's support for some tools can vary by its version number even on the same platform, this section summarizes portability rules from the perspective of Python itself. Being coded in Python and shipped as source, ziptools naturally inherits these same constraints.

Broadly speaking, ziptools' most recent release has been verified to work on Pythons 2.7 through 3.9 and the zipfile modules these Pythons include; later Pythons are expected to run ziptools correctly too unless Python or its zipfile change incompatibly (see also this section's update ahead). ziptools also works almost identically on Python 3.X and 2.X, and the two Pythons perform equally well in almost all regards—including support for non-ASCII filenames in 2.X as of 1.1.

Still, Python 3.X holds a slight advantage for archives containing symlinks on Unix, and Python 3.X is fully required to retain symlinks on Windows. In complete detail, Python's current symlink support across platforms is both convoluted and mixed:

On Unix
On Unix (and its kind), Python 3.X can read and write symlinks, and can propagate their permissions and modtimes. Python 2.X can read and write symlinks, and can propagate their permissions but not their modtimes.
On Windows
As noted earlier, Python 3.X can read and write symlinks on Windows, but is unable to propagate either their modtimes or their permissions. Python 2.X cannot read, write, or even detect symlinks on Windows, and hence does not support them there at all. Additionally, only Python 3.2 and later can detect folder-link cycles on Windows, though very few Python 3.0 or 3.1 installs likely see action today.
On Android
Per the prior section, platform prohibitions derail symlinks in Python 3.X; though untested, the same fate likely applies to 2.X.
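Because symlink support depends on both the platform and the Python in use, it can be easier to probe than to predict. The sketch below is one possible probe, not part of ziptools; the can_make_symlinks name is hypothetical, and it only checks link creation, not modtime or permission propagation.

```python
import os
import sys
import tempfile

def can_make_symlinks():
    """Try to create a symlink in a temp folder; report whether it worked."""
    if not hasattr(os, "symlink"):           # e.g., Python 2.X on Windows
        return False
    folder = tempfile.mkdtemp()
    target = os.path.join(folder, "target.txt")
    link = os.path.join(folder, "link")
    try:
        with open(target, "w") as f:
            f.write("data")
        try:
            os.symlink(target, link)
        except (OSError, NotImplementedError):    # no privilege or no support
            return False
        return os.path.islink(link)
    finally:
        for path in (link, target):          # clean up whatever was created
            if os.path.lexists(path):
                os.remove(path)
        os.rmdir(folder)

print(sys.version_info[:2], sys.platform, can_make_symlinks())
```

A False result means symlinks cannot be created in the temp folder's filesystem under this Python, whether from missing support, missing privileges, or platform prohibitions.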

In sum, if you do not zip symlinks, Python 2.X is as good a choice as 3.X; if you do, 3.X is a better option everywhere. As noted in the prior section, it's also generally advised to maintain your symlinks on Unix, though Python 3.X makes them as portable as possible if they cross platform boundaries. By contrast, permissions may be lost altogether if round tripped between some platforms—in any Python.

It's also worth adding that Python 3.X may hold a speed advantage over 2.X. Although the same ziptools code runs unchanged on 3.X and 2.X, the underlying Python zipfile module differs between the two lines. In testing with 3.9 and 2.7, content results are the same (sans the functional differences noted above), but speed can differ: 2.X and 3.X perform roughly the same, except that 2.X extracts (unzips) can check in three times slower than 3.X. The cause of this remains unisolated, but Python 3.X may also be the better host if speed is at a premium.
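To check extract speed on your own Pythons, a rough timing harness like the one below may help. It is only a sketch that calls zipfile directly rather than ziptools; the make_zip and time_extract names are hypothetical, and the synthetic archive is far smaller than real content.

```python
import io
import shutil
import tempfile
import time
import zipfile

timer = getattr(time, "perf_counter", time.time)   # perf_counter is 3.X only

def make_zip(count=50, size=10000):
    """Build a small in-memory zipfile of synthetic members."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as zf:
        for i in range(count):
            zf.writestr("item%03d.txt" % i, b"x" * size)
    return buffer.getvalue()

def time_extract(zip_bytes):
    """Return (seconds, member count) for one extractall into a temp folder."""
    dest = tempfile.mkdtemp()
    try:
        start = timer()
        with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
            zf.extractall(dest)
            count = len(zf.namelist())
        return timer() - start, count
    finally:
        shutil.rmtree(dest)

seconds, count = time_extract(make_zip())
print("extracted %d items in %.4f seconds" % (count, seconds))
```

Run the same file under each Python and compare the reported times; results on one machine and one dataset should be read as indicative only.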

Despite all the platform and Python limitations and tradeoffs outlined in the last two sections, it's important to note that zipfiles are still both practical and widely useful, and serve as the basis for countless content archives and media packages. In computing, the line between unusable and workable is always up to practice to define.

Python failsafe copy: as of 1.3, ziptools now ships a working copy of Python 3.X's zipfile module, as a safeguard against future changes. Find it in the docetc extras folder here, and copy it up to the root of the install package if zipfile ever changes in ways that break ziptools. ziptools does not need or use this file today, and using it would preclude adapting any future improvements in it, but it's available both for reference and as a last-resort measure. The copy is from 3.9 and won't work with Python 2.X, though 2.X's recent "end of life" ensures that it won't be changed and break in ways that are all too common in 3.X. Retirement has its upsides.
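If the failsafe copy is ever put to use, it may be worth confirming which zipfile a run actually imports. The sketch below assumes a script run directly from the folder holding the copy, in which case that folder normally comes first on Python's module search path; the zipfile_origin name is hypothetical.

```python
import os
import sys
import zipfile

def zipfile_origin():
    """Return the absolute path of the zipfile module this Python imported."""
    return os.path.abspath(zipfile.__file__)

origin = zipfile_origin()
here = os.path.abspath(os.getcwd())

print("Python %d.%d" % sys.version_info[:2])
print("zipfile loaded from: %s" % origin)
print("copy in current folder in use: %s" % (os.path.dirname(origin) == here))
```

A path inside the Python install means the standard library's module is in use; a path inside the ziptools folder means the copied file has taken its place.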

See Also

For more information on portability:

ziptools.py...
has more info on Python's symlink constraints, including a version/platform support table for developer reference; see its section titled PYTHON SYMLINKS SUPPORT.
py-2.X-3.X-zipoff.txt...
demos ziptools' results under Python 2.X on Unix, and compares them to Python 3.X (spoiler: they're identical, sans symlink modtimes).
py-2.X-fixes.txt...
chronicles ziptools 1.1's fixes for non-ASCII filenames under Python 2.X. Prior to 1.1, zipfiles made under Python 2.X could yield munged non-ASCII filenames in some unzip contexts on both Unix and Windows. 1.1 solved this by forcing filenames to Unicode in creates, to invoke encoding in 2.X's zipfile that is more interoperable (e.g., with 3.X unzips). 1.1 also worked around a related 2.X exception when printing non-ASCII text to a pipe (only!).
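The interoperable encoding in question is visible in a zipfile's own metadata: the zip format reserves general-purpose flag bit 11 (0x800) to mark a name stored as UTF-8. The sketch below shows this under Python 3.X's zipfile only; it does not reproduce the 2.X fix, and the member names are hypothetical.

```python
import io
import zipfile

UTF8_FLAG = 0x800    # general-purpose bit 11: name is stored as UTF-8

buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("plain.txt", b"ASCII name")
    zf.writestr("caf\u00e9.txt", b"non-ASCII name")

# Reopen the archive and read each member's flag from its directory entry
with zipfile.ZipFile(io.BytesIO(buffer.getvalue())) as zf:
    flags = {info.filename: bool(info.flag_bits & UTF8_FLAG)
             for info in zf.infolist()}

for name, is_utf8 in sorted(flags.items()):
    print(repr(name), is_utf8)
```

Unzip tools that honor this flag decode the name as UTF-8; names written without it are left to each tool's default encoding, which is where munged names tend to arise.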

Versions

This section briefly summarizes the highlights of ziptools versions. Search for "[N.M]" in code files to see changes applied by recent releases (e.g., 1.3 changes are tagged with "[1.3]").
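Any text search tool will find these tags; for a portable option, a few lines of Python suffice. The find_tag function below is a hypothetical helper rather than part of the package, and its demo runs on a throwaway folder instead of the real code files.

```python
import os
import tempfile

def find_tag(root, tag="[1.3]", suffix=".py"):
    """Return (path, line number, text) for each line under root containing tag."""
    hits = []
    for folder, _subfolders, files in os.walk(root):
        for name in sorted(files):
            if not name.endswith(suffix):
                continue
            path = os.path.join(folder, name)
            with open(path, encoding="utf-8", errors="replace") as source:
                for lineno, line in enumerate(source, 1):
                    if tag in line:
                        hits.append((path, lineno, line.strip()))
    return hits

# Demo on a temp folder; pass the unzipped package's folder as root instead
with tempfile.TemporaryDirectory() as root:
    with open(os.path.join(root, "demo.py"), "w", encoding="utf-8") as f:
        f.write("x = 1\ny = 2  # [1.3] hypothetical tagged change\n")
    hits = find_tag(root)
    for path, lineno, text in hits:
        print(os.path.basename(path), lineno, text)
```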

Package dates versus release dates: if the date on your ziptools download package is later than the latest release date listed here, it just means that trivial non-code changes were applied. This includes the inevitable doc-typo fixes, as well as minor doc updates like the sidebars recently inserted here, here, and here. Actual code changes will always trigger either a new release number and date for significant changes, or a note here for minor revisions.

Version 1.3: Oct-28-2021

ziptools 1.3 introduces a faster uncompressed option, calls out name mangling to make it transparent, shortens unzip output for common usage, and polishes its UI, docs, and builds. Some changes are demoed in docetc; look for names ending in 1.3. Here are the highlights of this release, with links to more info:

Uncompressed zipfiles
Support uncompressed zipfiles for faster zip and unzip of large archives (above)
Better filename mangles
Make nonportable-filename changes explicit, optional, and avoidable (above)
Briefer extracts output
Collapse extracts' two output lines into one when source == destination (above)
Abort gracefully on control-c input
Print a nice message instead of a Python traceback in interactive mode (above)
Upgraded cruft defaults
Tighten up preset cruft definitions, though yours may vary (above)
Failsafe zipfile copy
Include a copy of Python's zipfile, to use if^H^Hwhen it changes (above)
New topic coverage
Expand docs to cover performance, Windows \, and Android 11
Ziptools zip thyself
Use an automated build script, which doubles as a usage example
Etcetera
Show version (see the demo), fix obscure buglets (see "[1.3]" in the code)
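For the uncompressed option in this list, the underlying tradeoff can be seen with zipfile's own storage modes. This sketch calls zipfile directly and is not how the -nocompress switch is necessarily coded; the zip_size helper and sample data are hypothetical.

```python
import io
import zipfile

def zip_size(data, compression):
    """Return the byte size of a one-member zipfile made with this mode."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression) as zf:
        zf.writestr("data.bin", data)
    return len(buffer.getvalue())

data = b"ziptools " * 10000                       # highly repetitive sample
stored = zip_size(data, zipfile.ZIP_STORED)       # no compression
deflated = zip_size(data, zipfile.ZIP_DEFLATED)   # standard compression

print(len(data), stored, deflated)
```

Stored members skip compression work at the cost of archive size, which tends to pay off most for content that is already compressed, such as photos and media.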

Later minor updates: an Oct-2022 rerelease of 1.3 added a .nomedia file for Android; restored a space inadvertently omitted from the create script's usage message; added more info on symlinks here and here; added an Android 12 note; and improved the online-index layout. The .nomedia prevents Android galleries from assimilating this package's screenshots (while this lasts). Apart from the trivial usage-message formatting, no code or program functionality was changed.

Version 1.2: Apr-11-2020

This release was a quick follow-up to 1.1. It added two major functionality upgrades that came online faster than anticipated, and extended 1.1's platform-specific reach. See folder docetc/1.2-upgrades/ for demos and screenshots of these 1.2 enhancements:

UTC timestamps
Neutralize DST and timezones in modtimes with UTC timestamps (above)
Alternate zip paths
Record items at an alternative path given a -zip@path argument (above)
More Unicode prints
Use 1.1's text munging to avoid aborts in interactive mode on Windows (code)
Android modtime skips
Modtimes can't be changed till Oreo: ignore errors on earlier versions (above)
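On the UTC timestamps item above, the following sketch illustrates why UTC sidesteps DST and timezone shifts: a UTC tuple converts back to the same epoch time on any machine, while a local-time tuple depends on where and when it is read. The to_utc_tuple and from_utc_tuple names are hypothetical, and ziptools' actual implementation (see zipmodtimeutc.py) may store its timestamps differently.

```python
import calendar
import time

def to_utc_tuple(mtime):
    """Convert an epoch modtime to a (Y, M, D, h, m, s) tuple in UTC."""
    return time.gmtime(mtime)[:6]

def from_utc_tuple(date_time):
    """Convert a UTC (Y, M, D, h, m, s) tuple back to an epoch modtime."""
    return calendar.timegm(date_time + (0, 0, 0))

mtime = 1635422400                      # 2021-10-28 12:00:00 UTC
stamp = to_utc_tuple(mtime)
print(stamp)                            # (2021, 10, 28, 12, 0, 0)
print(from_utc_tuple(stamp) == mtime)   # True in any timezone
print(time.localtime(mtime)[:6])        # varies with the host's zone and DST
```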

Version 1.1: Apr-2-2020

This release introduced functionality upgrades and fixes for both core and platform-specific support. See folder docetc/1.1-upgrades/ for demos and screenshots of these 1.1 enhancements:

Permissions
Extracts propagate Unix-style permissions for all items on request (above)
Symlinks
Creates save per-link permissions for symlinks, instead of a constant (code)
Auto-globs
The create script expands source-item wildcard patterns on all platforms (above)
Interface
The create and extract scripts have better error detection and reporting (demo)
Statistics
Extracts and creates both return item counts, printed by command scripts (demo)
Symlinks fix
Extracts fix an obscure abort: allow extracting unnested symlinks to . (code)
Python 2.X fixes
2.X zips non-ASCII filenames interoperably, and avoids a print exception (above)
macOS exFAT fix
Extracts work around a macOS exFAT bug: force folder modtime updates (shot)
Windows fixes
ziptools handles absent symlink support in 2.X, and non-ASCII prints (code)
Android fixes
ziptools handles symlink permission errors in emulated filesystems (code)
HTML docs
This README is HTML instead of plain text, for usability and readability (olde)
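As background for the Permissions item in this list, zipfiles made on Unix conventionally record a member's mode in the high 16 bits of its external attributes. The sketch below builds and reads such an entry with zipfile alone; it illustrates the storage convention rather than ziptools' own code, and the member name is hypothetical.

```python
import io
import stat
import zipfile

buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    info = zipfile.ZipInfo("script.sh")              # hypothetical member
    info.create_system = 3                           # 3 means made on Unix
    info.external_attr = (stat.S_IFREG | 0o755) << 16
    zf.writestr(info, b"#!/bin/sh\n")

# Reopen the archive and recover the permission bits
with zipfile.ZipFile(io.BytesIO(buffer.getvalue())) as zf:
    info = zf.getinfo("script.sh")
    mode = stat.S_IMODE(info.external_attr >> 16)    # drop the file-type bits

print(oct(mode))
```

An extract that propagates permissions would pass such a mode to os.chmod after writing the file, on filesystems that support it.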

Version 1.0: Jun-12-2017

The initial version, developed for and released with the Mergeall 3.0 package, but now available and developed separately (a software spin-off of sorts). See folders selftest/, cmdtest/, and moretests/ for demos of 1.0 (and later) usage.

Future Ideas

This section wraps up with a few blue-sky thoughts on possible future enhancements:

Frozen executables
Given ziptools' recent feature growth, it might be useful to package its main scripts as frozen executables for major platforms. By bundling a specific Python 3.X in such an executable, this would remove an entire category of portability issues (see the preceding section). It would also simplify installs, and make ziptools immune to future changes in Python and its zipfile module. The latter is especially a concern, given the UTC-timestamp and filename-mangling implementations' tight dependence on the current coding structure of zipfile (e.g., see zipmodtimeutc.py).

On the other hand, source code must still be shipped for program-library-mode use; this may not be seamless or possible for command-line scripts on some platforms (e.g., Android); and the results may be inferior in some contexts (e.g., Windows single-file executables may require an SSD to offset slow start-up). For now, install a Python separately as needed; see the extras folder for a working copy of Python 3.X's zipfile if it ever changes in ways that break ziptools; and watch this space for new developments.

Other metadata
It's also worth noting that ziptools' metadata coverage is broad but open ended. It propagates basic content, symlinks, UTC modtimes, and Unix permissions explicitly, but does nothing about Unix file flags, which have sketchy symlink support across Python lines; may or may not be easily smuggled in zipfiles; and boast use cases which are few or nil. There are also the rabbit holes of extended attributes and macOS resource forks, which seem better filed as platform-specific eccentricities than crucial extensions.
And so on
Finally, an argument might be made that at least some parts of ziptools could be coded as subclasses of zipfile components instead of supplemental functions. And a portable GUI may be nice too. Scope creep happens...
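As a rough idea of what the subclass approach could look like, here is a minimal sketch. FolderZipFile and its add_folder method are hypothetical names, not part of ziptools or zipfile, and this version omits the handling of empty folders, symlinks, cruft, and modtimes that ziptools provides.

```python
import io
import os
import tempfile
import zipfile

class FolderZipFile(zipfile.ZipFile):
    """Hypothetical ZipFile subclass that can add a whole folder tree."""

    def add_folder(self, folder):
        """Add every file under folder; return the number added."""
        folder = os.path.abspath(folder)
        parent = os.path.dirname(folder)
        count = 0
        for dirpath, _dirnames, filenames in os.walk(folder):
            for name in sorted(filenames):
                path = os.path.join(dirpath, name)
                # Record each file relative to the folder's parent
                self.write(path, os.path.relpath(path, parent))
                count += 1
        return count

# Demo on a throwaway folder tree
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "proj", "sub"))
    for relpath in ("a.txt", os.path.join("sub", "b.txt")):
        with open(os.path.join(root, "proj", relpath), "w") as f:
            f.write("data")
    buffer = io.BytesIO()
    with FolderZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as zf:
        added = zf.add_folder(os.path.join(root, "proj"))
        names = sorted(zf.namelist())

print(added, names)
```

The tradeoff is that a subclass inherits zipfile's interface changes directly, while supplemental functions keep a layer of separation from them.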


