#!/usr/bin/python
"""
=============================================================================
zip-extract.py - a ziptools command-line client for unzipping zipfiles.
See ziptools' ./_README.html for license, attribution, and other logistics.

Extract (unzip) a zip file, with:

   [python] zip-extract.py [zipfile [unzipto] [-nofixlinks] [-permissions] [-nomangle]]

Where:
   "zipfile" is the pathname of an existing zipfile (a ".zip" is appended to
   the end of this if missing)

   "unzipto" is the pathname of a possibly existing folder where all unzipped
   items will be stored (the default is ".", the current working directory)

   "-nofixlinks", if used, prevents symbolic-link path separators from being
   adjusted for the local platform (else they are, to make links portable)

   "-permissions", if used, causes file-access permissions to be propagated to
   extracted items; use this when unzipping files that originated on Unix [1.1]

   "-nomangle", if used, prevents automatic replacement of nonportable filename
   characters with "_" when extracts fail with unmangled names [1.3]

Arguments are input at console prompts if not listed on the command line.
For each item, the script's output lists both zipfile (from) and extracted
(to) name, the latter after a "=>" on a new line.  Exception: as of Aug-2021,
items whose from and to pathnames are the same are displayed as a single
line to reduce output volume; this is common when extracting to the current
directory (".").  All other cases still display two output lines as before.
Control-c anywhere in interactive mode terminates a run not yet started.

[python] is your platform's optional Python identifier string.  It may be
"python", "python3", or an alias on Unix; and "python", "py -3", or "py"
on Windows.  It can also be omitted on Windows (to use a default), and on
Unix given executable permission for this script (e.g., post "chmod +x").
Some frozen app/executable packages may also omit [python]; see your docs.

The "unzipto" folder is created automatically if needed, but is cleaned
of its contents before the extract only if using interactive-prompts
mode here and cleaning is confirmed.  Neither the base extract function
nor non-interactive mode here does any such cleaning.  Remove the unzipto
folder's contents manually if needed before running this script.

Caution: cleaning may not make sense for ".", the current working dir.
This case is verified with prompts in interactive mode only, but that
is the only context in which auto-cleaning occurs.

Examples:
   python zip-extract.py                              # input args
   python zip-extract.py tests.zip                    # unzip to '.'
   python zip-extract.py download.zip dirpath         # unzip to other dir
   python zip-extract.py dev.zip . -nofixlinks        # don't adjust links
   python zip-extract.py pkg.zip dirto -permissions   # propagate permissions
   python zip-extract.py pkg.zip dirto -nomangle      # don't try to fix names

ABOUT LINKS AND OTHER FILE TYPES:

For symbolic links to both files and dirs, the ziptools package either
zips links themselves (by default), or the items they refer to (upon
request); this extract simply recreates whatever was added to the zip.
FIFOs and other exotica are never zipped or unzipped.

To make links more portable, path separators in link paths are automatically
adjusted for the hosting platform by default (e.g., '/' becomes '\' on
Windows); use "-nofixlinks" (which can appear anywhere on the command line)
to suppress this if you are unzipping on one platform for use on another.
See ziptools.py's main docstring for more details.

ABOUT TARGET PATHS:

For extracts, the Python zipfile module underlying this script discards
any special syntax in the archive's item names, including leading slashes,
Windows drive and UNC network names, and ".." up-references.  The ziptools
symlink adder parrots the same behavior.

Hence, paths that were either absolute, rooted in a drive or network, or
parent-relative at zip time become relative to (and are created in) the
"unzipto" path here.  Items zipped as "dir0", "/dir1", "C:\dir2", and
"..\dir3" are extracted to "dir0", "dir1", "dir2", and "dir3" in "unzipto".

Technically, zipfile's write() removes leading slashes, drive and network
names, and embedded ".." (they won't be in the zipfile), and its extract()
used here removes everything special, including leading "..".  Other zip
tools may store anything in a zipfile, and may or may not be as forgiving
about leading "..", but the zip-create and zip-extract scripts here are
meant to work as a team.

Note that all top-level items in the zipfile are extracted as top-level
items in the "unzipto" folder.  A zipfile that contains just files will
not create nested folders in "unzipto"; a zipfile with folders will.
Caution: top-level items may silently overwrite items in "unzipto", even
in "."; unzip to temporary folders to avoid unwanted overwrites.  See also
the 1.2 "-zip@path" create option for collapsing zip paths.

Also note that ziptools assumes that path separators in zipfiles use Unix
'/' in accordance with the zip standard, and uses '/' in its own creates
(zips) on Windows.  Tools which instead use '\' on Windows are buggy and
should be avoided; a '\' is a valid filename character on Unix, and hence
cannot be interpreted as a separator interoperably.

ABOUT LARGE FILES:

ziptools always uses the ZIP64 option of Python's zipfile module to support
files larger than zip's former size limits, both for zipping and unzipping.
Unix "unzip" may not.  See zip-create.py for more details.

ABOUT PERMISSIONS:

(Former caveat: extracts here did not preserve Unix permissions due to a
Python zipfile bug; see extractzipfile() in ziptools/ziptools.py.)

UPDATE: as of [1.1], extracts (unzips) now do propagate Unix permissions
for files, folders, and symlinks, but only if this is requested with the
new "-permissions" argument.  This should generally be used only on Unix
zipfiles; see ziptools/ziptools.py's extractzipfile() for more details.

Notes: some examples from prior releases may not show this new option;
permissions are requested on extracts only, not creations (which always
save permissions); and this option has no effect on filesystems that do
not support Unix-style permissions (including exFAT: you can copy a
zipfile to/from exFAT, but unzip on a different drive for permissions).

ABOUT MODTIMES:

(Former caveat: extracts here deferred to Python libraries to adjust
modtimes of zipped items for the local DST phase, which may or may not
have agreed with other tools, and did not address timezone changes.)

UPDATE: as of [1.2], ziptools now stores UTC timestamps for item modtimes
in zip extra fields, and uses them instead of zip's "local time" on
extracts.  This means that modtimes of zipfiles zipped and unzipped by
ziptools are immune to changes in both DST and timezone.  For more details,
see the README's "_README.html#utctimestamps12".  The former local-time
scheme is still used for zipfiles without UTC.

ABOUT FILENAME MANGLES:

Filenames containing "|", "?", ":", and others are nonportable, and cannot
be saved on some filesystems by unzips.  As of 1.3, this script by default
automatically mangles (sanitizes) names that fail on Windows only, by
replacing all nonportable filename characters with "_" and trying to
extract again.

This name mangling allows saves, but has a rare potential to overwrite
other files and break later syncs.  To make this transparent, ziptools
reports mangles and their tallies in run output.  To avoid mangles in full,
pass "-nomangle" and run the included fix-nonportable-filenames.py to
analyze and fix nonportable names manually before content transfers.

With "-nomangle", unmangled filenames with characters illegal on the unzip
target will fail to extract and be skipped with a message.  Shared storage
in some versions of Android has filename constraints similar to Windows,
but no auto-mangling is performed on this platform due to an Android 11
bug; run the fixer script before unzipping to shared storage.  See also
_README.html#nomangle for more details.

CAVEAT - PORTABILITY:

Unzipped symlinks work on Windows, but don't retain modtimes on that
platform (they are stamped with the unzip time instead), due to known/fixed
limitations.  See the _README.html's symlinks coverage.

UPDATE: as of [1.1], there's more thorough coverage of portability issues
like this in the README's "_README.html#Portability".

See zip-create.py for usage details on the zip-creation companion script.
See ziptools/ziptools.py's docstring for more on this script's utility.

Coding note: the "Do not localize" negative logic is too late to change...
=============================================================================
"""

from __future__ import print_function         # py 2.X, currently optional here
import ziptools, sys, os

from __version__ import showVersion           # [1.3] display version number
showVersion()

# portability
RunningOnPython2 = sys.version.startswith('2')
RunningOnWindows = sys.platform.startswith('win')

if RunningOnPython2:
    input = raw_input                         # py 2.X compatibility

# avoid Windows Unicode printing errors by munging [1.2]
from ziptools import print

usage = 'Usage: ' \
    ' zip-extract.py [zipfile [unzipto] [-nofixlinks] [-permissions] [-nomangle]]'

interactive = False    # see zip-create.py note about Windows icon clicks and exits

def error_exit(message):
    print(message + ', run cancelled.')
    print(usage)
    if interactive and RunningOnWindows:
        input('Press Enter to close.')        # clicked on Windows: stay up
    sys.exit(1)

def okay_exit(message):
    print(message + '.')
    if interactive and RunningOnWindows:
        input('Press Enter to close.')        # ditto: stay open on Win
    sys.exit(0)                               # or os.isatty(sys.std{in,out})

def reply(prompt=''):
    if prompt: prompt += ' '
    try:
        return input(prompt)                  # exit gracefully on control+c [1.3]
    except KeyboardInterrupt:
        okay_exit('\nRun aborted by control-c')

# command-line mode
if len(sys.argv) >= 2:                        # 2 = script zipfile...
    nofixlinks = permissions = nomangle = False

    if '-nofixlinks' in sys.argv:             # anywhere in argv
        nofixlinks = True
        sys.argv.remove('-nofixlinks')

    if '-permissions' in sys.argv:            # anywhere in argv
        permissions = True
        sys.argv.remove('-permissions')

    if '-nomangle' in sys.argv:               # anywhere in argv
        nomangle = True
        sys.argv.remove('-nomangle')

    if len(sys.argv) not in [2, 3]:
        error_exit('Too few or too many arguments')

    zipfrom = sys.argv[1]
    zipfrom += '' if zipfrom[-4:].lower() == '.zip' else '.zip'
    unzipto = '.' if len(sys.argv) == 2 else sys.argv[2]
    if unzipto.startswith('-'):
        error_exit('Too few or too many arguments')

# interactive mode (e.g., some IDEs)
else:
    interactive = True
    zipfrom = reply('Zip file to extract?').strip() or '_default'    # [1.1] +stp/dft
    zipfrom += '' if zipfrom[-4:].lower() == '.zip' else '.zip'
    unzipto = reply('Folder to extract in (use . for here) ?').strip() or '.'
    nofixlinks = reply('Do not localize symlinks (y=yes)?').lower() == 'y'
    permissions = reply('Retain access permissions (y=yes)?').lower() == 'y'
    nomangle = reply('Do not mangle filenames (y=yes)?').lower() == 'y'

    # use print() to avoid Unicode aborts in input() [1.2]
    print("About to UNZIP\n"
          "\t%s,\n"
          "\tto %s,\n"
          "\t%slocalizing any links,\n"
          "\t%sretaining permissions,\n"
          "\t%smangling filenames\n"
          "Confirm with 'y'? "
          % (zipfrom, unzipto,
             'not ' if nofixlinks else '',
             'not ' if not permissions else '',
             'not ' if nomangle else ''), end='')
    verify = reply()
    if verify.lower() != 'y':
        okay_exit('Run cancelled')

# catch user errors asap [1.1]
if not os.path.exists(zipfrom):
    error_exit('Zipfile "%s" does not exist' % zipfrom)

if not os.path.exists(unzipto):
    # no need to create here: zipfile.extract() does os.makedirs(unzipto)
    pass
else:
    # in interactive mode, offer to clean target folder (ziptools.py doesn't);
    # removing only items to be written requires scanning the zipfile: pass;
    if (interactive and
        reply('Clean target folder first (yes=y)?').lower() == 'y'):
        # okay, but really?
        if (unzipto in ['.', os.getcwd()] and
            reply('Target = "." cwd - really clean (yes=y)?').lower() != 'y'):
            # a very bad thing to do silently!
            pass
        else:
            # proceed with cleaning
            for item in os.listdir(unzipto):
                itempath = os.path.join(unzipto, item)
                if os.path.isfile(itempath) or os.path.islink(itempath):
                    os.remove(ziptools.FWP(itempath))
                elif os.path.isdir(itempath):
                    ziptools.tryrmtree(itempath)

# the zip bit
stats = ziptools.extractzipfile(
            zipfrom, unzipto,
            nofixlinks=nofixlinks, permissions=permissions, nomangle=nomangle)

okay_exit('Extract finished: ' + str(stats))    # [1.1]