File: mergeall-products/unzipped/test/ziptools/zip-create.py
#!/usr/bin/python
"""
==============================================================================
zip-create.py - a ziptools command-line client for zipping zipfiles.
See ziptools' ./_README.html for license, attribution, and other logistics.

Create (zip) a zip file, with:

   <python> zip-create.py [zipfile source [source...]
                   [-skipcruft] [-atlinks] [-zip@path] [-nocompress]]

Where:
   "zipfile" is the pathname of the new zipfile to be created (a ".zip"
   is appended to the end of this if missing)

   each "source" (one or more) is the relative or absolute pathname of a
   file, link, or folder to be added to the zipfile

   "-skipcruft", if used, avoids adding hidden or platform-specific items
   to the zipfile (else, nothing is skipped, as described ahead)

   "-atlinks", if used, adds items that any symlinks refer to instead of
   the symlinks themselves (else, links are always added verbatim)

   "-zip@path", if used, gives an alternate path to be used as the root of
   all items in the zip (and later unzips); path "." means unnested [1.2]

   "-nocompress", if used, disables the standard compression used for
   zipped items; use for faster zips and unzips of large content sets [1.3]

Arguments are input at the console if not listed on the command line.
The script's output lists all items added to the zipfile; when using
"-zip@path", it also lists zip paths that differ from source paths on a
second output line that begins with a tabbed "=>" sequence. A "*" in a
source expands into multiple sources on all platforms. Control-c anywhere
in interactive mode terminates a run not started.

<python> is your platform's optional Python identifier string. It may be
"python", "python3", or an alias on Unix; and "python", "py -3", or "py"
on Windows. It can also be omitted on Windows (to use a default), and on
Unix given executable permission for this script (e.g., post "chmod +x").
Some frozen app/executable packages may also omit <python>; see your docs.

Examples:
   python zip-create.py                                  # input args
   python zip-create.py tests.zip test1 test2 test3      # zip 3 dirs
   python zip-create.py -skipcruft upload.zip webdir     # skip cruft
   python zip-create.py newzip dir -skipcruft -atlinks   # follow links
   python zip-create.py allcode.zip *.py test?.txt       # wildcards
   python zip-create.py ../allcode.zip * -skipcruft      # items in dir
   python zip-create.py allcode.zip folder/* -zip@.      # remove dir nesting
   python zip-create.py folder.zip folder -nocompress    # uncompressed/fast

ABOUT CRUFT SKIPPING:

The optional "-skipcruft" argument can appear anywhere if used. When used,
it prevents normally-hidden system metadata files and folders from being
included in the generated zipfile. Cruft defaults to all items whose names
start with a "." (the Unix convention), plus a handful of others as defined
in the pattern lists imported from module file zipcruft.py; customize these
lists here or in the module as desired.

Most end-user zips should pass "-skipcruft" to enable cruft skipping. This
functionality is especially useful on a Mac, to avoid common files like
".DS_Store" and "._somename" in zips used to distribute software or upload
websites. If "-skipcruft" is _not_ used, every file and folder named in a
'source' is included in the zipfile. For more background on cruft, see the
overview in Mergeall's documentation usage pointers, at
learning-python.com/mergeall/UserGuide.html.

Note that cruft skipping is implemented in this create script and the
ziptools function it uses, but not in the extract script or function.
This is by design: the create/extract tools work together as a pair.
To remove cruft after unzipping a file created by other tools, see
Mergeall's nuke-cruft-files.py script.

ABOUT LINKS AND OTHER FILE TYPES:

By default, the ziptools package zips and unzips symbolic links to both
files and dirs themselves, not the items they refer to; use "-atlinks"
(which also can appear anywhere) at creation time here to zip and unzip
items that links refer to instead. This package also always skips FIFOs
and other exotica. See ziptools.py for more details.

ABOUT SOURCE PATHS:

Path separators in created zipfiles always use Unix '/', even on Windows.
This is in accordance with the zip standard, and ensures interoperability.

This script allows source items to be named by either relative or absolute
pathnames, and generally stores items in the zip file with the paths given.
When extracted, items are stored at their recreated paths relative to an
unzip target folder (see zip-extract.py for the extract side of this story).

In more detail, this script does nothing itself about any absolute paths
(e.g., "/dir"), relative path up-references (e.g., "..\dir"), or drive and
UNC network names on Windows (e.g., "C:\", "\\server") on creates. The
Python zipfile module used here (and ziptools' symlink adder that parrots
it) strips any leading slashes and removes both drive and network names and
embedded ".." on archive writes, but other oddities, including leading "..",
will be retained in the created zip file's item names.

Some zip tools may have issues with this (e.g., WinZip chokes on ".."), but
the companion script "zip-extract.py" here will always remove all of these
special-case syntaxes, including leading "..", to make item extract paths
relative to (and hence stored in) the unzip destination folder, regardless
of their origin. See that script for more details.

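SKETCH - the path handling described above can be observed directly with
the standard zipfile module. This is an illustration only, not ziptools
code: archive_name is a hypothetical helper, it assumes Python 3.6 or later
(for ZipInfo.from_file), and it says nothing about ziptools' own symlink
adder. The path passed in must name an existing file.

```python
import zipfile

def archive_name(path):
    # Hypothetical helper: return the item name that the zipfile module
    # would record for the existing file at path. Drive names and leading
    # slashes are dropped and embedded ".." is collapsed, but a leading
    # ".." on a relative path is retained.
    return zipfile.ZipInfo.from_file(path).filename
```

Calling this on a relative up-reference such as '../f.txt' returns the
name unchanged, which is the case that some other zip tools reject.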
Still, if you're going to use this script's output in other zip tools, for
best results run it from the folder containing the items you wish to zip
(or its parent), avoiding ".."-rooted paths:

    c:\> cd YOUR-STUFF
    c:\YOUR-STUFF> py -3 scriptpath\zip-create.py thezip x y z

The zipfile module's write() also allows an extra 'arcname' argument to
give an archive (and hence extract) pathname for an item that differs from
its filename, but it's not exposed for end-users here (it is used by
ziptools, but only internally to distinguish local-file from archive paths
as part of the support for '\\?'-prefixed long paths on Windows).
[See the update ahead: "-zip@path" now supports a zip-wide 'arcname'.]

Python's os.path.commonpath() (available in 3.5 and later only) or similar
might be used to remove common path prefixes as an option if all items are
known to be in the same path, but it is not employed here - the full paths
listed on the command line are stored in the zipfile and will be recreated
in later extracts relative to an extract target dir.

For example, a file named as 'a/b/c/f.txt' is zipped and unzipped to an
extract target folder E as 'E/a/b/c/f.txt', even if all other items zipped
are in 'a', 'a/b', or 'a/b/c'. Hence, to minimize common path prefixes in
the zip, cd to a common folder of zip sources before running this script
when a use case warrants it.

UPDATE [1.2] - ALTERNATE ZIP PATHS:

ziptools version 1.2 adds a "-zip@path" command-line option (and its
corresponding function-call argument), which replaces the given path in all
zipped items with an alternate path. This can be used to change, expand,
shorten, or fully remove the paths of zipped items, and hence the paths at
which they are unzipped. A "-zip@.", for instance, makes items top-level
and unnested, and renders pre-zip cd commands largely optional. For details
and examples, see: "_README.html#altpaths12".

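SKETCH - the 'arcname' and os.path.commonpath() notes above describe an
option that this script does not implement. A minimal sketch of that
option, using only the standard library, might look like the following.
The name zip_without_common_prefix is hypothetical, the code is not how
"-zip@path" works, and it assumes Python 3.5 or later and file (not
folder) sources that are all relative or all absolute.

```python
import os
import zipfile

def zip_without_common_prefix(zippath, files):
    # Hypothetical helper, not part of ziptools: store each file under
    # its path relative to the deepest folder shared by all files.
    # os.path.commonpath() raises ValueError for an empty list or for a
    # mix of absolute and relative names.
    common = os.path.commonpath([os.path.dirname(name) for name in files])
    with zipfile.ZipFile(zippath, 'w', zipfile.ZIP_DEFLATED) as zf:
        for name in files:
            # arcname is the path recorded in the zip and used on extract
            arcname = os.path.relpath(name, common or os.curdir)
            zf.write(name, arcname=arcname)
    return common
```

With sources 'a/b/c/f.txt' and 'a/b/g.txt', the shared folder is 'a/b',
so the items are stored as 'c/f.txt' and 'g.txt' rather than under 'a/b'.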
MORE ABOUT SOURCES: WILDCARDS AND DOT

Also note that source arguments can include any number of folders, files,
or both. Any Unix-style "*"s in sources are applied before this script
runs, and may expand to either file or folder names.

If you list just simple files as sources and no folders (with or without
any Unix "*" expansions), no folder nesting occurs in the created zipfile
or its extraction (the zipfile will be all top-level files). If you list
folders, they will be recreated in the extract. See test-simple-files/ in
moretests/ for an example of file-only zips.

Example: you can include the entire contents of a folder as unnested
top-level items in the zip, by running a zip with a "*" source after a cd
into the subject folder, and using a zipfile target path outside the folder
being zipped (including a zip in itself may get stuck):

    cd dir; $TOOLS/zip-create.py ../allhere.zip * -skipcruft

This avoids folder nesting on extracts for all items in the folder: the
zipfile can be extracted directly in its files' destination, and items need
not be moved or copied after the extract.

By contrast, a source "dir/*" or "dir" will instead record items as nested
in the zip, and extract the items within their "dir" folder. This is better
for multiple folders that may have same-named items, and may be safer (an
accidental unzip won't trash files in ".").

Special case: using "." (the current working directory) as a source
argument zips all items in the "." folder as top-level, unnested items.
This is an implementation artifact, and is roughly the same as "*", except
that "." will zip ".xxxx" hidden files and "*" (globs) won't.

UPDATE [1.2] - WILDCARDS AND ALTERNATE ZIP PATHS:

Items zipped from a folder can also be made unnested in a zipfile with the
1.2 alternate-zip-path extension described above, and without a preliminary
cd command. The equivalent to the above:

    $TOOLS/zip-create.py allhere.zip dir/* -skipcruft -zip@.

UPDATE [1.1] - WILDCARDS ON WINDOWS:

As an accommodation to Windows usage, this script now automatically expands
(a.k.a. 'globs') any "*" wildcards in sources, if not expanded by the shell.
It also matches any remaining "?" single-character and "[]" range operators
in sources. This means you can use "*" and the others in a Windows DOS
shell, and elsewhere, to expand into matching file and folder names.

Although primarily meant for Windows users who don't want to use their
Linux subsystem, this also works in interactive mode, and for quoted
operators in Unix shells, applying the Python "glob" module uniformly on
all platforms to expand source patterns. This may also be useful in IDEs
that support command lines but don't pass them through shells.

For more on allowed patterns, see:
https://docs.python.org/3.5/library/glob.html.
The glob is case sensitive only on OSs that are too (i.e., Unix).

Note that auto-globs are performed only for command lines here; calls to
ziptools.createzipfile() must glob.glob() sources manually.

[Former update: you may also use "*" expansions on Windows by running
ziptools' scripts from the bash shell in the Windows Subsystem for Linux
that's now part of Windows 10; see the web for pointers.]

[Former caveat: this could support "*" expansion on Windows too, by running
source arguments through glob.glob(), though Windows can run Unix-like
shells (e.g., via cygwin). If required, write a simple launcher script that
runs this script with os.system(), and send it the ' '.join() for
glob.glob() or os.listdir() run on sources.]

ABOUT LARGE FILES

ziptools always uses the ZIP64 option of Python's zipfile module to support
files larger than zip's former size limits, both for zips and unzips (i.e.,
creates and extracts). Unfortunately, some Unix "unzip" command-line
programs may fail or refuse to extract zipfiles created here that are
larger than 2 (or 4) GB.
Both the zip-extract.py script here and Finder clicks on Mac OS handle such
files correctly, and other third-party unzippers may as well. If none of
these are an option you may need to split your zip into halves/parts, but
this is a last resort; if you can find or install any recent Python 2.X or
3.X on the unzip host, it will generally suffice to run ziptools'
zip-extract.py for large files.

ABOUT PERMISSIONS

Permissions are requested on extracts only, not for creations here; create
always stores permissions in the zip, even for symlinks in [1.1]. See the
README and extract script for more on permissions propagation.

See zip-extract.py for usage details on the zip-extraction companion script.
See ziptools/ziptools.py's docstring for more on this script's utility.

Coding notes, [1.1] auto-glob:
- The addition expands any unexpanded * and ? wildcards and [] ranges:
  sources ['*.py', 'FILELINK?', 'plain', 'FILELINK[12]', 'FILE*INK?']
  expand the same as they do unquoted in a Unix shell, on any platform
  (though glob.glob filters out nonexistent names and omits any '.*').
- A "t = []; list(map(t.extend, globs))" would do the same as code below.
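
SKETCH - the auto-glob flattening in the coding notes above can be written
in several equivalent ways. The following is an illustration, not this
script's own code: the three function names are hypothetical, and the []
initializer passed to reduce() is an addition here so that an empty
pattern list yields [] instead of raising TypeError.

```python
import glob
import operator
from functools import reduce
from itertools import chain

def expand_sources(patterns):
    # Hypothetical helper: glob each pattern, then concatenate the
    # per-pattern result lists; nonexistent names simply drop out.
    globs = [glob.glob(pattern) for pattern in patterns]
    return reduce(operator.add, globs, [])     # [] initializer is an addition

def expand_sources_extend(patterns):
    # Same result, in the spirit of "t = []; list(map(t.extend, globs))".
    expanded = []
    for pattern in patterns:
        expanded.extend(glob.glob(pattern))
    return expanded

def expand_sources_chain(patterns):
    # Same result via itertools.chain, another common way to flatten.
    return list(chain.from_iterable(glob.glob(p) for p in patterns))
```

All three return plain names for sources that exist, skip names that do
not, and leave out ".xxxx" hidden files when matching "*".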
==============================================================================
"""

from __future__ import print_function         # py 2.X, currently optional here
import ziptools, sys, os                      # get ziptools/ package here

from __version__ import showVersion           # [1.3] display version number
showVersion()

# portability
RunningOnPython2 = sys.version.startswith('2')
RunningOnWindows = sys.platform.startswith('win')

if RunningOnPython2:
    input = raw_input                         # py 2.X compatibility

import glob, operator                         # [1.1] autoglobs
if not RunningOnPython2:                      # reduce import required in py 3.X
    from functools import reduce              # import ok but spurious in py 2.X

# defaults: customize as desired
from ziptools import cruft_skip_keep

# avoid Windows Unicode printing errors by munging [1.2]
from ziptools import print

usage = 'Usage: ' \
    '<python> zip-create.py ' \
    '[zipfile source [source...] [-skipcruft] [-atlinks] [-zip@path] [-nocompress]]'

interactive = False


# It makes no sense to try to keep the Windows console open on exit unless
# interactive: command-line args imply that this is not an icon-click run.
# ['PROMPT' not in os.environ] loosely IDs icon click, but is overkill here.

def error_exit(message):
    print(message + ', run cancelled.')
    print(usage)
    if interactive and RunningOnWindows:
        input('Press Enter to close.')        # clicked on Windows: stay up
    sys.exit(1)

def okay_exit(message):
    print(message + '.')
    if interactive and RunningOnWindows:
        input('Press Enter to close.')        # ditto: stay open on Win
    sys.exit(0)                               # or os.isatty(sys.std{in,out})

def reply(prompt=''):
    if prompt: prompt += ' '
    try:
        return input(prompt)                  # exit gracefully on control+c [1.3]
    except KeyboardInterrupt:
        okay_exit('\nRun aborted by control-c')


# command-line mode
if len(sys.argv) >= 3:                        # 3 = script zipto source...
    skipcruft = {}
    if '-skipcruft' in sys.argv:              # anywhere in argv
        skipcruft = cruft_skip_keep
        sys.argv.remove('-skipcruft')

    atlinks = False
    if '-atlinks' in sys.argv:                # anywhere in argv
        atlinks = True
        sys.argv.remove('-atlinks')

    nocompress = False
    if '-nocompress' in sys.argv:             # anywhere in argv
        nocompress = True
        sys.argv.remove('-nocompress')

    # zip-at path [1.2]
    zipat = None
    zipix = [ix for (ix, val) in enumerate(sys.argv) if val.startswith('-zip@')]
    if len(zipix) > 1:
        error_exit('Only one -zip@ allowed')
    elif zipix:
        ziparg = sys.argv.pop(zipix[0])
        zipat = ziparg.split('@', 1)[1]       # split once: path may contain '@';
                                              # okay if empty: '' same as '.'
    if len(sys.argv) < 3:
        error_exit('Too few arguments')

    # [1.3] rstrip zipto, else makes dir/.zip (sources okay)
    zipto, sources = sys.argv[1], sys.argv[2:]
    zipto = zipto.rstrip(os.sep)
    zipto += '' if zipto[-4:].lower() == '.zip' else '.zip'

# some args, but not enough [1.1]
elif len(sys.argv) > 1:
    error_exit('Too few arguments')

# interactive mode (e.g., some IDEs)
else:
    interactive = True
    zipto = reply('Zip file to create?').strip() or '_default'   # [1.1] +strip, dflt
    zipto = zipto.rstrip(os.sep)                                  # [1.3] else dir/.zip
    zipto += '' if zipto[-4:].lower() == '.zip' else '.zip'

    sources = reply('Items to zip (comma separated)?')
    sources = [source.strip() for source in sources.split(',')]

    skipcruft = reply('Skip cruft items (y=yes)?').lower() == 'y'
    skipcruft = cruft_skip_keep if skipcruft else {}

    atlinks = reply('Follow links to targets (y=yes)?').lower() == 'y'
    zipat = reply('Alternate zip path (.=unnested, enter=none) ?') or None   # [1.2]
    nocompress = reply('Disable item compression (y=yes)?').lower() == 'y'   # [1.3]

    # use print() to avoid Unicode aborts in input() [1.2]
    print("About to ZIP\n"
          "\t%s,\n"
          "\tto %s,\n"
          "\t%s cruft,\n"
          "\t%sfollowing links,\n"
          "\tzip@ path %s,\n"
          "\t%scompressing items\n"
          "Confirm with 'y'? "
          % (sources, zipto,
             'skipping' if skipcruft else 'keeping',
             '' if atlinks else 'not ',
             '(unused)' if zipat is None else repr(zipat),
             'not ' if nocompress else ''),
          end='')
    verify = reply()
    if verify.lower() != 'y':
        okay_exit('Run cancelled')

# catch user errors asap [1.1]
for source in sources:
    if not any(c in source for c in '*?[') and not os.path.exists(source):
        error_exit('Source file "%s" does not exist' % source)

# auto-glob: expand unexpanded *, ?, [] (see coding notes above) [1.1]
sources = reduce(operator.add, [glob.glob(source) for source in sources])

# post glob removals of invalids
if not sources:
    error_exit('No existing source files provided')

# os.remove(zipto) not required: zipfile opens it in 'wb' mode

# the zip bit
stats = ziptools.createzipfile(
            zipto, sources,
            cruftpatts=skipcruft, atlinks=atlinks, zipat=zipat, nocompress=nocompress)

okay_exit('Create finished: ' + str(stats))   # [1.1]