#!/usr/bin/env python3
# -*- coding: utf8 -*-
r"""
================================================================================
deltas.py: a fork of mergeall.py for saving delta sets [3.2]

To make deltas:
    $ python3 deltas.py DIRDELTA FROM TO -skipcruft -quiet

To apply deltas:
    $ python3 mergeall.py DIRDELTA TO -restore -auto -skipcruft -backup -quiet

Synopsis: this mergeall.py variant saves changes (a.k.a. deltas) made to FROM
separately, instead of applying them to TO immediately. The changes saved
in the DIRDELTA folder may be archived, and may be used to bring a TO content
copy in sync later using mergeall.py's "-restore" option.

Version: now part of all Mergeall packages [3.3]
License: provided freely, but with no warranties of any kind
Author:  © M. Lutz 2019-2021 (learning-python.com)
Web:     learning-python.com/{mergeall.html, fold3-vs-deltas.html}
Demo:    test/test-deltas-3.2
Hosts:   runs on macOS, Windows, Linux, Android, and others
Python:  runs on 3.X and 2.X (e.g., 3.9 and 2.7), 3.X strongly preferred

======
STATUS
======

[3.3] For Mergeall 3.3, this script was updated for new difference-list
structures, and their unnormalized Unicode from/to filenames. For more on
this change, see fixunicodedups.py's top-of-file docstring. With 3.3, this
script also issues a message for each __added__.txt item, as "listed", and
omits normalization messages for non-NFC filenames if passed "-quiet".

This script was initially released only in Mergeall's source-code package
but is now in all packages as of the Feb-2022 rebuild. To install, fetch
Mergeall's source-code or other packages (learning-python.com/mergeall.html)
and unzip; this script is in the top-level Mergeall-source/ folder in the
source code, and available as a frozen executable in other packages.

This script is a fork, both because its command-line pattern differs from
mergeall.py's (it requires an additional argument, and doesn't support
others), and because mergeall.py is largely frozen to major changes.
To accommodate this script, a handful of smaller mods were applied to
Mergeall's base code, including:

- mergeall.py's comparelinks() was augmented to skip read errors,
  an issue for BDRs uncovered here
- mergeall.py's mergetrees() prints just one message for unique TOs
  skipped during -restore runs
- mergeall.py's excludeskips() was modified to skip __added__.txt
  in FROM so it's not propagated
- backup.py's removeprioradds() now backs up its deletions, so
  delta-set updates can be rolled back
- backup.py's removeprioradds() now mangles nonportable characters
  in __added__.txt names on Windows
- backup.py's noteaddition() was fixed for the same trailing-slashes
  + unnested-backups issue

Search for "[3.2]" in the above files to locate all base-code changes.
Also search for '#DELTAS' in this file to find all changes applied in
this fork itself. The latest mods here, most-recent first:

- Pushed out to app and executable packages with the Feb-2022 rebuild
- Fix for folder-path args with trailing slashes and unnested content
- Verified on Python 2.X, though 3.X is preferred for Unicode filenames
- Mergeall removals from __added__.txt are retried with "_" name mangling
- Deltas save code was moved from __main__ to a function for readability
- A "from mergeall import *" was used to cut code redundancy radically
- DIRDELTA was changed from in-file setting to command-line argument

=====
ROLES
=====

This script is the same as mergeall.py, but instead of updating TO, this
script saves all new/recent changes in FROM to a separate folder. Because
this folder has just recent changes, it's generally small, and can be
quickly burned to optical disk after a longer full burn.

Because changes are saved in the same format as Mergeall backups, they can
also be automatically applied to TO later, using mergeall.py's "-restore"
option. This makes TO the same as FROM just like a normal mergeall.py run,
but the update is deferred and applied on demand.
In 2021, this script also found a role in syncing content-copy changes to
Android phones without microSD or POSIX USB access, by computing delta sets
to proxy drives here, copying deltas to phone by file explorer, and merging
them on the phone with mergeall.py's "-restore". For more on this use case:
https://learning-python.com/fold3-vs-deltas.html#deltasworkaround.

=====
USAGE
=====

In general, mergeall.py's comparison-phase arguments all work here, but its
resolution-phase arguments do not because no TO changes are applied, and an
extra first argument gives the deltas path:

    [py[thon]] deltas.py dirdelta dirfrom dirto
                   [-report] [-peek] [-quiet] [-skipcruft]

Where:
    dirdelta   => deltas save-folder pathname (extra argument here)
    dirfrom    => source-tree pathname (this tree is never changed)
    dirto      => destination-tree pathname (this tree is never changed)
    -report    => report differences only and stop, making no changes
    -peek      => check N start/stop bytes too when comparing same-named files
    -quiet     => suppress Unicode-normalization messages during comparisons [3.3]
    -skipcruft => ignore cruft (a.k.a. metadata) files/dirs in both FROM and TO [3.0]

This script is generally used in a two-run process:

1) COLLECTION

To save but not apply changes, run this script from a command line like the
following. This records all changes made to FROM (arg2) not yet in TO (arg3),
in separate folder DIRDELTA (arg1):

    $ python3 deltas.py DIRDELTA FROM TO -skipcruft

In this command, the extra first argument DIRDELTA gives the pathname to a
folder where the deltas set is saved; this folder is automatically emptied
and created if needed. mergeall.py's "-skipcruft" is useful here to ignore
platform cruft (e.g., macOS ".DS_Store" files) in both initial comparisons
and delta saves.
As a more-concrete example:

    $ python3 deltas.py DIRDELTA ~/MY-STUFF /Volumes/SSDT7/MY-STUFF -skipcruft > ~/deltas-log.txt

Compared to mergeall.py: the "-report" switch works in this script to preview
differences before saving them, and is the same as running mergeall.py with
the same flag. Conversely, "-auto" don't-ask mode isn't required or used
here, because no TO updates are applied. For the same reason, "-backup",
"-restore", and "-verify" are not used here; "-backup" is useful when
mergeall.py is run later with "-restore", per the next section.

2) APPLICATION

To later update TO for the set of changes recorded in the DIRDELTA save
folder, copy TO to writeable media if required (e.g., see ./cpall.py), and
run mergeall.py with its "-restore" option, listing TO as the merge target
and DIRDELTA as the FROM source (where DIRDELTA may be either what it was
initially or a location to which it was copied):

    $ python3 mergeall.py DIRDELTA TO -restore -auto -skipcruft -backup

The "-restore" option was originally coded to rollback changes from backups,
but works equally well to apply delta sets (for reasons explained in the
next section). An "-auto" here applies delta-set changes without pausing to
verify each; "-skipcruft" ignores platform-specific files as usual; and
"-backup" enables deltas-set rollbacks (undos) as covered in the ROLLBACKS
section ahead.

As another concrete example:

    $ python3 mergeall.py DIRDELTA /Volumes/SSDT7/MY-STUFF -restore -auto -skipcruft > ~/apply-log.txt

For a demo and proof of this script's collection and application steps in
action, see folder test/test-deltas-3.2/ in this package; unzip that folder's
zipfile to rerun its tests live.

Notes: mergeall.py's "-verify" flag doesn't make sense for delta-set apply
runs (FROM is the deltas set, not the original FROM); run again in "-report"
mode to verify changes.
Also, some log errors are expected if the TO tree differed at deltas create
and apply times, and backup-file sets may differ in the two trees normally
unless one is a direct copy of the other; run mergeall.py post apply to
analyze and sync as needed.

======
THEORY
======

This script finds all changes required to make folder TO (dirto) the same as
FROM (dirfrom) as usual, but then saves all items that were changed in FROM
(samefile) and added in FROM (uniquefrom), along with a list of items deleted
in FROM (uniqueto). These are all saved in a third and separate folder,
without changing TO.

The resulting save folder serves the following two primary roles:

1) Archiving Changes

The separate save folder can be used for archiving incremental changes made
to FROM since the last update to TO as a distinct set, rather than merging
the changes into TO as usual. This is useful for recording recent changes
to large archives already saved to slow optical disk: burn a much smaller
save folder as a supplement to the full burn.

The save folder is stored in the same form as mergeall.py's __bkp__ tree-root
backup folders, saved for mergeall.py's "-backup":

- Changes and additions appear in the save folder at their full nested
  folder paths
- The simple text file DIRDELTA/__added__.txt lists uniqueto paths, one
  per line

In both cases, items recorded include both files and dirs, and all folder
paths are relative to the TO root. The latter, __added__.txt, really
reflects items deleted from FROM (there's nothing to delete in this context),
but naming it __added__ allows for automatic application per the next section.

2) Applying Changes

Besides their archiving role, the changes recorded in the separate save
folder can also be later applied to TO to bring it up to date with FROM, by
using mergeall.py's rollbacks feature--though the result is really a
roll-forward here.
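As a rough illustration of the save-folder layout described in 1) above, the
sketch below maps a path in the TO tree to its place in the deltas folder,
and to its entry in __added__.txt. The names delta_path() and added_entry()
are hypothetical and are not part of Mergeall; this script itself uses a
prefix .replace() and a length-based slice rather than os.path.relpath(),
so treat this as an equivalent idea under simplified assumptions, not as
the code that runs here.

```python
import os

def delta_path(path_in_to, to_root, delta_root):
    # Hypothetical helper: swap only the leading TO-root prefix for the
    # deltas root, keeping the full nested path below it.
    tail = os.path.relpath(path_in_to, to_root)
    return os.path.join(delta_root, tail)

def added_entry(path_in_to, to_root):
    # Hypothetical helper: an __added__.txt line is the item's path
    # relative to the TO root (relpath ignores a trailing separator).
    return os.path.relpath(path_in_to, to_root)

to_root = os.path.join('vol', 'MY-STUFF')
item = os.path.join(to_root, 'docs', 'notes.txt')
print(delta_path(item, to_root + os.sep, 'DIRDELTA'))
print(added_entry(item, to_root + os.sep))
```

Using relpath() sidesteps the trailing-separator issue that this script
handles by stripping separators from its folder arguments up front.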
To automatically apply the save folder's set of changes to TO, run
mergeall.py with its "-restore" option, naming the save folder (DIRDELTA)
as FROM and the original TO (or a copy of it) as TO: all the original FROM
changes and additions will be copied from DIRDELTA to TO, and all items
listed in DIRDELTA's __added__.txt will be removed from TO.

The net effect makes TO the same as the original FROM. (Technically, TO
will be made the same as FROM's state at the time DIRDELTA was created,
assuming TO has not changed since the deltas were saved.)

This works because applying a delta-changes set to make TO the same as a new
FROM is equivalent to applying a backup-changes set to rollback TO to be the
same as a former TO which is imaginary but is the same as the new FROM. This
might just make your head explode, but suffices to apply DIRDELTA changes
automatically.

Notes: be sure to first copy TO to writable media to enable changes if needed
(unwritable optical disk archives won't work as TO directly). Also, if you
wish to apply changes with automated means other than mergeall.py's
"-restore," be sure to apply deletes before adds (see mergeall.py's "SUBTLE
THING" note for the, well, subtle reason for this; it has to do with renames
on case-insensitive filesystems).

For more on mergeall.py rollbacks (a.k.a. restores), see the following,
either online at site https://learning-python.com/mergeall-products/unzipped/,
or offline in your installed Mergeall folder:

    docetc/MoreDocs/Whitepaper.html#restores
    UserGuide.html#rollbacks

Subtlety: mergeall.py's "-restore" mode also retries removals from
__added__.txt which fail, after mangling nonportable filename characters to
"_". This can be important for delta sets created on Unix but applied on
Windows, because other tools (e.g., ziptools) may have changed characters
the same way in the Windows copy.
Mergeall in general assumes that FROM names are mangled as needed by unzips
or other copies before syncs are run on Windows; __added__.txt lists evade
this. To avoid this step, see and run fix-nonportable-filenames.py.

Rollbacks are the mechanism used to apply delta sets made by this script,
but they can also be used to undo delta-set changes, per the following
section.

=========
ROLLBACKS
=========

Thanks to a minor mod in Mergeall's backup.removeprioradds(), it's now
possible to rollback (undo) all changes made when applying a deltas set to
TO, as long as you also use the mergeall.py "-backup" switch when applying
the deltas with "-restore". That is, use commands:

    $ python3 mergeall.py DIRDELTA TO -restore -auto -skipcruft -backup
        to apply a delta set's changes per above and save backups

    $ python3 mergeall.py TO/__bkp__/ TO -restore -auto -skipcruft
        to rollback all changes made when applying the deltas set

For the second of these, pass the path to the latest backups-set folder that
the first command saves in the TO archive's __bkp__ folder as usual, per
Mergeall's docs (e.g., docetc/MoreDocs/Whitepaper.html#restores).

Deltas rollbacks _almost_ worked formerly if "-backup" was used:

- Items changed and added in TO from a deltas set could be restored,
  because "-backup" backs up changes and notes additions as usual

- Items listed in a delta set's __added__.txt and hence removed from TO
  could _not_ be restored, because removeprioradds() didn't save them

To improve the second of these, removeprioradds() now always backs up the
items it removes from TO, instead of simply deleting them. For deltas applied
with "mergeall.py -restore -backup", this means that items which were unique
in TO will be put back by a later run of "mergeall.py -restore", along with
undoing TO replacements and additions. This is a full rollback.
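The back-up-before-remove idea above can be sketched as follows. This is a
minimal illustration only, not backup.removeprioradds() itself: the function
name remove_with_backup(), its arguments, and the choice of shutil.move()
are all assumptions made for the example, and it skips the Unicode
normalization, name mangling, and long-path handling that Mergeall performs.
It also assumes Python 3.X.

```python
import os
import shutil

def remove_with_backup(to_root, added_file, backup_root):
    '''Hypothetical sketch: move each item listed in added_file out of
    to_root and into backup_root, so the removal can be undone later.'''
    moved = []
    with open(added_file, encoding='utf-8') as listing:
        for line in listing:
            tail = line.rstrip('\r\n')        # path relative to the TO root
            if not tail:
                continue
            victim = os.path.join(to_root, tail)
            if not os.path.lexists(victim):
                continue                      # already gone: nothing to save
            saved = os.path.join(backup_root, tail)
            os.makedirs(os.path.dirname(saved), exist_ok=True)
            shutil.move(victim, saved)        # keep the item instead of deleting it
            moved.append(tail)
    return moved
```

Because each removed item lands in the backup tree at the same relative path,
copying the backup tree back over TO restores what was removed.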
This removeprioradds() mod also makes it now possible to roll back true
rollbacks made with "-restore -backup", restoring an archive to its former
post-sync-pre-rollback state ('unrolled' for lack of a better term). This
use case seems rare, but is supported at a small cost in extra backups size.

Related change: "mergeall.py -restore" also now prints just one message for
all unique TOs skipped, because this category includes most items in the
archive during a deltas apply. This also reduces output volume during normal
rollbacks, but deltas escalated this output from rare to common.

Related note: Mergeall's rollback.py won't work for applying delta sets,
because that script searches a set of backup folders for the latest to pass
to a mergeall.py "-restore" run. For deltas, there is a single folder.

======
DESIGN
======

CODE: deltas applications (and rollbacks) could use custom code that simply
removes __added__.txt items from TO and adds FROM items to TO, instead of
being routed through normal syncs' logic. This would avoid a pointless
comparison phase, make run logs simpler, and remove some convolution from
mergeall.py's resolution-phase code. OTOH, it would also have to deal with
much complexity separately and redundantly; unique FROM filenames, for
example, would have to be checked for normalized Unicode equivalents in TO
via filesystem probing. Using the usual mergeall.py comparison logic avoids
this complexity.

CODE: this file was originally a copy of mergeall.py, which replaced the
resolution phase with custom code that changes TO to the save-folder's path.
The code was eventually cleaned up; its redundancy was cut with a "from
mergeall import *"; and its diffs are now marked with "#DELTAS".

CONS: even after cutting its redundancy, this script is still logically
divergent from mergeall.py, and its __main__ code overlaps significantly.
Moreover, the mergeall.py "-restore" switch is now overloaded to apply to
both delta applies and true rollbacks, and this seems complex in hindsight.
This may have been addressed by adding delta saves to mergeall.py as a new
"-deltas" mode, and/or generalizing backups and deltas to be instances of a
new delta-sets paradigm applied with a new "-apply" switch.

PROS: on the other hand, the code to apply deltas and rollbacks is the same,
and Mergeall is too frozen to do any better today. Moreover, deltas seem
logically disjoint, and may be better handled separately to avoid convolution
of basic change propagations. For example, this script's command line
differs substantially (see USAGE), and command-line patterns which depend on
their own switches are confusing at best. GUI support would also require a
new run mode (or two) and a new folder for deltas, and may be better stand
alone.

All of which is TBD, but this script works as is to achieve its main goal:
to allow TO to be made the same as FROM both later and on demand. In
contexts like archiving and phones, this minor twist is a major win.
================================================================================
"""


###################
# CODE STARTS HERE
###################


#-------------------------------------------------------------------------------
# Py 2.X compatibility (also gets 2.X input() from mergeall.py).
#-------------------------------------------------------------------------------

from __future__ import print_function    # print(), top of script


#-------------------------------------------------------------------------------
# This script is identical to mergeall.py, except for mods to the
# __main__ logic, a new deltas-save function, and a custom getargs().
#-------------------------------------------------------------------------------

from mergeall import *    # use mergeall's globals directly, imports included
import mergeall           # to change mergeall.anyErrorsReported, [3.3] mar22


#-------------------------------------------------------------------------------
# Use open() on Python 3.X and codecs.open() on 2.X; also callout
# eolns because the latter is binary and doesn't auto-expand '\n'.
#-------------------------------------------------------------------------------

from backup import unicode_open, unicode_linesep
from backup import ADDENC     # encoding for __added__.txt: 'utf-8'
from backup import indent1    # set off listing messages from others


#-------------------------------------------------------------------------------
# Use os.makedirs(exist_ok=True) on Python 3.X, catch exc on 2.X.
# Also run paths through FWP() for very long pathnames on Windows.
#-------------------------------------------------------------------------------

from backup import makedirs_ifneeded


#-------------------------------------------------------------------------------
#DELTAS
# BDRs don't record symlinks properly, and trigger OSErrors on macOS
# during the comparison phase, which ends the program run prematurely.
# This isn't unique to deltas, but was discovered in one of its use cases.
# To fix, added a try/except skip in mergeall.comparelinks() directly,
# instead of an original skip and monkey-patch here (now defunct).
#-------------------------------------------------------------------------------

"""DEFUNCT
def comparelinks(name, dirfrom, dirto, statfrom, statto, diffs):
    # now fixed in mergeall.py

mergeall.comparelinks = comparelinks    # monkey-patch hack
DEFUNCT"""


##################################################################################
# COMPARISON PHASE: collect FROM/TO differences - see mergeall.py
##################################################################################


##################################################################################
# RESOLUTION PHASE: save changes for archiving and/or later application
##################################################################################


def savedeltas(diffs, uniques, mixes, DIRDELTA, cmdargs_dirto, cmdargs_skipcruft):
    """
    -------------------------------------------------------------------------
    #DELTAS: Implementation of the deltas mod. The DIRDELTA folder has
    been cleared and created if needed before this is called.

    This code used to be embedded in __main__ as a prototype, but has been
    pulled out to be a function for better readability. It could become a
    mergeall.py '-deltas' argument and mode, but Mergeall is not scheduled
    for major code changes or app rereleases any time soon.

    This is called after comparetrees() has collected differences in diffs,
    uniques, and mixes. It replaces the normal mergetrees() step, which
    applies changes to TO, with custom code that instead saves changes made
    in FROM to the folder named by DIRDELTA.

    These saved changes are stored in a folder tree at paths relative to the
    cmdargs_dirto (TO) root, and in the same format as __bkp__ backups. They
    can be applied to TO separately and later, by running a 'mergeall.py
    -restore' per usage details at the top of this file.
    Pythons: this uses backup.py's unicode_open() instead of open() to
    support Python 2.X, though 2.X has reached its "end of life" today and
    has been tested least, and supporting two differing lines is substantial
    extra work. When in doubt, use 3.X for this script.

    Subtlety: this cannot just join(DELTAS, name), because it needs to
    replace only the _prefix_ of a possibly much longer dirto path.

    Subtlety: the .replace() calls and archtail length calc here can be
    thrown off by a trailing / or \ in cmdargs_dirto for unnested items
    (only). It's now removed in getargs(), to avoid save errors and
    leftmost-char truncation in __added__.txt for top-level items. The
    comparison walker's +os.sep adds /, but join() drops one here.
    backup.noteaddition() had the issue too; backup.backupitem() didn't.

    Subtlety: the os.makedirs() run by makedirs_ifneeded() stamps all folders
    in a saved item's path with modtime = "now". This is fully irrelevant,
    because the only item copied to TO when deltas are applied is at the end
    of a path that already exists in TO - modtimes on the existing path in
    the deltas folder are unused in mergeall.py, and cpall.py copies correct
    modtimes to the new|changed item at the end. Folder modtimes may change
    on nested-item saves, but this is normal, and reflects the fact that a
    nested item was modified or added. Mergeall doesn't compare folder
    modtimes; they're nebulous at best.

    [3.3] The structure of the difference lists created by mergeall.py's
    comparison phase has changed slightly to accommodate its new Unicode
    normalization of filenames; use the new structures here too. These
    lists' namefrom/nameto filenames are the original and unnormalized forms
    which may differ if they were normalized for comparison, but this is
    largely moot for the deltas create here: nameto may differ again and
    arbitrarily when the saved deltas are later applied with mergeall.py's
    -restore, which will normalize for comparisons again.
    Deletion paths in __added__.txt will also be morphed for a later TO.

    Note: the coding here assumes dirtos can be stored on DELTAS device.

    [3.3] mar22: must change mergeall.anyErrorsReported here, not global
    anyErrorsReported. The former is checked in mergeall.summaryreport()
    as a global, which means in mergeall's module, not this module. Using
    'global' here means in this (deltas) module only. This flag was also
    formerly unset in mergeall.py (its 'global' was accidentally moved into
    a docstring after 2017), and cpall's version lacked 'global' altogether.
    -------------------------------------------------------------------------
    """
    global countresolve      # not required, but polite
    join = os.path.join      # not required, but concise
    split = os.path.split    # not required, but symmetric

    def error(message, *args):
        """
        [COPIED]
        Standard message format + exception data? ([1.7.1] show message too!).
        """
        # global anyErrorsReported           # [3.3] mar22: change in mergeall module!
        mergeall.anyErrorsReported = True    # [3.0] for summary line
        print('**Error', message, *args)
        trace(1, sys.exc_info()[0], sys.exc_info()[1])

    #-------------------------------------------------------------------------
    # 1) Samefile -> changed in FROM => copy FROM to deltas
    #
    # - Will overwrite older in TO when deltas applied... as if was replaced
    # - These are same-named files|links that differ by modtimes or linkpaths
    # - Deltas~TO will still differ when mergeall.py -restore is run later
    # - [3.3] namefrom/nameto may differ here, but may differ again on apply
    #-------------------------------------------------------------------------

    for (namefrom, nameto, dirfrom, dirto, why) in diffs:
        pathfrom, pathto = join(dirfrom, namefrom), join(dirto, nameto)
        pathto = pathto.replace(cmdargs_dirto, DIRDELTA)
        try:
            head, tail = split(pathto)
            makedirs_ifneeded(head)
            copyfile(pathfrom, pathto)    # content + modtime
                                          # no __bkp__ made
        except:
            error('saving changed FROM file: skipped', pathfrom)
        else:
            countresolve.files.replaced += 1
            trace(1, 'saved changed FROM file,', pathfrom)    # not 'replaced'

    #-------------------------------------------------------------------------
    # 2) Uniquefrom -> added to FROM => copy FROM to deltas
    #
    # - Will be added to TO when deltas applied... as if was removed
    # - These are any type of item added to FROM since the latest sync
    # - These will still be unique in FROM (deltas) on later -restore run
    # - [3.3] namefrom is in the FROM tree, but may differ again on apply
    #-------------------------------------------------------------------------

    for (uniqs, dirfrom, dirto) in uniques['from']:
        for namefrom in uniqs:
            pathfrom, pathto = join(dirfrom, namefrom), join(dirto, namefrom)
            pathto = pathto.replace(cmdargs_dirto, DIRDELTA)

            if os.path.isfile(FWP(pathfrom)) or os.path.islink(FWP(pathfrom)):
                try:
                    head, tail = split(pathto)
                    makedirs_ifneeded(head)
                    copyfile(pathfrom, pathto)
                except:
                    error('saving new FROM file: skipped', pathfrom)
                else:
                    countresolve.files.created += 1
                    trace(1, 'saved new FROM file,', pathfrom)

            elif os.path.isdir(FWP(pathfrom)):
                try:
                    head, tail = split(pathto)
                    makedirs_ifneeded(head)
                    os.mkdir(FWP(pathto))
                    copytree(pathfrom, pathto, skipcruft=cmdargs_skipcruft)
                except:
                    error('saving new FROM dir: skipped', pathfrom)
                else:
                    countresolve.folders.created += 1
                    trace(1, 'saved new FROM dir, ', pathfrom)

            else:
                trace(1, 'ignored unknown unique type in FROM:', pathfrom)

    #-------------------------------------------------------------------------
    # 3) Uniqueto -> removed from FROM => note TO in deltas __added__.txt
    #
    # - Will be deleted from TO when deltas applied... as if was added
    # - These are any type of item deleted from FROM since the latest sync
    # - These may also be renames, along with a new entry in uniquefrom
    # - These will still be present in TO on later mergeall.py -restore run
    # - [3.3] nameto is in the TO tree, but may differ again on apply
    #-------------------------------------------------------------------------

    addedpath = join(DIRDELTA, '__added__.txt')
    deleteds = unicode_open(addedpath, mode='w', encoding=ADDENC)

    for (uniqs, dirfrom, dirto) in uniques['to']:    # dirfrom unused here
        for nameto in uniqs:
            pathto = join(dirto, nameto)

            # note relative to TO root for -restores
            archtail = pathto[(len(cmdargs_dirto) + len(os.sep)):]

            if os.path.isfile(FWP(pathto)) or os.path.islink(FWP(pathto)):
                countresolve.files.deleted += 1
                deleteds.write(archtail + unicode_linesep)
                trace(1, indent1 + 'listed removed TO file:', archtail)

            elif os.path.isdir(FWP(pathto)):
                countresolve.folders.deleted += 1
                deleteds.write(archtail + unicode_linesep)
                trace(1, indent1 + 'listed removed TO dir:', archtail)

            # or bogus message line in deleteds to defer to -restore?
            else:
                trace(1, 'ignored unknown unique type in TO:', pathto)

    deleteds.close()
    if uniques['to']:    # sans unknowns, at least
        numto = sum(len(uniqs) for (uniqs, dirfrom, dirto) in uniques['to'])
        trace(1, indent1 + 'listed %d TO item(s) in deltas __added__.txt' % numto)

    #-------------------------------------------------------------------------
    # 4) Mixes -> changed type in FROM => copy FROM's version to deltas
    #
    # - Will overwrite version in TO when deltas applied... as if was replaced
    # - Rare, but may occur if a file was changed to a folder, or vice versa
    # - May also be symlink-vs-stubfile, though moot unless modtimes differ
    # - Deltas~TO will still be mixed types on later mergeall.py -restore run
    # - Caveat: this code is identical to case #2 above, sans top-level loops
    #   and messages text, but retained to make this case explicit/standalone
    # - [3.3] namefrom/nameto may differ here, but may differ again on apply
    #-------------------------------------------------------------------------

    for (namefrom, nameto, dirfrom, dirto) in mixes:
        pathfrom, pathto = join(dirfrom, namefrom), join(dirto, nameto)
        pathto = pathto.replace(cmdargs_dirto, DIRDELTA)

        if os.path.isfile(FWP(pathfrom)) or os.path.islink(FWP(pathfrom)):
            try:
                head, tail = split(pathto)
                makedirs_ifneeded(head)
                copyfile(pathfrom, pathto)
            except:
                error('saving mixed FROM file: skipped', pathfrom)
            else:
                countresolve.files.created += 1
                trace(1, 'saved mixed FROM file,', pathfrom)

        elif os.path.isdir(FWP(pathfrom)):
            try:
                head, tail = split(pathto)
                makedirs_ifneeded(head)
                os.mkdir(FWP(pathto))
                copytree(pathfrom, pathto, skipcruft=cmdargs_skipcruft)
            except:
                error('saving mixed FROM dir: skipped', pathfrom)
            else:
                countresolve.folders.created += 1
                trace(1, 'saved mixed FROM dir, ', pathfrom)

        else:
            trace(1, 'ignored unknown mixed type in FROM:', pathfrom)


##################################################################################
# UTILITIES
##################################################################################


def getargs():
    """
    ---------------------------------------------------------------------------
    Get command-line arguments, return False if any are invalid.

    #DELTAS: commands totally differ from mergeall.py - use a custom function
    here that replaces the original version imported from mergeall.py.
    Expect and use an extra first argument: the path name to the folder where
    the deltas should be stored (and auto clean/make this folder if needed).
    Also trim mergeall.py args that apply to a resolution phase not run here.

    [3.2] This now drops trailing / or \ on folder args, if any. Else, they
    wreak havoc with later path .replace() calls and archtail length calcs.

    [3.3] Add -quiet to omit Unicode normalization messages during comparison
    phase (there may be many). -quiet is also used for resolution backups,
    and __added__.txt removals when -restore mode morphs paths to match TO.

    [3.3] Support Windows long paths on machines that need it by using FWP()
    everywhere here (e.g., an existing DELTAS folder may be arbitrarily deep).
    backup.rmtreeworkaround() handles some errors, but does not FWP() itself.
    ---------------------------------------------------------------------------
    """

    def usageerror(message):
        """
        Display usage, show script's docs?
        """
        print('**%s' % message)
        print('deltas run cancelled.')
        print('Usage:\n'
              '\t[py[thon]] deltas.py dirdeltas dirfrom dirto\n'
              '\t\t[-report]\n'
              '\t\t[-peek]\n'
              '\t\t[-quiet]\n'
              '\t\t[-skipcruft]')
        if sys.stdin.isatty() and sys.stdout.isatty():
            if input('More?') in ['y', 'yes']:    # [2.0] for shell, not pipe
                try:
                    help('deltas')                # never used by launchers
                except NameError:                 # and absent in frozen exe [3.3]
                    print('help unavailable in this package')

    def initdeltasfolder(dirdelta):
        """
        Clean+create deltas folder, but not if -report.
        End script now (before comparison phase) on any fail here.
        """
        try:
            if '-report' not in sys.argv:            # don't make folder if -report
                if os.path.exists(FWP(dirdelta)):    # allow Windows longpaths [3.3]
                    # rm existing deltas
                    if os.path.isfile(FWP(dirdelta)):
                        os.remove(FWP(dirdelta))     # end script now on file errors
                    else:
                        shutil.rmtree(FWP(dirdelta, force=True),
                                      onerror=backup.rmtreeworkaround)

                # make new deltas
                os.mkdir(FWP(dirdelta))
                trace(1, 'Saving all deltas to:', dirdelta)
            return True
        except Exception as E:
            return False

    class cmdargs: pass    # a set of attributes

    try:
        # required args
        cmdargs.dirdelta = sys.argv[1].rstrip(os.sep)    # [3.2] drop trailing / or \
        cmdargs.dirfrom = sys.argv[2].rstrip(os.sep)     # else bad len and replace
        cmdargs.dirto = sys.argv[3].rstrip(os.sep)
    except:
        usageerror('Missing dirdelta, dirfrom, or dirto paths')
        return False
    else:
        if not initdeltasfolder(cmdargs.dirdelta):
            usageerror('Could not resolve deltas-save folder path')
            return False
        if not os.path.isdir(FWP(cmdargs.dirfrom)):
            usageerror('Invalid dirfrom directory path')
            return False
        elif not os.path.isdir(FWP(cmdargs.dirto)):
            usageerror('Invalid dirto directory path')
            return False
        else:
            # optional args
            options = ['-report', '-peek', '-quiet', '-skipcruft']    # fewer switches
            for option in options:
                setattr(cmdargs, option[1:], False)
            for option in sys.argv[4:]:
                if option in options:
                    setattr(cmdargs, option[1:], True)
                else:
                    usageerror('Bad command-line option: "%s"' % option)
                    return False
    return cmdargs    # this class is True


##################################################################################
# MAIN LOGIC
##################################################################################


if __name__ == '__main__':
    trace(1, 'deltas %.1f starting' % VERSION)    # Mergeall version
    import time
    gettime = (time.perf_counter if hasattr(time, 'perf_counter') else
               (time.clock if RunningOnWindows else time.time))

    # get and verify parameters from command line
    cmdargs = getargs()
    if not cmdargs:
        sys.exit(1)
    #---------------------------------------------------------------------------
    # COMPARISON PHASE: collect differences
    #---------------------------------------------------------------------------

    trace(1, '-' * 79, '\n*Collecting tree differences')
    if cmdargs.skipcruft:
        trace(1, 'Skipping system cruft (metadata) files in both FROM and TO')

    diffs = []
    uniques = {'from': [], 'to': []}    # lists/dict changed in-place by walker
    mixes = []
    starttime = gettime()

    try:
        comparetrees(cmdargs.dirfrom, cmdargs.dirto,        # from/to roots
                     diffs, uniques, mixes,                 # noted differences
                     cmdargs.peek,                          # file reads?
                     cmdargs.skipcruft,                     # exclude cruft files [3.0]
                     cmdargs.quiet,                         # omit normalization msgs [3.3]
                     skips=['__bkp__', '__added__.txt'])    # exclude top-level specials [2.0]

    #DELTAS: reworded message for deltas
    except Exception as Why:
        # [3.0] friendlier message on comparison failure exits
        print('**Error during comparison phase\n'
              '...The deltas run was terminated by a folder comparisons error, to\n'
              '...avoid a partial changes set. No deltas were saved. Please resolve\n'
              '...the following Python exception before rerunning deltas.py against\n'
              '...the same folders:')
        print(Why.__class__.__name__, Why)
        print('\n...A detailed Python traceback follows:')
        import traceback
        traceback.print_exc()
        sys.exit(1)
    else:
        trace(1, 'Phase runtime:', gettime() - starttime)    # [2.2] time phases

    trace(1, '-' * 79, '\n*Reporting tree differences')
    reportdiffs(diffs, uniques, mixes, dorestore=False)      # handles own exceptions [3.3]

    if cmdargs.report:                                       # report and exit
        summaryreport(diffs, uniques, mixes, deltas=True)    # show totals [2.0] [3.3]
        sys.exit(0)

    #---------------------------------------------------------------------------
    # RESOLUTION PHASE: reconcile differences
    #---------------------------------------------------------------------------

    trace(1, '-' * 79, '\n*Resolving tree differences')
    if cmdargs.skipcruft:
        trace(1, 'Skipping system cruft (metadata) files in FROM folders')
    starttime = gettime()

    #DELTAS: replace TO updates with deltas saves
    """CUTCUTCUT
    mergetrees(diffs, uniques, mixes,               # noted differences
               cmdargs.auto,                        # make changes? else ask
               cmdargs.backup, cmdargs.dirto,       # save items replaced/removed [2.0]
               cmdargs.restore, cmdargs.dirfrom,    # keep unique TO, undo adds [2.1]
               cmdargs.quiet,                       # suppress backing-up messages [2.4]
               cmdargs.skipcruft)                   # skip cruft files in copytree [3.0]
    CUTCUTCUT"""

    savedeltas(diffs, uniques, mixes,               # noted differences
               cmdargs.dirdelta, cmdargs.dirto,     # deltas and TO folders
               cmdargs.skipcruft)                   # skipcruft in tree copies

    trace(1, 'Phase runtime:', gettime() - starttime)    # [2.2] time phases

    #DELTAS: the verify phase makes no sense here - no TO changes are made
    """CUTCUTCUT
    if cmdargs.verify:    # post verify step
        trace(1, '-' * 79 + '\n*Diffall run follows\n' + '-' * 79)
        starttime = gettime()
        cmd = os.popen('diffall.py %s %s' % (cmdargs.dirfrom, cmdargs.dirto))
        for line in cmd:
            print(line, end='')    # or save to a file?
        trace(1, 'Phase runtime:', gettime() - starttime)    # [2.2] time phases
    CUTCUTCUT"""

    # run in mergeall, with its global
    summaryreport(diffs, uniques, mixes, deltas=True)    # show totals [2.0] [3.3]