File: mergeall-products/unzipped/deltas.py

#!/usr/bin/env python3
# -*- coding: utf8 -*-
r"""
================================================================================
deltas.py: a fork of mergeall.py for saving delta sets [3.2]

To make deltas:
    $ python3 deltas.py DIRDELTA FROM TO -skipcruft -quiet

To apply deltas:
    $ python3 mergeall.py DIRDELTA TO -restore -auto -skipcruft -backup -quiet

Synopsis: this mergeall.py variant saves changes (a.k.a. deltas) made to
FROM separately, instead of applying them to TO immediately.  The changes
saved in the DIRDELTA folder may be archived, and may be used to bring a 
TO content copy in sync later using mergeall.py's "-restore" option.

Version: now part of all Mergeall packages [3.3]
License: provided freely, but with no warranties of any kind
Author:  © M. Lutz 2019-2021 (learning-python.com)
Web:     learning-python.com/{mergeall.html, fold3-vs-deltas.html}
Demo:    test/test-deltas-3.2
Hosts:   runs on macOS, Windows, Linux, Android, and others 
Python:  runs on 3.X and 2.X (e.g., 3.9 and 2.7), 3.X strongly preferred 


======
STATUS
======

[3.3] For Mergeall 3.3, this script was updated for new difference-list
structures, and their unnormalized Unicode from/to filenames.  For more 
on this change, see fixunicodedups.py's top-of-file docstring.  With 3.3,
this script also issues a message for each __added__.txt item, as "listed",
and omits normalization messages for non-NFC filenames if passed "-quiet".

This script was initially released only in Mergeall's source-code package
but is now in all packages as of the Feb-2022 rebuild.  To install, fetch 
Mergeall's source-code or other packages (learning-python.com/mergeall.html)
and unzip; this script is in the top-level Mergeall-source/ folder in the
source code, and available as a frozen executable in other packages.

This script is a fork, both because its command-line pattern differs 
from mergeall.py's (it requires an additional argument, and doesn't 
support others), and because mergeall.py is largely frozen to major
changes.  To accommodate this script, a handful of smaller mods were 
applied to Mergeall's base code, including:

- mergeall.py's comparelinks() 
      was augmented to skip read errors, an issue for BDRs uncovered here

- mergeall.py's mergetrees()
      prints just one message for unique TOs skipped during -restore runs

- mergeall.py's excludeskips()
      was modified to skip __added__.txt in FROM so it's not propagated

- backup.py's removeprioradds() 
      now backs up its deletions, so delta-set updates can be rolled back

- backup.py's removeprioradds() 
      now mangles nonportable characters in __added__.txt names on Windows

- backup.py's noteaddition()
      was fixed for the trailing-slashes + unnested-items issue also fixed here

Search for "[3.2]" in the above files to locate all base-code changes.
Also search for '#DELTAS' in this file to find all changes applied in 
this fork itself.  The latest mods here, most-recent first:

- Pushed out to app and executable packages with the Feb-2022 rebuild
- Fix for folder-path args with trailing slashes and unnested content
- Verified on Python 2.X, though 3.X is preferred for Unicode filenames
- Mergeall removals from __added__.txt are retried with "_" name mangling
- Deltas save code was moved from __main__ to a function for readability
- A "from mergeall import *" was used to cut code redundancy radically
- DIRDELTA was changed from in-file setting to command-line argument 


=====
ROLES
=====

This script is the same as mergeall.py, but instead of updating TO, this script
saves all new/recent changes in FROM to a separate folder.  Because this folder 
has just recent changes, it's generally small, and can be quickly burned to 
optical disk after a longer full burn.

Because changes are saved in the same format as Mergeall backups, they can also
be automatically applied to TO later, using mergeall.py's "-restore" option.
This makes TO the same as FROM just like a normal mergeall.py run, but the
update is deferred and applied on demand.

In 2021, this script also found a role in syncing content-copy changes to 
Android phones without microSD or POSIX USB access, by computing delta sets 
to proxy drives here, copying deltas to phone by file explorer, and merging 
them on the phone with mergeall.py's "-restore".  For more on this use case:
https://learning-python.com/fold3-vs-deltas.html#deltasworkaround.


=====
USAGE
=====

In general, mergeall.py's comparison-phase arguments all work here, but its
resolution-phase arguments do not because no TO changes are applied, and an
extra first argument gives the deltas path:

  [py[thon]] deltas.py dirdelta dirfrom dirto
                [-report]
                [-peek]
                [-quiet]
                [-skipcruft]

Where:
  dirdelta   => deltas save-folder pathname   (extra argument here)
  dirfrom    => source-tree pathname          (this tree is never changed)
  dirto      => destination-tree pathname     (this tree is never changed)
  -report    => report differences only and stop, making no changes
  -peek      => check N start/stop bytes too when comparing same-named files
  -quiet     => suppress Unicode-normalization messages during comparisons [3.3]
  -skipcruft => ignore cruft (a.k.a. metadata) files/dirs in both FROM and TO [3.0]

This script is generally used in a two-run process:


1) COLLECTION

To save but not apply changes, run this script from a command line like
the following.  This records all changes made to FROM (arg2) not yet in
TO (arg3), in separate folder DIRDELTA (arg1):

  $ python3 deltas.py DIRDELTA FROM TO -skipcruft

In this command, the extra first argument DIRDELTA gives the pathname 
to a folder where the deltas set is saved; this folder is automatically
emptied and created if needed.  mergeall.py's "-skipcruft" is useful 
here to ignore platform cruft (e.g., macOS ".DS_Store" files) in both 
initial comparisons and delta saves.  As a more-concrete example:

  $ python3 deltas.py DIRDELTA ~/MY-STUFF /Volumes/SSDT7/MY-STUFF -skipcruft > ~/deltas-log.txt

Compared to mergeall.py: the "-report" switch works in this script to preview
differences before saving them, and is the same as running mergeall.py with 
the same flag.  Conversely, "-auto" don't-ask mode isn't required or used 
here, because no TO updates are applied.  For the same reason, "-backup",
"-restore", and "-verify" are not used here; "-backup" is useful when 
mergeall.py is run later with "-restore", per the next section.


2) APPLICATION

To later update TO for the set of changes recorded in the DIRDELTA save 
folder, copy TO to writable media if required (e.g., see ./cpall.py), and
run mergeall.py with its "-restore" option, listing TO as the merge target
and DIRDELTA as the FROM source (where DIRDELTA may be either what it was 
initially or a location to which it was copied):

  $ python3 mergeall.py DIRDELTA TO -restore -auto -skipcruft -backup

The "-restore" option was originally coded to roll back changes from backups,
but works equally well to apply delta sets (for reasons explained in the THEORY
section).  An "-auto" here applies delta-set changes without pausing to 
verify each; "-skipcruft" ignores platform-specific files as usual; and 
"-backup" enables deltas-set rollbacks (undos) as covered in the ROLLBACKS
section ahead.  As another concrete example:

  $ python3 mergeall.py DIRDELTA /Volumes/SSDT7/MY-STUFF -restore -auto -skipcruft > ~/apply-log.txt

For a demo and proof of this script's collection and application steps
in action, see folder test/test-deltas-3.2/ in this package; unzip that
folder's zipfile to rerun its tests live.

Notes: mergeall.py's "-verify" flag doesn't make sense for delta-set apply
runs (FROM is the deltas set, not the original FROM); run again in "-report" 
mode to verify changes.  Also, some log errors are expected if the TO tree 
differed at deltas create and apply times, and backup-file sets may differ 
in the two trees normally unless one is a direct copy of the other; run 
mergeall.py post apply to analyze and sync as needed.
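
As a rough sketch of scripting the two runs above from Python: the helper
name twostep_commands and its python parameter are hypothetical and not
part of Mergeall; only the command-line arguments are taken from the usage
shown in this file.  It builds argument lists and runs nothing itself.

```python
import sys

def twostep_commands(dirdelta, dirfrom, dirto, python=sys.executable):
    # Hypothetical helper: returns two argument lists, runs nothing.
    # Run 1 saves FROM's changes in dirdelta, leaving dirto untouched.
    collect = [python, 'deltas.py', dirdelta, dirfrom, dirto,
               '-skipcruft', '-quiet']
    # Run 2 applies the saved changes to dirto; -backup allows an undo.
    applyrun = [python, 'mergeall.py', dirdelta, dirto,
                '-restore', '-auto', '-skipcruft', '-backup', '-quiet']
    return collect, applyrun

collect, applyrun = twostep_commands('DIRDELTA', 'FROM', 'TO')
print(' '.join(collect[1:]))     # the collection command, sans interpreter
print(' '.join(applyrun[1:]))    # the application command, sans interpreter
```

Each list can be handed to subprocess.run() from the folder that holds
the two scripts, with output redirected to a log file if desired.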


======
THEORY
======

This script finds all changes required to make folder TO (dirto) the same as
FROM (dirfrom) as usual, but then saves all items that were changed in FROM
(samefile) and added in FROM (uniquefrom), along with a list of items deleted
in FROM (uniqueto).  These are all saved in a third and separate folder, without
changing TO.  The resulting save folder serves the following two primary roles:


1) Archiving Changes

The separate save folder can be used for archiving incremental changes made 
to FROM since the last update to TO as a distinct set, rather than merging 
the changes into TO as usual.  This is useful for recording recent changes to
large archives already saved to slow optical disk: burn a much smaller save 
folder as a supplement to the full burn.

The save folder is stored in the same form as mergeall.py's __bkp__ tree-root 
backup folders, saved for mergeall.py's "-backup":

- Changes and additions appear in the save folder at their full nested folder paths
- The simple text file DIRDELTA/__added__.txt lists uniqueto paths, one per line

In both cases, items recorded include both files and dirs, and all folder paths
are relative to the TO root.  The latter, __added__.txt, really reflects items 
deleted from FROM (there's nothing to delete in this context), but naming it
__added__ allows for automatic application per the next section.
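
The path bookkeeping described above can be sketched as follows.  This is
an illustration only: the function names are hypothetical, it uses slicing
where this script uses .replace(), and it relies on posixpath so that the
results are the same on any host.

```python
import posixpath   # POSIX path rules on every host, for predictable results

def relative_to_root(pathto, dirto):
    # Path of an item relative to the TO root, as listed in __added__.txt.
    # Strip any trailing slash first: with it, the slice below would
    # also drop the first character of top-level item names.
    root = dirto.rstrip('/')
    return pathto[len(root) + len('/'):]

def save_path(pathto, dirto, dirdelta):
    # Where a changed or added item lands in the save folder: the same
    # nested path, but rooted at dirdelta instead of the TO root.
    return posixpath.join(dirdelta, relative_to_root(pathto, dirto))

print(relative_to_root('/Volumes/SSD/TO/old.txt', '/Volumes/SSD/TO/'))
print(save_path('/Volumes/SSD/TO/docs/a/b.txt', '/Volumes/SSD/TO', '/tmp/DIRDELTA'))
```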


2) Applying Changes

Besides their archiving role, the changes recorded in the separate save folder
can also be later applied to TO to bring it up to date with FROM, by using 
mergeall.py's rollbacks feature--though the result is really a roll-forward here. 

To automatically apply the save folder's set of changes to TO, run mergeall.py
with its "-restore" option, naming the save folder (DIRDELTA) as FROM and the
original TO (or a copy of it) as TO: all the original FROM changes and additions
will be copied from DIRDELTA to TO, and all items listed in DIRDELTA's __added__.txt
will be removed from TO.  The net effect makes TO the same as the original FROM.
(Technically, TO will be made the same as FROM's state at the time DIRDELTA 
was created, assuming TO has not changed since the deltas were saved.)

This works because applying a delta-changes set to make TO the same as a new FROM 
is equivalent to applying a backup-changes set to roll back TO to be the same as a
former TO which is imaginary but is the same as the new FROM.  This might just
make your head explode, but suffices to apply DIRDELTA changes automatically.
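
A toy model may help here.  The sketch below is not Mergeall code: it
stands in for folder trees with plain dicts that map paths to content,
and its function names are hypothetical.  It shows that saving changes
plus additions, along with a list of TO-only items, is enough to turn
TO into FROM later.

```python
def make_delta(tree_from, tree_to):
    # Items changed in FROM or new in FROM are saved with their content
    saved = {path: data for path, data in tree_from.items()
             if path not in tree_to or tree_to[path] != data}
    # Items present only in TO are just listed (the __added__.txt role)
    added = sorted(path for path in tree_to if path not in tree_from)
    return saved, added

def restore(tree_to, saved, added):
    result = dict(tree_to)          # leave the caller's TO model intact
    for path in added:              # removals first
        result.pop(path, None)
    result.update(saved)            # then replacements and additions
    return result

tree_from = {'a.txt': 'new', 'b.txt': 'same', 'c.txt': 'added'}
tree_to   = {'a.txt': 'old', 'b.txt': 'same', 'd.txt': 'deleted in FROM'}

saved, added = make_delta(tree_from, tree_to)
assert restore(tree_to, saved, added) == tree_from
```

Unchanged items such as b.txt never enter the delta set, which is why
the save folder is usually much smaller than the archive itself.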

Notes: be sure to first copy TO to writable media to enable changes if needed 
(unwritable optical disk archives won't work as TO directly).  Also, if you wish
to apply changes with automated means other than mergeall.py's "-restore," be sure 
to apply deletes before adds (see mergeall.py's "SUBTLE THING" note for the, well, 
subtle reason for this; it has to do with renames on case-insensitive filesystems).
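
For readers who do roll their own applier, the ordering can be sketched
as below.  This is a minimal illustration for Python 3.X and not what
mergeall.py does: apply_deltas is a hypothetical name, and the code omits
backups, Unicode normalization, name mangling, symlinks, long Windows
paths, and items that changed type between file and folder.  As one
example of why order matters, if a.txt was renamed to A.txt, copying
A.txt first and then removing a.txt could delete the new file on a
case-insensitive filesystem.

```python
import os, shutil

def apply_deltas(dirdelta, dirto, listname='__added__.txt'):
    # 1) Deletes first: remove every TO item listed in the deltas set.
    with open(os.path.join(dirdelta, listname), encoding='utf-8') as listfile:
        removals = listfile.read().splitlines()
    for relpath in filter(None, removals):
        target = os.path.join(dirto, relpath)
        if os.path.isdir(target) and not os.path.islink(target):
            shutil.rmtree(target)
        elif os.path.lexists(target):
            os.remove(target)

    # 2) Adds and replacements second: copy every saved file into TO.
    for folder, subdirs, files in os.walk(dirdelta):
        relfolder = os.path.relpath(folder, dirdelta)
        destfolder = os.path.normpath(os.path.join(dirto, relfolder))
        os.makedirs(destfolder, exist_ok=True)
        for name in files:
            if folder == dirdelta and name == listname:
                continue                        # the list is not content
            shutil.copy2(os.path.join(folder, name),
                         os.path.join(destfolder, name))   # data + modtime
```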

For more on mergeall.py rollbacks (a.k.a. restores), see the following, either
online at site https://learning-python.com/mergeall-products/unzipped/, or 
offline in your installed Mergeall folder:

  docetc/MoreDocs/Whitepaper.html#restores
  UserGuide.html#rollbacks

Subtlety: mergeall.py's "-restore" mode also retries removals from __added__.txt
which fail, after mangling nonportable filename characters to "_".  This can be
important for delta sets created on Unix but applied on Windows, because
other tools (e.g., ziptools) may have changed characters the same way in the 
Windows copy.  Mergeall in general assumes that FROM names are mangled as needed
by unzips or other copies before syncs are run on Windows; __added__.txt lists 
evade this.  To avoid this step, see and run fix-nonportable-filenames.py.
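
The retry can be pictured with the sketch below.  It is an illustration
under assumptions, not Mergeall's code: the character set shown is only
a guess at what counts as nonportable (Mergeall's own list may differ),
and both function names are hypothetical.

```python
# Assumed set of characters that are legal in Unix filenames but not on
# Windows; path separators are left alone so nested paths survive.
NONPORTABLES = '<>:"|?*'

def mangle(name, replacement='_'):
    # Swap each nonportable character for the replacement character
    return ''.join(replacement if ch in NONPORTABLES else ch for ch in name)

def removal_candidates(relpath):
    # Names to try in order: the listed name, then its mangled form
    mangled = mangle(relpath)
    return [relpath] if mangled == relpath else [relpath, mangled]

print(removal_candidates('docs/notes: v2?.txt'))
```

A removal step would try each candidate in turn and stop at the first
one that exists in TO.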

Rollbacks are the mechanism used to apply delta sets made by this script, but
they can also be used to undo delta-set changes, per the following section.


=========
ROLLBACKS
=========

Thanks to a minor mod in Mergeall's backup.removeprioradds(), it's now
possible to roll back (undo) all changes made when applying a deltas 
set to TO, as long as you also use the mergeall.py "-backup" switch 
when applying the deltas with "-restore".  That is, use commands:

$ python3 mergeall.py DIRDELTA TO -restore -auto -skipcruft -backup
    to apply a delta set's changes per above and save backups

$ python3 mergeall.py TO/__bkp__/<backup> TO -restore -auto -skipcruft
    to roll back all changes made when applying the deltas set

For the second of these, pass the path to the latest backups-set folder
that the first command saves in the TO archive's __bkp__ folder as usual,
per Mergeall's docs (e.g., docetc/MoreDocs/Whitepaper.html#restores).
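
Locating that folder can be sketched as follows.  This assumes that the
subfolder names in __bkp__ sort in creation order, which should be
checked against the names in your own archive; latest_backup is a
hypothetical helper, not a Mergeall function.

```python
import os

def latest_backup(dirto, bkpname='__bkp__'):
    # Return the path of the last-sorting subfolder of TO/__bkp__,
    # or None if there is no backups folder or it holds no subfolders.
    bkproot = os.path.join(dirto, bkpname)
    if not os.path.isdir(bkproot):
        return None
    subdirs = [name for name in os.listdir(bkproot)
               if os.path.isdir(os.path.join(bkproot, name))]
    if not subdirs:
        return None
    return os.path.join(bkproot, max(subdirs))   # assumes sortable names
```

The result, if any, is what would be passed as the first argument of
the second command above.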

Deltas rollbacks _almost_ worked formerly if "-backup" was used:

- Items changed and added in TO from a deltas set could be restored, 
  because "-backup" backs up changes and notes additions as usual

- Items listed in a delta set's __added__.txt and hence removed from TO
  could _not_ be restored, because removeprioradds() didn't save them

To improve the second of these, removeprioradds() now always backs up the
items it removes from TO, instead of simply deleting them.  For deltas 
applied with "mergeall.py -restore -backup", this means that items which 
were unique in TO will be put back by a later run of "mergeall.py -restore",
along with undoing TO replacements and additions.  This is a full rollback.

This removeprioradds() mod also makes it now possible to roll back true 
rollbacks made with "-restore -backup", restoring an archive to its former 
post-sync-pre-rollback state ('unrolled' for lack of a better term).  This 
use case seems rare, but is supported at a small cost in extra backups size.

Related change: "mergeall.py -restore" also now prints just one message
for all unique TOs skipped, because this category includes most items in 
the archive during a deltas apply.  This also reduces output volume during 
normal rollbacks, but deltas escalated this output from rare to common.

Related note: Mergeall's rollback.py won't work for applying delta sets,
because that script searches a set of backup folders for the latest to pass
to a mergeall.py "-restore" run.  For deltas, there is a single folder.  


======
DESIGN
======

CODE: deltas applications (and rollbacks) could use custom code that
simply removes __added__.txt items from TO and adds FROM items to TO, 
instead of being routed through normal syncs' logic.  This would avoid
a pointless comparison phase, make run logs simpler, and remove some 
convolution from mergeall.py's resolution-phase code.  OTOH, it would
also have to deal with much complexity separately and redundantly;
unique FROM filenames, for example, would have to be checked for
normalized Unicode equivalents in TO via filesystem probing.  Using
the usual mergeall.py comparison logic avoids this complexity.
 
CODE: this file was originally a copy of mergeall.py, which replaced the 
resolution phase with custom code that changes TO to the save-folder's 
path.  The code was eventually cleaned up; its redundancy was cut with a
"from mergeall import *"; and its diffs are now marked with "#DELTAS".

CONS: even after cutting its redundancy, this script is still logically
divergent from mergeall.py, and its __main__ code overlaps significantly.
Moreover, the mergeall.py "-restore" switch is now overloaded to apply to 
both delta applies and true rollbacks, and this seems complex in hindsight.
This might have been better addressed by adding delta saves to mergeall.py as a new 
"-deltas" mode, and/or generalizing backups and deltas to be instances of 
a new delta-sets paradigm applied with a new "-apply" switch.

PROS: on the other hand, the code to apply deltas and rollbacks is the same,
and Mergeall is too frozen to do any better today.  Moreover, deltas seem 
logically disjoint, and may be better handled separately to avoid convolution
of basic change propagations.  For example, this script's command line differs
substantially (see USAGE), and command-line patterns which depend on their own
switches are confusing at best.  GUI support would also require a new run 
mode (or two) and a new folder for deltas, and may be better stand alone.

All of which is TBD, but this script works as is to achieve its main goal: 
to allow TO to be made the same as FROM both later and on demand.  In 
contexts like archiving and phones, this minor twist is a major win.
================================================================================
"""




###################
# CODE STARTS HERE
###################



#-------------------------------------------------------------------------------
# Py 2.X compatibility (also gets 2.X input() from mergeall.py).
#-------------------------------------------------------------------------------

from __future__ import print_function    # print(), top of script


#-------------------------------------------------------------------------------
# This script is identical to mergeall.py, except for mods to the 
# __main__ logic, a new deltas-save function, and a custom getargs().
#-------------------------------------------------------------------------------

from mergeall import *        # use mergeall's globals directly, imports included
import mergeall               # to change mergeall.anyErrorsReported, [3.3] mar22


#-------------------------------------------------------------------------------
# Use open() on Python 3.X and codecs.open() on 2.X; also call out eolns
# (line ends) explicitly, because the latter is binary and doesn't auto-expand '\n'.
#-------------------------------------------------------------------------------

from backup import unicode_open, unicode_linesep

from backup import ADDENC     # encoding for __added__.txt: 'utf-8'
from backup import indent1    # set off listing messages from others 


#-------------------------------------------------------------------------------
# Use os.makedirs(exist_ok=True) on Python 3.X, catch exc on 2.X.
# Also run paths through FWP() for very long pathnames on Windows.
#-------------------------------------------------------------------------------

from backup import makedirs_ifneeded


#-------------------------------------------------------------------------------
#DELTAS
# BDRs don't record symlinks properly, and trigger OSErrors on macOS
# during the comparison phase, which ends the program run prematurely.
# This isn't unique to deltas, but was discovered in one of its use cases.
# To fix, added a try/except skip in mergeall.comparelinks() directly, 
# instead of an original skip and monkey-patch here (now defunct).
#-------------------------------------------------------------------------------

"""DEFUNCT
def comparelinks(name, dirfrom, dirto, statfrom, statto, diffs):
    # now fixed in mergeall.py

mergeall.comparelinks = comparelinks    # monkey-patch hack
DEFUNCT"""




##################################################################################
# COMPARISON PHASE: collect FROM/TO differences - see mergeall.py
##################################################################################




##################################################################################
# RESOLUTION PHASE: save changes for archiving and/or later application
##################################################################################




def savedeltas(diffs, uniques, mixes, DIRDELTA, cmdargs_dirto, cmdargs_skipcruft):
    """
    -------------------------------------------------------------------------
    #DELTAS: Implementation of the deltas mod.  The DIRDELTA folder 
    has been cleared and created if needed before this is called.

    This code used to be embedded in __main__ as a prototype, but has 
    been pulled out to be a function for better readability.  It could 
    become a mergeall.py '-deltas' argument and mode, but Mergeall is not
    scheduled for major code changes or app rereleases any time soon.

    This is called after comparetrees() has collected differences in 
    diffs, uniques, and mixes.  It replaces the normal mergetrees() 
    step, which applies changes to TO, with custom code that instead 
    saves changes made in FROM to the folder named by DIRDELTA.  
    
    These saved changes are stored in a folder tree at paths relative 
    to the cmdargs_dirto (TO) root, and in the same format as __bkp__ 
    backups.  They can be applied to TO separately and later, by running 
    a 'mergeall.py -restore' per usage details at the top of this file.

    Pythons: this uses backup.py's unicode_open() instead of open() to
    support Python 2.X, though 2.X has reached its "end of life" today
    and has been tested least, and supporting two differing lines is 
    substantial extra work.  When in doubt, use 3.X for this script.

    Subtlety: this cannot just join(DIRDELTA, name), because it needs
    to replace only the _prefix_ of a possibly much longer dirto path.

    Subtlety: the .replace() calls and archtail length calc here can 
    be thrown off by a trailing / or \ in cmdargs_dirto for unnested 
    items (only).  It's now removed in getargs(), to avoid save errors 
    and leftmost-char truncation in __added__.txt for top-level items.  
    The comparison walker's +os.sep adds /, but join() drops one here.
    backup.noteaddition() had the issue too; backup.backupitem() didn't.

    Subtlety: the os.makedirs() run by makedirs_ifneeded() stamps all 
    folders in a saved item's path with modtime = "now".  This is fully
    irrelevant, because the only item copied to TO when deltas are applied 
    is at the end of a path that already exists in TO - modtimes on the 
    existing path in the deltas folder are unused in mergeall.py, and  
    cpall.py copies correct modtimes to the new|changed item at the end.
    Folder modtimes may change on nested-item saves, but this is normal,
    and reflects the fact that a nested item was modified or added.
    Mergeall doesn't compare folder modtimes; they're nebulous at best. 

    [3.3] The structure of the difference lists created by mergeall.py's
    comparison phase has changed slightly to accommodate its new Unicode
    normalization of filenames; use the new structures here too.  These
    lists' namefrom/nameto filenames are the original and unnormalized
    forms which may differ if they were normalized for comparison, but 
    this is largely moot for the deltas create here: nameto may differ
    again and arbitrarily when the saved deltas are later applied with 
    mergeall.py's -restore, which will normalize for comparisons again.
    Deletion paths in __added__.txt will also be morphed for a later TO.
    Note: the coding here assumes dirtos can be stored on DELTAS device.

    [3.3] mar22: must change mergeall.anyErrorsReported here, not global
    anyErrorsReported.  The former is checked in mergeall.summaryreport()
    as a global, which means in mergeall's module, not this module.  Using 
    'global' here means in this (deltas) module only.  This flag was also 
    formerly unset in mergeall.py (its 'global' was accidentally moved into
    a docstring after 2017), and cpall's version lacked 'global' altogether.
    -------------------------------------------------------------------------
    """

    global countresolve      # not required, but polite
    join  = os.path.join     # not required, but concise
    split = os.path.split    # not required, but symmetric


    def error(message, *args):
        """
        [COPIED] Standard message format + exception data? ([1.7.1] show message too!).
        """
        # global anyErrorsReported          # [3.3] mar22: change in mergeall module!
        mergeall.anyErrorsReported = True   # [3.0] for summary line

        print('**Error', message, *args)
        trace(1, sys.exc_info()[0], sys.exc_info()[1])


    #-------------------------------------------------------------------------
    # 1) Samefile -> changed in FROM => copy FROM to deltas
    #
    # - Will overwrite older in TO when deltas applied... as if was replaced
    # - These are same-named files|links that differ by modtimes or linkpaths
    # - Deltas~TO will still differ when mergeall.py -restore is run later
    # - [3.3] namefrom/nameto may differ here, but may differ again on apply
    #-------------------------------------------------------------------------

    for (namefrom, nameto, dirfrom, dirto, why) in diffs:
        pathfrom, pathto = join(dirfrom, namefrom), join(dirto, nameto)
        pathto = pathto.replace(cmdargs_dirto, DIRDELTA)
        try:
            head, tail = split(pathto)
            makedirs_ifneeded(head)
            copyfile(pathfrom, pathto)   # content + modtime      # no __bkp__ made
        except:
            error('saving changed FROM file: skipped', pathfrom)
        else:
            countresolve.files.replaced += 1
            trace(1, 'saved changed FROM file,', pathfrom)        # not 'replaced'


    #-------------------------------------------------------------------------
    # 2) Uniquefrom -> added to FROM => copy FROM to deltas
    #
    # - Will be added to TO when deltas applied... as if was removed
    # - These are any type of item added to FROM since the latest sync
    # - These will still be unique in FROM (deltas) on later -restore run
    # - [3.3] namefrom is in the FROM tree, but may differ again on apply
    #-------------------------------------------------------------------------

    for (uniqs, dirfrom, dirto) in uniques['from']:
        for namefrom in uniqs:
            pathfrom, pathto = join(dirfrom, namefrom), join(dirto, namefrom)
            pathto = pathto.replace(cmdargs_dirto, DIRDELTA)
            
            if os.path.isfile(FWP(pathfrom)) or os.path.islink(FWP(pathfrom)):
                try:
                    head, tail = split(pathto)
                    makedirs_ifneeded(head)
                    copyfile(pathfrom, pathto)
                except:
                    error('saving new FROM file: skipped', pathfrom)
                else:
                    countresolve.files.created += 1
                    trace(1, 'saved new FROM file,', pathfrom)

            elif os.path.isdir(FWP(pathfrom)):
                try:
                    head, tail = split(pathto)
                    makedirs_ifneeded(head)
                    os.mkdir(FWP(pathto))
                    copytree(pathfrom, pathto, skipcruft=cmdargs_skipcruft)
                except:
                    error('saving new FROM dir: skipped', pathfrom)
                else:
                    countresolve.folders.created += 1
                    trace(1, 'saved new FROM dir, ', pathfrom)

            else: trace(1, 'ignored unknown unique type in FROM:', pathfrom)


    #-------------------------------------------------------------------------
    # 3) Uniqueto -> removed from FROM => note TO in deltas __added__.txt
    #
    # - Will be deleted from TO when deltas applied... as if was added
    # - These are any type of item deleted from FROM since the latest sync
    # - These may also be renames, along with a new entry in uniquefrom
    # - These will still be present in TO on later mergeall.py -restore run
    # - [3.3] nameto is in the TO tree, but may differ again on apply
    #-------------------------------------------------------------------------

    addedpath = join(DIRDELTA, '__added__.txt')
    deleteds  = unicode_open(addedpath, mode='w', encoding=ADDENC)

    for (uniqs, dirfrom, dirto) in uniques['to']:      # dirfrom unused here
        for nameto in uniqs:
            pathto = join(dirto, nameto)

            # note relative to TO root for -restores
            archtail = pathto[(len(cmdargs_dirto) + len(os.sep)):]

            if os.path.isfile(FWP(pathto)) or os.path.islink(FWP(pathto)):
                countresolve.files.deleted += 1
                deleteds.write(archtail + unicode_linesep)
                trace(1, indent1 + 'listed removed TO file:', archtail)
 
            elif os.path.isdir(FWP(pathto)):
                countresolve.folders.deleted += 1
                deleteds.write(archtail + unicode_linesep)
                trace(1, indent1 + 'listed removed TO dir:', archtail)

            # or bogus message line in deleteds to defer to -restore?
            else: trace(1, 'ignored unknown unique type in TO:', pathto)

    deleteds.close()
    if uniques['to']:
        # counts all unique TOs, including any unknown types ignored above
        numto = sum(len(uniqs) for (uniqs, dirfrom, dirto) in uniques['to'])
        trace(1, indent1 + 'listed %d TO item(s) in deltas __added__.txt' % numto)


    #-------------------------------------------------------------------------
    # 4) Mixes -> changed type in FROM => copy FROM's version to deltas
    #
    # - Will overwrite version in TO when deltas applied... as if was replaced
    # - Rare, but may occur if a file was changed to a folder, or vice versa
    # - May also be symlink-vs-stubfile, though moot unless modtimes differ
    # - Deltas~TO will still be mixed types on later mergeall.py -restore run
    # - Caveat: this code is identical to case #2 above, sans top-level loops
    #   and messages text, but retained to make this case explicit/standalone
    # - [3.3] namefrom/nameto may differ here, but may differ again on apply
    #-------------------------------------------------------------------------

    for (namefrom, nameto, dirfrom, dirto) in mixes:
        pathfrom, pathto = join(dirfrom, namefrom), join(dirto, nameto)
        pathto = pathto.replace(cmdargs_dirto, DIRDELTA)
        
        if os.path.isfile(FWP(pathfrom)) or os.path.islink(FWP(pathfrom)):
            try:
                head, tail = split(pathto)
                makedirs_ifneeded(head)
                copyfile(pathfrom, pathto)
            except:
                error('saving mixed FROM file: skipped', pathfrom)
            else:
                countresolve.files.created += 1
                trace(1, 'saved mixed FROM file,', pathfrom)

        elif os.path.isdir(FWP(pathfrom)):
            try:
                head, tail = split(pathto)
                makedirs_ifneeded(head)
                os.mkdir(FWP(pathto))
                copytree(pathfrom, pathto, skipcruft=cmdargs_skipcruft)
            except:
                error('saving mixed FROM dir: skipped', pathfrom)
            else:
                countresolve.folders.created += 1
                trace(1, 'saved mixed FROM dir, ', pathfrom)

        else: trace(1, 'ignored unknown mixed type in FROM:', pathfrom)




##################################################################################
# UTILITIES
##################################################################################




def getargs():
    """
    ---------------------------------------------------------------------------
    Get command-line arguments, return False if any are invalid.

    #DELTAS: commands totally differ from mergeall.py - use a custom function
    here that replaces the original version imported from mergeall.py.

    Expect and use an extra first argument: the path name to the folder where 
    the deltas should be stored (and auto clean/make this folder if needed).
    Also trim mergeall.py args that apply to a resolution phase not run here.

    [3.2] This now drops trailing / or \ on folder args, if any.  Else, they
    wreak havoc with later path .replace() calls and archtail length calcs.

    [3.3] Add -quiet to omit Unicode normalization messages during comparison
    phase (there may be many).  -quiet is also used for resolution backups,
    and __added__.txt removals when -restore mode morphs paths to match TO. 

    [3.3] Support Windows long paths on machines that need it by using FWP() 
    everywhere here (e.g., an existing DELTAS folder may be arbitrarily deep).
    backup.rmtreeworkaround() handles some errors, but does not FWP() itself.
    ---------------------------------------------------------------------------
    """

    def usageerror(message):
        """
        Display the error message and usage text; if run interactively
        in a console, also offer to show this script's docs.
        """
        print('**%s' % message)
        print('deltas run cancelled.')
        print('Usage:\n'
                   '\t[py[thon]] deltas.py dirdeltas dirfrom dirto\n'
                   '\t\t[-report]\n'
                   '\t\t[-peek]\n'
                   '\t\t[-quiet]\n'
                   '\t\t[-skipcruft]')
        
        if sys.stdin.isatty() and sys.stdout.isatty():
            if input('More?') in ['y', 'yes']:           # [2.0] for shell, not pipe
                try:
                    help('deltas')                       # never used by launchers
                except NameError:                        # and absent in frozen exe [3.3]
                    print('help unavailable in this package')
                    

    def initdeltasfolder(dirdelta):
        """
        Clean and recreate the deltas folder, unless -report is used.
        Return False on any failure here, so that the caller can end the 
        script now (before the comparison phase).
        """
        try:
            if '-report' not in sys.argv:                # don't make folder if -report
                if os.path.exists(FWP(dirdelta)):        # allow Windows longpaths [3.3]

                    # rm existing deltas
                    if os.path.isfile(FWP(dirdelta)):
                        os.remove(FWP(dirdelta))         # end script now on file errors
                    else:
                        shutil.rmtree(FWP(dirdelta, force=True), 
                            onerror=backup.rmtreeworkaround)

                # make new deltas
                os.mkdir(FWP(dirdelta))
                trace(1, 'Saving all deltas to:', dirdelta)
            return True

        except Exception as E:                           # show cause, then cancel
            print('**Deltas folder error:', E.__class__.__name__, E)
            return False


    class cmdargs: pass   # a set of attributes
    
    try:
        # required args
        cmdargs.dirdelta = sys.argv[1].rstrip(os.sep)     # [3.2] drop trailing / or \
        cmdargs.dirfrom  = sys.argv[2].rstrip(os.sep)     # else bad len and replace
        cmdargs.dirto    = sys.argv[3].rstrip(os.sep)
    except IndexError:                                    # too few arguments
        usageerror('Missing dirdelta, dirfrom, or dirto paths')
        return False
    else:
        if not initdeltasfolder(cmdargs.dirdelta):
            usageerror('Could not resolve deltas-save folder path')
            return False
        if not os.path.isdir(FWP(cmdargs.dirfrom)):
            usageerror('Invalid dirfrom directory path')
            return False
        elif not os.path.isdir(FWP(cmdargs.dirto)):
            usageerror('Invalid dirto directory path')
            return False
        else:
            # optional args
            options = ['-report', '-peek', '-quiet', '-skipcruft']    # fewer switches
            for option in options:
                setattr(cmdargs, option[1:], False)               
            for option in sys.argv[4:]:
                if option in options:
                    setattr(cmdargs, option[1:], True)
                else:
                    usageerror('Bad command-line option: "%s"' % option)
                    return False

    return cmdargs  # this class is True
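


#---------------------------------------------------------------------------------
# Illustration only: not called anywhere in this script.
#
# A minimal sketch of why getargs() drops trailing separators on folder args
# [3.2].  The helper's name and its slicing logic are assumptions made up for
# this demo; they do not reproduce the actual path code in savedeltas(), but 
# show the same kind of length-based calculation that a trailing "/" or "\" 
# can throw off.
#---------------------------------------------------------------------------------

def _sketch_tail_after_root(root, path, sep='/'):
    """
    Return the part of path below root, by skipping len(root) characters
    plus one separator.  This is correct only if root has no trailing sep.
    """
    return path[len(root) + len(sep):]

# With a clean root, the slice starts at the first item below the root:
#     _sketch_tail_after_root('/data/TO', '/data/TO/docs/a.txt')    => 'docs/a.txt'
#
# With a trailing separator on the root, one character too many is skipped:
#     _sketch_tail_after_root('/data/TO/', '/data/TO/docs/a.txt')   => 'ocs/a.txt'
#
# Stripping the root first, as getargs() does with .rstrip(os.sep), avoids this.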




##################################################################################
# MAIN LOGIC
##################################################################################




if __name__ == '__main__':
    trace(1, 'deltas %.1f starting' % VERSION)    # Mergeall version 

    import time
    gettime = (time.perf_counter if hasattr(time, 'perf_counter') else
              (time.clock if RunningOnWindows else time.time)) 

    # get and verify parameters from command line
    cmdargs = getargs()
    if not cmdargs:
        sys.exit(1)


    #---------------------------------------------------------------------------
    # COMPARISON PHASE: collect differences
    #---------------------------------------------------------------------------
    
    trace(1, '-' * 79, '\n*Collecting tree differences')
    if cmdargs.skipcruft:
        trace(1, 'Skipping system cruft (metadata) files in both FROM and TO')

    diffs   = []                         
    uniques = {'from': [], 'to': []}     # lists/dict changed in-place by walker
    mixes   = []
    starttime = gettime()
    try:
        comparetrees(cmdargs.dirfrom, cmdargs.dirto,       # from/to roots
                     diffs, uniques, mixes,                # noted differences
                     cmdargs.peek,                         # file reads?
                     cmdargs.skipcruft,                    # exclude cruft files [3.0]
                     cmdargs.quiet,                        # omit normalization msgs [3.3]
                     skips=['__bkp__', '__added__.txt'])   # exclude top-level specials [2.0]

    #DELTAS: reworded message for deltas
    except Exception as Why:
        # [3.0] friendlier message on comparison failure exits
        print('**Error during comparison phase\n'
              '...The deltas run was terminated by a folder comparisons error, to\n'
              '...avoid a partial changes set.  No deltas were saved.  Please resolve\n'
              '...the following Python exception before rerunning deltas.py against\n'
              '...the same folders:')
        print(Why.__class__.__name__, Why)
        print('\n...A detailed Python traceback follows:')
        import traceback
        traceback.print_exc()
        sys.exit(1)
    else:
        trace(1, 'Phase runtime:', gettime() - starttime)  # [2.2] time phases

    trace(1, '-' * 79, '\n*Reporting tree differences')
    reportdiffs(diffs, uniques, mixes, dorestore=False)    # handles own exceptions [3.3]
    if cmdargs.report:
        # report and exit
        summaryreport(diffs, uniques, mixes, deltas=True)  # show totals [2.0] [3.3]
        sys.exit(0)


    #---------------------------------------------------------------------------
    # RESOLUTION PHASE: reconcile differences
    #---------------------------------------------------------------------------
    
    trace(1, '-' * 79, '\n*Resolving tree differences')
    if cmdargs.skipcruft:
        trace(1, 'Skipping system cruft (metadata) files in FROM folders')

    starttime = gettime()

    #DELTAS: replace TO updates with deltas saves
    """CUTCUTCUT
    mergetrees(diffs, uniques, mixes,                      # noted differences
               cmdargs.auto,                               # make changes? else ask
               cmdargs.backup,  cmdargs.dirto,             # save items replaced/removed [2.0]
               cmdargs.restore, cmdargs.dirfrom,           # keep unique TO, undo adds [2.1]
               cmdargs.quiet,                              # suppress backing-up messages [2.4]
               cmdargs.skipcruft)                          # skip cruft files in copytree [3.0]
    CUTCUTCUT"""

    savedeltas(diffs, uniques, mixes,                      # noted differences
               cmdargs.dirdelta, cmdargs.dirto,            # deltas and TO folders
               cmdargs.skipcruft)                          # skipcruft in tree copies

    trace(1, 'Phase runtime:', gettime() - starttime)      # [2.2] time phases

    #DELTAS: the verify phase makes no sense here - no TO changes are made
    """CUTCUTCUT
    if cmdargs.verify:
        # post verify step
        trace(1, '-' * 79 + '\n*Diffall run follows\n' + '-' * 79)
        starttime = gettime()
        cmd = os.popen('diffall.py %s %s' % (cmdargs.dirfrom, cmdargs.dirto))
        for line in cmd: print(line, end='')    # or save to a file?
        trace(1, 'Phase runtime:', gettime() - starttime)  # [2.2] time phases
    CUTCUTCUT"""

    # summaryreport() runs in mergeall.py, with that module's global
    summaryreport(diffs, uniques, mixes, deltas=True)      # show totals [2.0] [3.3]


