File: shrinkpix/restore-unshrunk-images.py

#!/usr/bin/env python3
"""
======================================================================================
restore-unshrunk-images.py - restore a folder tree to its pre-shrink state.

This is a utility script for shrinkpix.py, and shares the same date, author,
license, etc., as that script.  See https://learning-python.com/shrinkpix/.

--------------------------------------------------------------------------------------
PURPOSE
--------------------------------------------------------------------------------------

  Run this script to undo all of shrinkpix's changes in a folder tree. 
  It puts back all of the original unshrunk images that shrinkpix saved to 
  a folder tree's backup subfolders.  It also removes the backup subfolders,
  along with any "__N" extra image copies in them.  The net effect fully
  restores the entire tree to its pre-shrinkpix content and structure.

  This may be useful if shrinkpix's results are subpar for your website,
  or if its results are ever improved in future releases.  You can also 
  restore from collect-unshrunk-images.py's results; see the note below.

--------------------------------------------------------------------------------------
USAGE
--------------------------------------------------------------------------------------

  Command line: 
      $ python3 <script> <folderpath> -listonly? -toplevel?

  Run this script with one required argument: the pathname of the folder tree 
  whose images are to be restored.  If the optional -listonly is given, it 
  works the same as setting LISTONLY=True in code below, showing all files to 
  be restored but not restoring them.  If the optional -toplevel is given, it 
  works the same as setting TOPLEVEL=True, limiting the restore to the top 
  level of the <folderpath> folder and ignoring any subfolders.

  If -listonly or -toplevel is omitted, the corresponding LISTONLY or 
  TOPLEVEL setting in this script is used instead.  This program's restoration
  results were verified with the diffall.py utility (available in the
  Mergeall package, at learning-python.com/mergeall.html).
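
  As a rough stand-in for that kind of check, the sketch below compares a
  restored tree to a pristine copy.  It is not diffall.py: it uses the standard
  library's filecmp module instead, and the name trees_match is hypothetical.

```python
import filecmp, os

def trees_match(left, right):
    # hypothetical helper: True if both trees hold the same names and file bytes
    cmp = filecmp.dircmp(left, right)
    if cmp.left_only or cmp.right_only or cmp.common_funny:
        return False                      # unique names, or type mismatches
    # dircmp alone compares files shallowly (by stat info); recheck by content
    (match, mismatch, errors) = filecmp.cmpfiles(
        left, right, cmp.common_files, shallow=False)
    if mismatch or errors:
        return False
    # recur into subfolders present on both sides
    return all(trees_match(os.path.join(left, sub), os.path.join(right, sub))
               for sub in cmp.common_dirs)
```

  Calling trees_match(restored, pristine) after a restore should return True
  if the restore put back every original and removed every backup subfolder.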

--------------------------------------------------------------------------------------
NOTES
--------------------------------------------------------------------------------------

Toplevel restores:
  If you used -toplevel for a folder in the shrink script, you probably want
  to use it for the same folder here too.  Here, it prevents the restore from 
  restoring any originals backed up in nested subfolders below the folder given.
  If those subfolders are managed separately, they should in most cases not be 
  restored along with their ancestor.  In -toplevel mode, this script isn't 
  much more than a manual move + delete: it simply moves all the originals 
  in one backup folder up to their parent folder, and removes the backup folder.
  You could do this manually, but the option serves to call out the use case.
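
  The manual move + delete described above can be sketched as follows.  This
  is an illustration only: the folder name 'backups-demo' is an assumed
  stand-in for the real BACKUPSDIR value, restore_one_folder is a hypothetical
  name, and the image and "__N" checks made by the real script are omitted.

```python
import os, tempfile

BACKUPS = 'backups-demo'    # assumption: stand-in for shrinkpix's BACKUPSDIR

def restore_one_folder(folder, backups=BACKUPS):
    # move every item in folder/backups up to folder, overwriting shrunk copies
    backupsub = os.path.join(folder, backups)
    moved = []
    for item in sorted(os.listdir(backupsub)):
        os.replace(os.path.join(backupsub, item), os.path.join(folder, item))
        moved.append(item)
    os.rmdir(backupsub)     # raises OSError if anything was left behind
    return moved

# demo in a throwaway folder: one shrunk image plus its backed-up original
demo = tempfile.mkdtemp()
os.mkdir(os.path.join(demo, BACKUPS))
for (path, data) in [(os.path.join(demo, 'pic.jpg'), b'shrunk'),
                     (os.path.join(demo, BACKUPS, 'pic.jpg'), b'original')]:
    with open(path, 'wb') as f:
        f.write(data)
moved = restore_one_folder(demo)    # pic.jpg now holds the original bytes
```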

Handling backup duplicates:
  As shipped, this script assumes that backed-up originals without "__N" 
  names are the true originals, and restores just these; any "__N" copies
  in backup folders are deleted.  This makes sense if you've shrunk a given
  image more than once (you want to save the first original, not its reshrunk
  derivatives), but may not apply if you've shrunk different originals of 
  the same name without collecting or restoring backups between shrinks.  

  If the latter is your use case, set DROPDUPS below to False to also move 
  all "__N" copies to the originals folder (instead of deleting them), and
  resolve your duplicates dilemma manually after the restore.

    UPDATE: in 1.3, DROPDUPS is now preset to False, because the duplicates
    naming pattern "__N" unfortunately may match genuine image names.  In 
    particular, the tagpix program (at learning-python.com/programs) creates
    image filenames of the form "2020-09-17__180534.jpg" for some sources, 
    which fool this script into thinking they are shrinkpix duplicates; with
    DROPDUPS True, they would be silently deleted instead of restored (!).  
    Barring a better solution, set DROPDUPS as appropriate for your usage;
    as preset, you may need to manually remove restored but unwanted image
    duplicates after a run (look for "__N" names in the originals folder).
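
    The false match described in this update can be reproduced with a short
    sketch.  The pattern below is meant to match the same names as the one 
    used in this script's code; it is written with [.] only to keep backslashes
    out of this docstring, and the name looks_like_copy is hypothetical.

```python
import re

DUP_PATTERN = '.*__[0-9]*[.].*'    # i.e., xxxxx__NN.yyy

def looks_like_copy(name):
    # hypothetical helper: True if name would be treated as a "__N" copy
    return re.match(DUP_PATTERN, name) is not None

names = ['beach.jpg', 'beach__2.jpg', '2020-09-17__180534.jpg']
flags = [looks_like_copy(name) for name in names]
# flags is [False, True, True]: the tagpix-style name is taken for a copy
```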

Restoring from collection trees:
  This script can also be used to restore originals from backup folders pulled
  by the collector script; see the collector script's "Restore tip" for details
  on this process.  In this mode, the first command-line argument here is the 
  collection folder; the effect is to collapse and remove backup-folder levels; 
  and a later merge (e.g., with rsync) puts originals back in the source tree. 
 
  This script does not have a more direct "backup-from" option, because that 
  seems overly complex, and potentially dangerous (the source tree may be damaged
  if its structure has changed).  The restore+rsync is the suggested alternative. 
  See shrinkpix.py's CAVEATS->Design for more notes on this.

Miscellaneous notes:
  - Removing backup folders may fail for a variety of reasons (e.g., permissions);
    look for "***Could not remove" in script output if its "lingering folders" > 0.

  - Assumes BACKUPSDIR matches that used in the shrinker script at shrink time;
    don't change this setting in the main script between a shrink and restore.

  - Cleanup assumes BACKUPSDIRs have no nested subdirs, and no content except 
    original images; these subfolders are reserved for shrinkpix backups only.

  - This uses re pattern matching to detect numbered copies, but could use splits:
      head = os.path.splitext(item)[0]
      copyn =  ('__' in head) and head.split('__')[-1].isdigit()

  - Moves here avoid ".." in relative paths: ".." is resolved against the current
    working directory (often this script's dir), not against the backup folder!

======================================================================================
"""

import os, re, sys, mimetypes
from shrinkpix import BACKUPSDIR          # assumed same as at tree's shrink time      
from shrinkpix import ALLBACKUPSDIR       # pulled backups made by collector script
from shrinkpix import isimage             # mime type filename checker
from shrinkpix import askyesno            # [1.3] don't print traceback on ctrl+c
trace = print


#=====================================================================================
# Configure
#=====================================================================================

LISTONLY = False    # True=show images to be restored, but don't move them
LISTMORE = True     # True=show nested backup source paths too
DROPDUPS = False    # True=discard (don't restore) any "__N" copies in backups ([1.3]=False)
TOPLEVEL = False    # True=limit restore to folder top-level, skipping any subfolders


#=====================================================================================
# Setup
#=====================================================================================

# cmd args now same as shrinker's
root = None
command = '<script> <folderpath> -listonly? -toplevel?'
confirm = 'This script restores all original images in the folder tree; proceed?'

if '-listonly' in sys.argv:              # don't make changes
    LISTONLY = True                      # else use setting's value
    sys.argv.remove('-listonly')

if '-toplevel' in sys.argv:              # skip subdirs 
    TOPLEVEL = True                      # else use setting's value
    sys.argv.remove('-toplevel')

if len(sys.argv) > 1:                    # required folder name
    root = sys.argv.pop(1)               # any position, last remaining 

if not root or not os.path.isdir(root) or len(sys.argv) > 1:
    print('Usage:', command)             # no folder, not a folder, or extra args?
    sys.exit(1)                          # minimize nesting; nonzero status on bad usage

if (not LISTONLY) and askyesno(confirm).lower() not in ['y', 'yes']:
    print('Run cancelled.') 
    sys.exit()


#=====================================================================================
# Restore
#=====================================================================================

# walk the subject tree
numoriginals = numrestored = numlingers = 0
for (folder, subs, files) in os.walk(os.path.abspath(root), topdown=True):

    if ALLBACKUPSDIR in subs:
        # don't restore in a collection folder, unless it's topmost root (argv[1])
        subs.remove(ALLBACKUPSDIR)
            
    if BACKUPSDIR in subs:
        # restore from every backup folder reached during the tree walk
        subs.remove(BACKUPSDIR)                               # prune from walk: 
        backupsub = os.path.join(folder, BACKUPSDIR)          # it will be deleted

        for item in os.listdir(backupsub):                    # for all in backup dir
            pathbackup  = os.path.join(backupsub, item)
            pathrestore = os.path.join(folder, item)
            if not isimage(item):
                os.remove(pathbackup)                         # clean up non-images 
            else:
                if (DROPDUPS and                              # discard copies?
                    re.match(r'.*__[0-9]*\..*', item)):       # i.e., xxxxx__NN.yyy
                    os.remove(pathbackup)                     # clean up any copyn 
                else:
                    numoriginals += 1
                    trace('Restoring %s' % pathrestore)
                    if LISTMORE: 
                        trace(' '*5 + 'from %s' % pathbackup)
                    if not LISTONLY: 
                        numrestored += 1
                        os.replace(pathbackup, pathrestore)   # move+overwrite image
 
        # after os.listdir() loop
        if not LISTONLY:
            try:
                os.rmdir(backupsub)    # clean up if possible
            except Exception as why: 
                numlingers += 1
                print(why)
                print('***Could not remove backup folder.')

    # after backupsdir in subs
    if TOPLEVEL: break    # don't restore in subdirs if they're managed separately

# post-walk wrap-up 
report = 'Finished: %d originals, %d restored, %d lingering folders.' 
print(report % (numoriginals, numrestored, numlingers))


