File: tagpix/_drop-redundant-dates.py

#!/usr/bin/env python3
"""
==========================================================================
drop-redundant-dates.py: a tagpix utility script.
Summary: delete redundant dates in filenames of photos shot on Android.
This Python 3.X script has the same copyright and license as tagpix.py.

----
UPDATE: as of tagpix 2.2, this script is now an optional, one-time tool.
It need only be run by users of tagpix 2.1 and earlier, having photos 
shot on Android devices in merged folders with redundant filename dates 
post tagpix expansion (e.g., '2018-02-05__20180205_154910.jpg').
  
tagpix 2.2 and later automatically remove such redundant dates when renaming 
files (e.g., '2018-02-05__20180205_154910.jpg' => '2018-02-05__154910.jpg')
by default.  Hence, among tagpix users with merged photos shot on Android 
who wish to avoid redundant dates in image filenames, this script:

 - Never has to be run by users of only versions >= 2.2
 - Must be run only once for users of <= 2.1 who've upgraded to >= 2.2
 - Can be run on demand after tagpix.py runs if you're using <= 2.1

The documentation that follows applies if your category implies usage.
Note that the automatic flavor of date dropping in tagpix.py is immune
to special cases #1 and #2 below, because it renames before duplicates
are detected and handled, and before its results can be processed by other
tools; it addresses #3 with an additional switch in user_configs.py,
though its use of filename dates makes differing dates much less likely.

CAUTION: the tagpix and Android dates in a merged photo's filename may
differ, and stripping the Android date blindly with this script may then
discard information.  The two dates may
differ because prior tagpix versions labeled photos with modification
date when no Exif date was present, ignoring the Android filename date. 
More subtly, the dates may differ as a result of user edits to either
Exif tags or filenames.  To avoid unwanted renames, run this script 
in list-only mode to preview changes and isolate any mixed-date cases.
To retain both dates if they differ, set 'alwaysdropdate' below to False 
before running in updates mode.  See special case #3 below for more.

----
HOW TO RUN

  Give one or two command-line arguments - the required path to the root of
  the folder to be processed, and an optional second argument (of any value)
  to enable list-only mode (which prints matching filenames without renaming
  them):

      ~$ python3 /.../drop-redundant-dates.py <folder> <listonly>?

  For instance, opening and running this script in the PyEdit IDE with the 
  following command-line arguments string prints all renamable files in the 
  tree rooted at '/MY-STUFF/Camera' without renaming (drop '-' to rename):

      /MY-STUFF/Camera -

  This script may be run on any folder.  Because it detects and renames 
  only files in that tree already expanded to include tagpix dates, though,
  it's best run on your merged folder, after running the main tagpix.py
  script (unmerged files won't be detected or renamed by this script).  

  Normal output lines display as follows (error messages begin with "**"):

      Renaming: "from filename" => "to filename" in [folder path]
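
  As a rough sketch of what list-only mode previews (this is not the
  script's own code): the hypothetical preview_renames function below walks
  a tree and collects the renames that would be made, without touching any
  files.  Its name, and its pattern written with [0-9] classes in place of
  the script's regex, are assumptions made for illustration only.

```python
import os, re

# Assumed pattern: tagpix date, '__', Android date, '_', time, '.', extension
PATTERN = re.compile('[0-9]{4}-[0-9]{2}-[0-9]{2}__[0-9]{8}_[0-9]{6}[.].*')

def preview_renames(rootdir):
    # Collect (folder, oldname, newname) tuples; nothing is renamed here
    planned = []
    for (dirhere, subshere, fileshere) in os.walk(rootdir):
        for filename in sorted(fileshere):
            if PATTERN.match(filename):
                newname = filename[:12] + filename[21:]   # drop 2nd date
                planned.append((dirhere, filename, newname))
    return planned
```

  Printing each tuple returned by preview_renames('/MY-STUFF/Camera') would
  give roughly the same information as the "Renaming:" lines shown above.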

----
WHAT IT DOES

  This script renames all the files in an entire folder tree that have a 
  redundant date in their names, to remove the extra dates.  The extra 
  dates are created by some source devices (notably, Android smartphones) 
  that don't use the 'DSC' or 'IMG' naming conventions or standards 
  designed in part to accommodate the FAT filesystem's 8.3 limits.

  For example, tagpix.py currently expands such filenames as follows,
  when moving or copying to your MERGED folders:

      '20180205_154910.jpg' => '2018-02-05__20180205_154910.jpg'

  And this script shrinks them as follows (generally in your MERGED
  folders only, as that is where the duplicate dates will be):

      '2018-02-05__20180205_154910.jpg' => '2018-02-05__154910.jpg'

  This leaves the date added by tagpix in the filename for uniformity, 
  and drops the date added by the source device - which is redundant 
  with the goal of tagpix file renaming, but is used by only a few 
  devices (other names are left intact), and does not address other 
  tagpix goals (e.g., merging folders and handling duplicates).

  Run this on your tagpix merged tree root if filenames have been
  added to it with the redundant dates.  tagpix itself may eventually
  drop the extra dates automatically when renaming and moving/copying 
  items; if it does, this script will not be required for new items 
  added to the merged result, and will normally be a one-time step.
  [See the UPDATE above: tagpix 2.2 did adopt automatic renaming.]
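
  The shrink step can be sketched in isolation as follows.  This is a
  minimal illustration, not the script's own code: the function name
  drop_redundant_date and the [0-9]-based pattern are assumptions.  It
  relies on the fixed layout of expanded names: the first 12 characters
  are the tagpix date plus '__', and the next 9 are the Android date
  plus '_'.

```python
import re

# Assumed pattern for names already expanded by tagpix (illustration only)
PATTERN = re.compile('[0-9]{4}-[0-9]{2}-[0-9]{2}__[0-9]{8}_[0-9]{6}[.].*')

def drop_redundant_date(filename):
    # Names that don't have two leading dates are returned unchanged
    if PATTERN.match(filename) is None:
        return filename
    # Keep 'YYYY-MM-DD__' (12 chars), skip 'YYYYMMDD_' (9 chars), keep rest
    return filename[:12] + filename[21:]

print(drop_redundant_date('2018-02-05__20180205_154910.jpg'))
```

  Names from devices that use other conventions (e.g., 'DSC' or 'IMG'
  prefixes) do not match the pattern, and so pass through intact.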

----
A FEW SPECIAL CASES

  1) Duplicates potential

    If the renamed target already exists in the tree, the source file 
    is skipped with a message.  This is unlikely for files already 
    processed by tagpix (which skips duplicate names with duplicate 
    content, and adds a unique sequence number to duplicate names with 
    differing content), but might arise after user manual renames.

  2) Impact on other tools

    If you're using a tool like thumbspage that relies on the former
    names of renamed images, you may need to rerun the tool after this 
    script.  For thumbspage, this script renames viewer pages too (e.g., 
    2018-06-21__20180621_170029.jpg.html => 2018-06-21__170029.jpg.html),
    but such files' content will reference previous source-image names.
    By contrast, PyPhoto is clever enough to update its thumbnails cache
    file automatically (via a delete+add) for all filenames changed.
    See learning-python.com/programs.html for thumbspage and PyPhoto.

  3) Differing dates

    The 'alwaysdropdate' switch below is subtle.  Though rare, it's 
    possible that the filename dates added by Android and tagpix may 
    differ for a given image.  In one case, the Android date differed 
    from the tagpix date, because the image had been modified by a tool
    that discarded the image's Exif creation tag (apparently - this is
    the norm for some Preview edits on some Mac OS versions).  Without
    such tags, the best tagpix can do is file modification date, which
    will differ
    from the original creation date added to the filename by Android 
    (e.g., in the case observed: 2018-08-03__20180408_073757.jpg).

    Hence: turning the 'alwaysdropdate' switch on (True) strips the 
    Android date unconditionally; turning it off (False) prevents a 
    rename unless the tagpix and Android dates match, and thus retains 
    the Android date when the two differ.  It's not clear which policy is 
    best; the Android date is extra info, but doesn't quite apply to a 
    modified image, and makes filenames longer.  If in doubt, use True
    to rename in all cases (e.g., when used for the case observed:
    2018-08-03__20180408_073757.jpg => 2018-08-03__073757.jpg). 

    Note that PyPhoto doesn't drop Exif tags of images viewed, but other
    tools may - including edits in Preview on Mac OS (sometimes), and 
    auto-rotations in thumbspage (at learning-python.com/programs.html).
    UPDATE 2020: thumbspage now retains+updates Exif tags on rotations;
    see learning-python.com/thumbspage/UserGuide.html#rotateskeepexifs
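
    The two 'alwaysdropdate' policies in case #3 can be sketched as a
    single test.  This is an illustration under assumptions, not the
    script's own code: the function name should_rename and the
    [0-9]-based pattern are hypothetical.

```python
import re

# Assumed pattern: group 1 is the tagpix date, group 2 the Android date
PATTERN = re.compile('([0-9]{4}-[0-9]{2}-[0-9]{2})__([0-9]{8})_[0-9]{6}[.].*')

def should_rename(filename, alwaysdropdate):
    matched = PATTERN.match(filename)
    if matched is None:
        return False                      # not a two-date name: leave it
    tagpixdate, sourcedate = matched.group(1), matched.group(2)
    # True: rename unconditionally; False: rename only if the dates agree
    return alwaysdropdate or tagpixdate.replace('-', '') == sourcedate

print(should_rename('2018-08-03__20180408_073757.jpg', False))
```

    With the switch off, the mixed-date name from the case observed is
    left alone, so its Android date survives; with the switch on, the
    same name qualifies for a rename like any other.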
==========================================================================
"""

import re, os, sys

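if len(sys.argv) < 2:                   # arg1 is required: show usage, exit
    sys.exit('Usage: python3 drop-redundant-dates.py <folder> <listonly>?')
rootdir  = sys.argv[1]                  # arg1 is root folder path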
listonly = len(sys.argv) > 2            # any arg2 = listonly mode
print('Running on', rootdir, end=', ')
print('listonly' if listonly else 'renaming', 'mode')

alwaysdropdate = True  # True: drop camera date even if != tagpix date; False: only if equal

testname = '2018-02-05__20180205_154910.jpg'

frompatt = re.compile(r'(\d{4}-\d{2}-\d{2})__(\d{8})_\d{6}\..*')   # raw string: no invalid-escape warnings

nummatches = numrenames = 0

for (dirhere, subshere, fileshere) in os.walk(rootdir):         # for all dirs in tree
    for filename in fileshere:                                  # for all files in dir
        matched = frompatt.match(filename)                      # two dates at front?
        if matched is not None:
            tagpixdate = matched.group(1)
            sourcedate = matched.group(2)

            if ((alwaysdropdate) or                             # even if dates differ?
                (tagpixdate.replace('-', '') == sourcedate)):   # same/redundant dates?
                nummatches += 1
                newname = filename[0:12] + filename[21:]        # drop 2nd/redundant date
                newpath = os.path.join(dirhere, newname)
                oldpath = os.path.join(dirhere, filename)
                print('Renaming: "%s" => "%s" in [%s]' % (filename, newname, dirhere))

                if os.path.exists(newpath):
                    print('**DUPLICATE: rename target already exists, file skipped')

                elif not listonly:
                    try:
                        os.rename(oldpath, newpath)
                        numrenames += 1 
                    except Exception as why:
                        print('**FAILED: rename failed, continuing with other files')
                        print('Python exception:', why)

print('Number matches, renames: %d, %d' % (nummatches, numrenames))


