File: tagpix/_drop-redundant-dates.py
#!/usr/bin/env python3
"""
==========================================================================
drop-redundant-dates.py: a tagpix utility script.

Summary: delete redundant dates in filenames of photos shot on Android.
This Python 3.X script has the same copyright and license as tagpix.py.

----

UPDATE: as of tagpix 2.2, this script is now an optional, one-time tool.
It need only be run by users of tagpix 2.1 and earlier, having photos shot
on Android devices in merged folders with redundant filename dates post
tagpix expansion (e.g., '2018-02-05__20180205_154910.jpg').

tagpix 2.2 and later automatically remove such redundant dates when
renaming files (e.g., '2018-02-05__20180205_154910.jpg' =>
'2018-02-05__154910.jpg') by default.  Hence, among tagpix users with
merged photos shot on Android who wish to avoid redundant dates in image
filenames, this script:

- Never has to be run by users of only versions >= 2.2
- Must be run only once for users of <= 2.1 who've upgraded to >= 2.2
- Can be run on demand after tagpix.py runs if you're using <= 2.1

The documentation that follows applies if your category implies usage.
Note that the automatic flavor of date dropping in tagpix.py is immune to
special cases #1 and #2 below, because it renames before duplicates are
detected and handled, and before its results can be processed by other
tools; it addresses #3 with an additional switch in user_configs.py,
though its use of filename dates makes differing dates much less likely.

CAUTION: it's not impossible that the tagpix and Android dates in a merged
photo's filename may differ, and stripping the Android date blindly with
this script may discard information.  The two dates may differ because
prior tagpix versions labeled photos with modification date when no Exif
date was present, ignoring the Android filename date.  More subtly, the
dates may differ as a result of user edits to either Exif tags or filenames.
To avoid unwanted renames, run this script in list-only mode to preview
changes and isolate any mixed-date cases.  To retain both dates if they
differ, set 'alwaysdropdate' below to False before running in updates mode.
See special case #3 below for more.

----

HOW TO RUN

Give one or two command-line arguments - the required path to the root of
the folder to be processed, and an optional second argument (of any value)
to enable list-only mode (which prints matching filenames without renaming
them):

    ~$ python3 /.../drop-redundant-dates.py <folder> <listonly>?

For instance, opening and running this script in the PyEdit IDE with the
following command-line arguments string prints all renamable files in the
tree rooted at '/MY-STUFF/Camera' without renaming (drop '-' to rename):

    /MY-STUFF/Camera -

This script may be run on any folder.  Because it detects and renames only
files in that tree already expanded to include tagpix dates, though, it's
best run on your merged folder, after running the main tagpix.py script
(unmerged files won't be detected or renamed by this script).

Normal output lines display as follows (error messages begin with "**"):

    Renaming: "from filename" => "to filename" in [folder path]

----

WHAT IT DOES

This script renames all the files in an entire folder tree that have a
redundant date in their names, to remove the extra dates.  The extra dates
are created by some source devices (notably, Android smartphones) that
don't use the 'DSC' or 'IMG' naming conventions or standards designed in
part to accommodate the FAT filesystem's 8.3 limits.
For example, tagpix.py currently expands such filenames as follows, when
moving or copying to your MERGED folders:

    '20180205_154910.jpg' => '2018-02-05__20180205_154910.jpg'

And this script shrinks them as follows (generally in your MERGED folders
only, as that is where the duplicate dates will be):

    '2018-02-05__20180205_154910.jpg' => '2018-02-05__154910.jpg'

This leaves the date added by tagpix in the filename for uniformity, and
drops the date added by the source device - which is redundant with the
goal of tagpix file renaming, but is used by only a few devices (other
names are left intact), and does not address other tagpix goals (e.g.,
merging folders and handling duplicates).

Run this on your tagpix merged tree root if filenames have been added to
it with the redundant dates.  tagpix itself may eventually drop the extra
dates automatically when renaming and moving/copying items; if it does,
this script will not be required for new items added to the merged result,
and will normally be a one-time step.  [See the UPDATE above: tagpix 2.2
did adopt automatic renaming.]

----

A FEW SPECIAL CASES

1) Duplicates potential

If the renamed target already exists in the tree, the source file is
skipped with a message.  This is unlikely for files already processed by
tagpix (which skips duplicate names with duplicate content, and adds a
unique sequence number to duplicate names with differing content), but
might arise after user manual renames.

2) Impact on other tools

If you're using a tool like thumbspage that relies on the former names of
renamed images, you may need to rerun the tool after this script.  For
thumbspage, this script renames viewer pages too (e.g.,
2018-06-21__20180621_170029.jpg.html => 2018-06-21__170029.jpg.html), but
such files' content will reference previous source-image names.  By
contrast, PyPhoto is clever enough to update its thumbnails cache file
automatically (via a delete+add) for all filenames changed.
See learning-python.com/programs.html for thumbspage and PyPhoto.

3) Differing dates

The 'alwaysdropdate' switch below is subtle.  Though rare, it's possible
that the filename dates added by Android and tagpix may differ for a given
image.  In one case, the Android date differed from the tagpix date,
because the image had been modified by a tool that discarded the image's
Exif creation tag (apparently - this is the norm for some Preview edits on
some Mac OS).  Without such tags, the best tagpix can do is file
modification date, which will differ from the original creation date added
to the filename by Android (e.g., in the case observed:
2018-08-03__20180408_073757.jpg).

Hence: turning the 'alwaysdropdate' switch on (True) strips the Android
date unconditionally; turning it off (False) prevents a rename unless the
tagpix and Android dates match, and thus retains the Android date when the
two differ.  It's not clear which policy is best; the Android date is
extra info, but doesn't quite apply to a modified image, and makes
filenames longer.  If in doubt, use True to rename in all cases (e.g.,
when used for the case observed: 2018-08-03__20180408_073757.jpg =>
2018-08-03__073757.jpg).

Note that PyPhoto doesn't drop Exif tags of images viewed, but other tools
may - including edits in Preview on Mac OS (sometimes), and auto-rotations
in thumbspage (at learning-python.com/programs.html).
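
To isolate mixed-date names before choosing an 'alwaysdropdate' setting,
something like the following standalone sketch may help.  It is not part
of this script or of tagpix: the names PATT, mixed_date_names, and
mixed_dates_in_tree are hypothetical, and the pattern is assumed to mirror
the one this script uses for matching.

```python
import os, re

# Assumed filename shape: tagpix date, '__', device date, '_', time, extension.
# [0-9] and [.] stand in for backslash classes so no escapes are needed here.
PATT = re.compile('([0-9]{4}-[0-9]{2}-[0-9]{2})__([0-9]{8})_[0-9]{6}[.].*')

def mixed_date_names(names):
    # hypothetical helper: return the names whose tagpix and device dates differ
    mixed = []
    for name in names:
        matched = PATT.match(name)
        if matched and matched.group(1).replace('-', '') != matched.group(2):
            mixed.append(name)
    return mixed

def mixed_dates_in_tree(rootdir):
    # hypothetical helper: walk a tree, yielding (folder, name) per mixed-date file
    for (dirhere, subshere, fileshere) in os.walk(rootdir):
        for name in mixed_date_names(fileshere):
            yield (dirhere, name)
```

Printing each pair from mixed_dates_in_tree('/MY-STUFF/Camera') would list
only the files that a False setting retains and a True setting renames.
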
UPDATE 2020: thumbspage now retains+updates Exif tags on rotations; see
learning-python.com/thumbspage/UserGuide.html#rotateskeepexifs
==========================================================================
"""

import re, os, sys

if len(sys.argv) < 2:                            # arg1 is required
    sys.exit('Usage: python3 drop-redundant-dates.py <folder> <listonly>?')

rootdir  = sys.argv[1]                           # arg1 is root folder path
listonly = len(sys.argv) > 2                     # any arg2 = listonly mode
print('Running on', rootdir, end=', ')
print('listonly' if listonly else 'renaming', 'mode')

alwaysdropdate = True    # if True, drop device date even if != tagpix date

testname = '2018-02-05__20180205_154910.jpg'     # sample of a matching name
frompatt = re.compile(r'(\d{4}-\d{2}-\d{2})__(\d{8})_\d{6}\..*')   # raw string

nummatches = numrenames = 0
for (dirhere, subshere, fileshere) in os.walk(rootdir):    # for all dirs in tree
    for filename in fileshere:                             # for all files in dir
        matched = frompatt.match(filename)                 # two dates at front?
        if matched is not None:
            tagpixdate = matched.group(1)
            sourcedate = matched.group(2)
            if (alwaysdropdate or                               # even if dates differ?
                tagpixdate.replace('-', '') == sourcedate):     # same/redundant dates?

                nummatches += 1
                newname = filename[0:12] + filename[21:]   # drop 2nd/redundant date
                newpath = os.path.join(dirhere, newname)
                oldpath = os.path.join(dirhere, filename)
                print('Renaming: "%s" => "%s" in [%s]' % (filename, newname, dirhere))

                if os.path.exists(newpath):
                    print('**DUPLICATE: rename target already exists, file skipped')
                elif not listonly:
                    try:
                        os.rename(oldpath, newpath)
                        numrenames += 1
                    except Exception as why:
                        print('**FAILED: rename failed, continuing with other files')
                        print('Python exception:', why)

print('Number matches, renames: %d, %d' % (nummatches, numrenames))