File: genhtml/__docs__/prior-versions/2.1/genhtml--preCamelCase.py
#!/usr/bin/python3
"""
===========================================================================================
genhtml.py - static HTML inserts
Author and copyright: M. Lutz, 2015
License: provided freely but with no warranties of any kind.
VERSION 2, November 17, 2015: smarter dependency checking
Don't regenerate an HTML file for a changed insert file, unless the HTML's
template actually USES the insert file, or an insert file that inserts it.
Also refactor as functions - at ~250 lines, top-level script code becomes
too scattered to read (and at ~1K lines, class structure is nearly required).
SYNOPSIS
Simple static HTML inserts: given an HTML templates dir and an HTML inserts
dir, generate final HTML files by applying text replacements to the templates,
where replacement keys and text correspond to insert file names and contents.
For static insert content, this script is an alternative to:
- Mass file edits on every common-item change - which can be painfully tedious
- Client-side includes via embedded JavaScript - which not all visitors may run
- Server-side includes via PHP - which makes pages unviewable without a server
This script adds a local admin step, as it must be run on every HTML content
change, but there is no HTML include; <object> doesn't work on all browsers.
COMMAND-LINE USAGE
% genhtml.py (or run via icon click)
    => Copy all changed non-HTML files in SOURCEDIR to TARGETDIR (if any)
    => Regenerate all HTML files whose SOURCEDIR template or any inserts it uses
       have changed since the HTML's last generation, per dependency tests ahead
% genhtml.py [file.htm]+
    Same, but apply to just one or more SOURCEDIR files, listed without dir name
USAGE PATTERNS
To use this script to maintain a site's files:
1) Change HTML template files in SOURCEDIR, and/or insert files in INSERTDIR.
2) Run this script to regenerate all changed HTML files in TARGETDIR as needed.
3) Upload newly-generated files (or all) from the TARGETDIR to the web server.
Do not change HTML files in TARGETDIR: they may be overwritten by generations!
There are two ways to structure a site's files:
1) Keep both HTML templates and all other site files in SOURCEDIR. In this mode,
changed non-HTML files are copied to TARGETDIR when HTML files are regenerated.
2) Use SOURCEDIR for HTML template files only, keep other web site files in TARGETDIR.
This mode avoids copying other non-HTML files to TARGETDIR on HTML regenerations.
Either way, TARGETDIR is always the complete web site, for viewing and uploads.
TEXT REPLACEMENTS
Text replacements are variable, and derived from insert folder content:
- Replacement keys are '$XXX$' for all INSERTDIR/XXX.txt filenames.
- Replacement values are the contents of the INSERTDIR/XXX.txt files.
- Algorithm:
    For each changed '*.htm' and '*.html' (caseless) HTML template in SOURCEDIR:
        For each XXX in INSERTDIR/XXX.txt:
            Replace all '$XXX$' in HTML template with the content of file XXX.txt
        Save result in TARGETDIR
Other changed non-HTML files (if any) are copied to TARGETDIR verbatim.
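The replacement algorithm above can be sketched in a few lines. This is an
illustrative sketch only, not the script's own code: the function name
expand_template and the sample keys and text are hypothetical.

```python
def expand_template(template_text, inserts):
    # Replace every '$XXX$' key found in the template with its insert text.
    # str.replace() returns the text unchanged when the key is absent.
    for key, value in inserts.items():
        template_text = template_text.replace(key, value)
    return template_text

# Hypothetical inserts, keyed the way 'XXX.txt' filenames would be
sample_inserts = {'$HEADER$': '<h1>My Site</h1>', '$FOOTER$': '<p>Bye</p>'}
page = expand_template('<body>$HEADER$ text $FOOTER$</body>', sample_inserts)
print(page)
```

Keys that do not occur in a template are simply skipped, so one inserts table
can serve every template.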
To automate changing DATES in both HTML files and insert files, the script also
replaces special non-file '$_DATE*$' keys: e.g., '$_DATELONG$' => 'November 06, 2015'.
See the script's code ahead for the full set of date keys available.
Example key=>file replacements (with possible nested inserts, described ahead):
Coded in <HEAD>
    $STYLE$  => INSERTDIR/STYLE.txt   (a <style> or <link rel...>)
    $SCRIPT$ => INSERTDIR/SCRIPT.txt  (analytics or other JS code block)
    $ICON$   => INSERTDIR/ICON.txt    (site-specific icon link spec)
Coded in <BODY>
    $FOOTER$ => INSERTDIR/FOOTER.txt  (a standard site links toolbar)
    $HEADER$ => INSERTDIR/HEADER.txt  (a standard header block)
    $TOTOC$  => INSERTDIR/TOTOC.txt   (a standard go-to-index button line)
See also "__docs__/template-pattern.html" for a skeleton use case example
file, and the "Html-templates" test folder for additional template examples.
NESTED INSERTS
To allow insert files to be built up from other insert files (in an intentionally
limited fashion), the script also replaces any '$XXX$' keys in the loaded text of
insert files, before regenerating any HTML files. For an example use case, see
FOOTER-COMMON.txt and its clients in the Html-inserts folder; it is inserted into
other footer insert files which have varying footer parts. For dependency checking,
any newer modtimes of nested inserts are also propagated to their inserter (see ahead).
Limitation: by design, nesting is only 1-level deep - an HTML template may insert an
insert file which inserts other insert files, but no more (this is not recursive!).
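As a rough illustration of what "one level deep" means, the sketch below
expands nested keys in a single pass, so that text pulled in from another
insert is not scanned again. It is an assumption-laden sketch, not the
script's method: the script uses repeated str.replace() calls, while this
uses a hypothetical helper named expand_one_level and a regular expression
that assumes keys contain only letters, digits, '_' and '-'.

```python
import re

KEY_PATTERN = re.compile('[$][A-Za-z0-9_-]+[$]')    # assumed key syntax: '$XXX$'

def expand_one_level(inserts):
    # Build new insert texts from the ORIGINAL texts, one pass per insert:
    # re.sub() does not rescan the text it substitutes in, so keys nested
    # inside an inserted insert are left unexpanded.
    def lookup(match):
        key = match.group(0)
        return inserts.get(key, key)                # unknown keys stay as-is
    return {key: KEY_PATTERN.sub(lookup, text) for key, text in inserts.items()}

# Hypothetical three-level chain: A inserts B, and B inserts C
nested = {'$A$': 'x $B$', '$B$': 'y $C$', '$C$': 'z'}
expanded = expand_one_level(nested)
print(expanded['$A$'])
```

Here '$A$' picks up the original text of '$B$' but not the text of '$C$',
which is the one-level limit described above.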
AUTOMATIC DEPENDENCY CHANGE DETECTION
This script acts like a makefile: changing files suffices to trigger regeneration
on the next run. In more detail, an HTML file is automatically regenerated if:
(a) Its HTML template file has been changed since the HTML's last generation; or
(b) Its HTML template file inserts any file that has been changed since the
HTML's last generation; or
(c) Its HTML template file inserts any file that inserts any other file which
has been changed since the HTML's last generation (nested inserts).
In other words, the script generates expanded HTML files for all HTML templates
that have no expansion yet; are newer than their expanded versions; or use any
insert file that is newer than their expanded versions. Conversely, an HTML
file is not regenerated if neither the HTML template nor any used insert is
newer than the expansion target.
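The three regeneration rules can be condensed into one decision function.
This is a minimal sketch under stated assumptions: needs_regen is a
hypothetical name, modtimes are passed in as plain floats rather than read
from files, and the 2-second allowance mirrors the default used by this
script's sourcenewer() and insertsnewer() functions.

```python
def needs_regen(template_mtime, target_mtime, used_insert_mtimes, allowance=2):
    # target_mtime is None when no expanded HTML file exists yet
    if target_mtime is None:
        return True
    cutoff = target_mtime + allowance               # tolerate coarse file timestamps
    if template_mtime > cutoff:                     # rule (a): template changed
        return True
    # rules (b) and (c): any used insert changed (nested modtimes already propagated)
    return any(mtime > cutoff for mtime in used_insert_mtimes)

print(needs_regen(100.0, None, []))                 # no target yet
print(needs_regen(100.0, 500.0, [200.0, 900.0]))    # one used insert is newer
print(needs_regen(100.0, 500.0, [200.0, 501.0]))    # within the allowance
```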
FORCEREGEN: To force a regeneration of all HTML files, open and save any insert
file used by every template (if any), or set the FORCEREGEN variable below.
When True, this switch effectively ignores dependency tests and generates all.
DATE INSERTS: By design, there is no dependency checking for non-file '$_DATE*$'
inserts (else each date-key client would be regenerated every day!). Open and
save date client files to force their regen with updated dates when appropriate.
NESTED INSERTS: When nested inserts are used, dependencies are transitive: the
modtime of any insert file is considered to be the greater of the modtime of the
insert file itself, or the modtimes of any insert files which the insert file uses.
Modtimes are propagated to nested inserters before testing template dependencies.
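Modtime propagation for nested inserts can be pictured as follows. This is a
sketch only: effective_modtimes is a hypothetical helper that works on plain
dictionaries, whereas the script itself updates a list of [key, modtime]
pairs in place.

```python
def effective_modtimes(texts, modtimes):
    # texts:    key -> insert text
    # modtimes: key -> file modtime (float seconds since the epoch)
    result = {}
    for key, text in texts.items():
        # modtimes of the other inserts whose keys appear in this insert's text
        nested = [modtimes[other] for other in modtimes
                  if other != key and other in text]
        result[key] = max([modtimes[key]] + nested)
    return result

# Hypothetical case: the footer insert uses a newer common-footer insert
texts = {'$FOOTER$': 'links $FOOTER-COMMON$', '$FOOTER-COMMON$': 'common part'}
modtimes = {'$FOOTER$': 100.0, '$FOOTER-COMMON$': 200.0}
print(effective_modtimes(texts, modtimes))
```

A template that uses only '$FOOTER$' is then regenerated when the nested
common insert changes, even though the footer file itself did not.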
MISCELLANEOUS NOTES
- Skips replacement targets not present in HTML text (replace() is a no-op).
- Skips replacement targets having no INSERTDIR file (no .replace() is run).
- Tries multiple Unicode encodings for HTML text: expand the set as needed.
- Assumes insert files are all in same Unicode encoding: change as needed.
- Changed external CSS files must be uploaded, but do not require or trigger
HTML regeneration here (unlike changed CSS <link>s or inline CSS code inserts).
- See "Programming Python, 4th Edition" for automated FTP site upload scripts.
- File modtimes are simply floats, giving seconds since the "epoch":
>>> import os, time
>>> t1 = os.path.getmtime('.')
>>> t1
1446572919.58063
>>> time.ctime(t1)
'Tue Nov 3 09:48:39 2015'
>>> t1 += 1
>>> time.ctime(t1)
'Tue Nov 3 09:48:40 2015'
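The "try multiple Unicode encodings" note can be sketched on raw bytes. This
is an illustration under assumptions, not the script's loader:
decode_with_fallback is a hypothetical name, and the encodings tuple is a
shortened example set.

```python
def decode_with_fallback(rawbytes, encodings=('ascii', 'utf8', 'latin1')):
    # Return (text, encoding) for the first encoding that decodes the bytes,
    # or (None, None) if none of them works.
    for encoding in encodings:
        try:
            return (rawbytes.decode(encoding), encoding)
        except UnicodeDecodeError:
            pass                                    # try the next encoding
    return (None, None)

# 'caf' followed by a Latin-1 e-acute byte: not ASCII, and not valid UTF-8
text, used = decode_with_fallback(bytes([0x63, 0x61, 0x66, 0xE9]))
print(used)
```

Order matters: an encoding such as 'latin1' accepts any byte sequence, so
encodings listed after it are never tried.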
TBDs:
- Nested inserts could be allowed to be arbitrarily deep, rather than limiting
them to just one level. This entails substantial change and extra complexity,
though, which has not yet been justified by the author's use cases.
- Subdirectories are not directly supported, though they can be maintained
as separately-generated and uploaded working folders.
- Automatic "<!-- -->" comment wrappers could be emitted, but they may be
invalid for shorter text inserted into other lines (versus text blocks).
- It might be useful to parameterize inserts in some fashion. For instance,
between '@' delimiters, allow a script name and arguments defining a
command line whose stdout gives the insert text. This is an order of
magnitude more complex, though, and is not warranted by any use case so far.
===========================================================================================
"""
import os, sys, shutil, time
trace = lambda *args: None # set to print to see more output
# user settings
INSERTDIR = 'Html-inserts' # insert text, filename gives key: XXX.txt.
SOURCEDIR = 'Html-templates' # load html templates (and others?) from here
TARGETDIR = 'Complete' # save expanded html files (and others?) to here
CLEANTARGET = False # True = empty all files in TARGETDIR first
FORCEREGEN = False # True = regenerate all HTML files, ignoring dependencies
# customize Unicode encodings if needed
TemplateEncodings = ('ascii', 'utf8', 'latin1', 'utf16') # try each, in turn
# note: 'latin1' decodes any byte sequence, so 'utf16' is never reached in this order
InsertsEncoding = 'utf8' # use for all inserts
def loadinserts():
    """
    load insert files/keys and modtimes
    """
    inserts, insmodtimes = {}, []
    for insertfilename in os.listdir(INSERTDIR):
        insertkey = '$' + insertfilename[:-4] + '$'             # key='$XXX$' from 'XXX.txt'
        try:
            path = os.path.join(INSERTDIR, insertfilename)      # load insert text for key
            with open(path, encoding=InsertsEncoding) as file:  # closed on exit, even on error
                text = file.read()
            inserts[insertkey] = text
            insertmodtime = os.path.getmtime(path)              # modtime for changes test
            insmodtimes.append([insertkey, insertmodtime])      # add file-based inserts only
        except Exception:
            inserts[insertkey] = ''                             # empty if file error
    return inserts, insmodtimes
def adddateinserts(inserts):
    """
    add special non-file replacement keys (evolve me)
    """
    inserts['$_DATELONG$']  = time.strftime('%B %d, %Y')    # November 06, 2015
    inserts['$_DATESHORT$'] = time.strftime('%b-%d-%Y')     # Nov-06-2015
    inserts['$_DATENUM$']   = time.strftime('%m/%d/%Y')     # 11/06/2015
    inserts['$_DATETIME$']  = time.asctime()                # Fri Nov 6 10:44:58 2015
    inserts['$_DATEYEAR$']  = time.strftime('%Y')           # 2015
def propagatemodtimes(insmodtimes):
    """
    for nested inserts, the modtime of an insert file is considered
    to be the greater of that of the file itself and that of any file
    it inserts; propagate newer modtimes from nested inserts to their
    clients before expanding nested inserts or HTML templates;
    """
    # note: reads the module-level 'inserts' dict created by the top-level code
    for pair1 in insmodtimes:                       # pairs are mutable lists
        (key1, modtime1) = pair1
        greater = modtime1
        for (key2, modtime2) in insmodtimes:
            # compare to the running maximum, so the newest nested modtime wins
            if (key2 in inserts[key1]) and (modtime2 > greater):
                greater = modtime2
        pair1[1] = greater
def expandinserts(inserts):
    """
    globally replace any keys in loaded insert-file text
    """
    for key1 in inserts:                                # for all insert texts
        text = inserts[key1]
        for key2 in inserts:                            # for all insert keys
            text = text.replace(key2, inserts[key2])    # no-op if no match
        inserts[key1] = text                            # inserts changed in-place
def sourcenewer(pathfrom, pathto, allowance=2):
    """
    was pathfrom changed since pathto was generated?
    2 seconds granularity needed for FAT32: see mergeall;
    this and its follower assume both file paths exist;
    """
    fromtime = os.path.getmtime(pathfrom)
    totime   = os.path.getmtime(pathto)
    return fromtime > (totime + allowance)
def insertsnewer(textfrom, pathto, insmodtimes, allowance=2):
    """
    was any used insert file changed since pathto was generated?
    2 seconds granularity needed for FAT32: see mergeall;
    this could use any() and generators...but should it?
    """
    # for all insert files, check if newer and used
    totime = os.path.getmtime(pathto)
    for (inskey, instime) in insmodtimes:
        if instime > (totime + allowance) and inskey in textfrom:
            return True
    return False
def loadtemplate(pathfrom):
    """
    try to load an HTML template file, using various Unicode types,
    from simpler to more complex; return tuple of text + encoding;
    """
    for encoding in TemplateEncodings:
        try:
            # 'with' closes the file on both success and decoding failure
            with open(pathfrom, mode='r', encoding=encoding) as file:
                text = file.read()
            return (text, encoding)                 # success: return now
        except Exception:
            trace(encoding, 'invalid, ', sys.exc_info()[0])
    return (None, None)                             # no encoding worked
def generatehtmls(filestoprocess, inserts, insmodtimes):
    """
    generate expanded HTML files for all HTML templates that have no
    expansion yet, are newer than their expanded versions, or use any
    insert file that is newer than their expanded versions;
    """
    global numcnv, numcpy, numskip, numfail
    for filename in filestoprocess:
        print('=>', filename, end=': ')
        pathfrom = os.path.join(SOURCEDIR, filename)
        pathto   = os.path.join(TARGETDIR, filename)
        if not os.path.isfile(pathfrom):
            # skip any subdirs, etc.
            print('non-file, skipped')
            numskip += 1
        elif not filename.lower().endswith(('.htm', '.html')):
            #
            # non-html file: don't attempt regen, copy to target if changed;
            # used when entire site in templates dir, not just templates
            #
            if os.path.exists(pathto) and not sourcenewer(pathfrom, pathto):
                # source file unchanged, don't copy over
                print('unchanged, skipped')
                numskip += 1
            else:
                # copy in binary mode unchanged; 'with' closes both files
                with open(pathfrom, mode='rb') as file:
                    rawbytes = file.read()
                with open(pathto, mode='wb') as file:
                    file.write(rawbytes)
                shutil.copystat(pathfrom, pathto)           # copy modtime over too
                print('COPIED unchanged')
                numcpy += 1
        else:
            #
            # html file: regen to target if html or used inserts changed;
            # whether templates dir is entire site, or templates only
            #
            (text, encoding) = loadtemplate(pathfrom)
            if text is None:
                print('**FAILED**')
                numfail += 1
            elif (os.path.exists(pathto)                            # target generated
                  and not (
                      sourcenewer(pathfrom, pathto) or              # source not changed
                      insertsnewer(text, pathto, insmodtimes) or    # no used insert changed
                      FORCEREGEN                                    # not forcing full regen
                  )):
                # neither html template nor any used insert newer than target
                print('unchanged, skipped')
                numskip += 1
            else:
                # globally replace keys in text and copy over;
                # no copystat(): insertsnewer() needs new modtime
                for key in inserts:                                 # for all filename keys
                    text = text.replace(key, inserts[key])          # no-op if no match
                with open(pathto, mode='w', encoding=encoding) as file:   # encoding=source's
                    file.write(text)
                print('GENERATED, using', encoding)
                numcnv += 1
if __name__ == '__main__':
    """
    top-level code
    """
    numcnv = numcpy = numskip = numfail = 0         # globals all
    # empty target dir?
    if CLEANTARGET:
        for filename in os.listdir(TARGETDIR):
            os.remove(os.path.join(TARGETDIR, filename))
        print('--Target dir cleaned')
    # load/add inserts text
    inserts, insmodtimes = loadinserts()
    adddateinserts(inserts)
    print('Will replace all:',
          *(key for key in sorted(inserts)), sep='\n\t', end='\n\n')
    # copy newer modtimes of nested inserts to clients
    propagatemodtimes(insmodtimes)
    # expand nested inserts replacements first
    expandinserts(inserts)
    # check run mode
    if len(sys.argv) == 1:
        filestoprocess = os.listdir(SOURCEDIR)      # all files in source dir
    else:
        filestoprocess = sys.argv[1:]               # or just filename(s) in args
    # expand and copy templates, copy others
    generatehtmls(filestoprocess, inserts, insmodtimes)
    # wrap up
    summary = '\nDone: %d converted, %d copied, %d skipped, %d failed.'
    print(summary % (numcnv, numcpy, numskip, numfail))
    if sys.platform.startswith('win'):
        input('Press enter to close.')              # retain shell if clicked