
PP4E: Updates Page


Last revised: July 2018

This page collects notes, updates, and a few selected errata related to the book Programming Python, 4th Edition (PP4E). It does not list every erratum reported over the years, because that's now the role of the publisher's errata page, described ahead. Instead, this page assumes this book's more-advanced audience will be tolerant of a few inevitable typos in a 1600-page technical book, and focuses on providing supplemental content for readers.

Related Resources

For more book resources, be sure to also see the following external pages, some of which are newer than this page, and continue its mission:

Content Here

The following lists summarize this page's content for this edition of the book. If this page's posts ever had a specific sequential ordering, it has been long lost to the ravages of time and edits (alas, this page has often had to make do with scraps from the attention table). Today, items here may be better accessed randomly using the topic groupings of the content lists below.

Note that only the last section below is true errata (book corrections), patched in later reprints. The other items here form an informal book "blog" of sorts. If you're looking for a complete corrections list, or find a new issue you wish to report, please see the publisher's errata page for this book; I'm automatically emailed posts made at that page. In recent years, that errata page has also grown to host a few answers to reader questions and general book notes not duplicated here, and should be considered an extension to this page.

General Book Notes

Example-Specific Notes

Python Changes Since Publication

Supplemental Examples

Book Corrections



More Bonus Examples: Folders Sync, Calendar GUI

[May-5-14] New book-related example: Mergeall consists of a script and GUI that synchronize directory trees, and can provide both an incremental updates tool and a manual alternative to cloud storage. Mergeall's main script reuses a number of directory-processing examples that appear in the book's systems programming part. This program's coverage includes code and screenshots, but grew too long for inline treatment here, and was moved off page.

Click here to go to the Mergeall page

As you'll find on that page, Mergeall eventually was ported to Mac OS too, and packaged as both source code and standalone executables for Mac, Windows, and Linux. It's a realistically scaled project that's too large to cover in the book, but suggested as follow-up study for readers.

Update, 2017 For more book-related example code, see also the newer Frigcal calendar GUI example, which showcases Python's tkinter GUI library covered extensively in the book, and the even newer programs page, which leads to upgraded versions of many of the book's major examples—most notably, PyEdit, PyMailGUI, and PyGadgets.

[Back to Index]


Running Book Examples on Linux — and Mac OS

[May-5-14] I've recently begun running book examples on a Linux dual-boot system, under Fedora 20 and Gnome 3 (and later, Ubuntu). This is partly in response to the focus in Windows on clouds, subscriptions, advertising, and devices that are proprietary and seem intentionally crippled, but that's not what this note is about (see the related post).

So far, all the major GUI-based examples work well unchanged, with one minor exception: the script used to launch PyMailGUI after selecting from one of N email accounts contains an unfortunately nonportable and hardcoded Windows path. You may never encounter this; PyMailGUI can be run directly too—and is by the book's demo launchers—and this script is in part coded to work when PYTHONPATH has not been set to include the book's examples root. But to fix the account-selector script so it works on Linux too, in this book examples tree file:

.../PP4E/Internet/Email/PyMailGui/altconfigs/launch_PyMailGui.py
change the 2nd-from-last line from the first of the following to the second, in order to pick up the underlying platform's path separator portably:
os.environ['PYTHONPATH'] = r'..\..\..\..\..'                             # hmm; generalize me
os.environ['PYTHONPATH'] = '..%s..%s..%s..%s..' % ((os.path.sep,) * 4)   # hmm; generalize me
You may also want to change the last line from os.system('PyMailGui.py') to something like os.system('python3 PyMailGui.py') in order to force 3.X execution on Linux for the spawned PyMailGUI, but this depends on your system's links and configuration (Python Windows launcher settings don't apply in any event—you'll need to specialize this line's code per sys.platform if needed). There are undoubtedly other Linux portability issues in smaller book examples, especially those in the Systems section; more here as they surface. See also: tkinter Linux portability notes elsewhere on this page.
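As a minimal sketch of both changes (not the launcher's actual code), the relative path can be built from os.pardir and os.path.join, and the spawn command can be chosen per sys.platform. The helper names and the python3 command name below are assumptions for illustration; the right command depends on your system's links and configuration.

```python
import os
import sys

def examples_root(count=5):
    """Join `count` parent-folder references with the platform's own
    separator; equivalent to the % formatting expression shown above."""
    return os.path.join(*([os.pardir] * count))

def launch_command(script='PyMailGui.py'):
    """Pick a spawn command per platform (hypothetical helper)."""
    if sys.platform.startswith('win'):
        return script                 # Windows: rely on filename association
    return 'python3 ' + script        # assumes a python3 command is on the path

# Usage in a launcher (commented out so this sketch has no side effects):
# os.environ['PYTHONPATH'] = examples_root()
# os.system(launch_command())
```

Building the path this way avoids hardcoding either separator, so the same line works on Windows, Linux, and Mac OS.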

Update, 2018 Subsequent to this post, the book's major examples were also ported to and run successfully on Mac OS. For a sampling of some of the issues this platform poses, both GUI-related and general, see the release notes of:

Each of the above programs is also available in source-code form that illustrates Mac coding. Python and tkinter may be relatively portable, but some platform-specific divergence is unavoidable. The Mac's richer GUI experience—including global toolbars, slide-down dialogs, and app state—implies unique coding requirements.

[Back to Index]


Python Email Package Surrogates Bug in 3.3.3, Fixed in 3.3.4

Short story: due to a temporary regression in Python's email package, you probably should not run the book's PyMailGUI email client on Python 3.3.3. Instead, use any other Python 3.X version—3.1 or 3.2; 3.3.0 through 3.3.2; 3.3.4 or later 3.3; or 3.4.0 or later 3.4.

The Regression in 3.3.3

[Mar-4-14] In 3.3.3 only, Python's email package changed in a way that broke this book's PyMailGUI. The break occurs when replying to or forwarding a message whose main body text contains a non-ASCII character that was encoded per base64 or quoted-printable in the original message. Such email messages worked fine in PyMailGUI from Pythons 3.1 through 3.3.2. In 3.3.3, though, a simple slanted quote or emdash suffices to cause problems; when such characters are present in the body text, PyMailGUI doesn't crash, but the message can't be sent, and the GUI displays an error dialog with text:

Send failed:
<class 'UnicodeEncodeError'>
'utf-8' codec can't encode character '\udce2' in position 688:
surrogates not allowed

This makes no sense, given that surrogates are supposed to be employed in the email package's new bytes API only—an API which PyMailGUI predates, and does not use in any way (PyMailGUI decodes message full-text to str text instead). The 3.3.3 email package must be mutating the already-decoded body text, and inserting surrogate Unicode escapes—something it absolutely should not do, and did not do until the 3.3.3 point release.
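For readers unfamiliar with the error text, the following sketch shows in general Python terms where a character like '\udce2' comes from. It does not reproduce the 3.3.3 email package bug itself. It only illustrates the surrogateescape error handler, and why a later strict UTF-8 encode of its result fails with the message shown above.

```python
# A raw byte that is not valid ASCII, smuggled into a str via surrogateescape
raw = b'\xe2'
text = raw.decode('ascii', errors='surrogateescape')    # lone surrogate '\udce2'

# A strict UTF-8 encode refuses lone surrogates
try:
    text.encode('utf-8')
    reason = None
except UnicodeEncodeError as exc:
    reason = exc.reason                                  # the codec's complaint

# The same handler on output restores the original byte
restored = text.encode('ascii', errors='surrogateescape')
```

Text that was decoded normally, as PyMailGUI's is, should never contain such characters, which is why their appearance points to the library rather than the program.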

Timing: because the error occurs on Send in the Message.set_payload() call following the fix_text_required() workaround for an earlier email issue, a change in character-set output encoding logic is the prime suspect. Before this point, the fetched raw text of mails is correct (double-clicks show its original encoded form), as is the result of mail parsing (View, Reply, and Fwd all display correctly decoded text, including any non-ASCII characters). Both failing cases observed were attempting to encode text per UTF-8 plus base64 on Send. In any event, the next section makes this largely a moot point.

The Repair in 3.3.4

The good news is that this Python regression appears to have been present in just one point release, and was fixed quickly. It has been observed in 3.3.3 only (plus an early 3.4 beta which inherited the issue temporarily). It is not present in 3.3.2, and is fixed as of 3.3.4. Its repair was also propagated to later 3.4 prereleases. Since the latest official 3.3 and 3.4 downloads available at python.org—currently 3.3.4, 3.3.5rc2, and 3.4.0rc2—do not have the problem, its impact should be minimal.

PyMailGUI itself was coded for Pythons 3.1 and 3.2, current at book development time, but is known to work well through 3.4.0, apart from this temporary surrogates issue in 3.3.3. I use this program constantly, but only recently discovered the issue when using a newer 3.3.3 install. It's less than ideal for point releases to break working programs this way, of course, but mistakes happen, and programs like PyMailGUI have to mind the bleeding edge of Python releases more than most; book readers tend to prefer the latest Python either way.

For examples of other PyMailGUI breakages caused by Python email package changes, see the patch for item #3 on this page; it's been a potential source of problems with each new Python release installed. I've also observed some Windows line-break strangeness in recent email package versions (text is sometimes saved as one long line), but this is to be investigated. In the end, this makes for a reasonable lesson in itself: library dependencies are an unavoidable aspect of real-world software development.

Footnote: You can verify the Python version that PyMailGUI is using by clicking Write, entering the following program code in the Write window's main text area, and then clicking its Tools -> Run Code menu option; this is a feature of the PyEdit component, which runs the edited text as program code, and shows its output in the console window where PyMailGUI was launched (don't try that in Outlook...):

import sys
print(sys.version)

Update, 2018 PyMailGUI eventually broke its dependencies on the latest-and-greatest Python release by providing standalone executable packages for each major desktop platform. These "frozen" executables simplify installs and better integrate with platform GUI metaphors. Perhaps more importantly, by bundling specific and verified versions of Python and Tk, they grow immune to future changes in either. The downside of such packages is extra build complexity beyond this update's scope; see PyMailGUI's home page and build folder for more details.

[Back to Index]


How a Monkey Broke PyMailGUI... (Tk Unicode-Limit Patch)

Short story: Tk, the library underlying the tkinter GUI module used in book examples, does not currently support some Unicode characters. If unsupported characters may crop up, programs need to replace these for display to avoid possibly uncaught exceptions. PyMailGUI, PyEdit, and other book-related programs now do.

The Tk Unicode Issue

[Jan-16-14] After using the threaded PyMailGUI on a daily basis for 8 years (more than 3 in its latest 4th Edition form), a new issue cropped up when someone sent an email whose alternative text part contained a Unicode character not supported by the underlying Tk GUI library—character 🙊, which is Unicode codepoint U+1f64a and u'\U0001F64A' in Pythonese, the "Speak-No-Evil Monkey" character (no really; look it up). The Tk GUI system can't handle character codes over 16 bits like this one (technically, outside the "BMP"), and PyMailGUI relies on Tk's rendering prowess to do the right thing for Unicode, as described in the book; see pages 538-548.

As is, PyMailGUI reports the Tk error message in the console window and doesn't crash per se, but the GUI is partly disabled, because this error is raised and uncaught in a thread-exit callback, thus preventing a thread-busy lock from being released... which in turn disables future Loads, Views, Deletes, and Quits (in fact, Task Manager may be required to close the GUI on Windows, and similar elsewhere).

To do better, fetch this updated ViewWindows.py, and copy it into your book example tree's PP4E\Internet\Email\PyMailGUI folder. It simply catches the Tk library exception, displays a popup and stack trace, and continues, so that thread-busy locks are released. Search for "1.5" in the file for more on the changes; the too-large Unicode character also triggers a Tk exception in other places (e.g., viewing the text part later), but these are already caught and reported with popups, and don't impact thread locks.
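The general idea of the fix can be sketched as follows. This is not the code in ViewWindows.py: the lock, the function name, and the report callback are hypothetical stand-ins, and a plain threading.Lock is assumed in place of whatever busy flag a real program uses.

```python
import threading
import traceback

view_busy = threading.Lock()       # hypothetical stand-in for a thread-busy lock

def finish_view(render, report=print):
    """Run a thread-exit GUI callback, releasing the busy lock no matter what.

    render: callable that updates the display (in tkinter, it may raise a
            TclError for characters Tk cannot handle)
    report: callable that shows the error text (a popup in a real GUI)
    """
    try:
        render()
    except Exception:
        report('Display failed:\n' + traceback.format_exc())
    finally:
        view_busy.release()        # always runs, so later actions are not blocked
```

The finally clause is the important part: without it, an uncaught exception in the callback leaves the lock held, which is the failure mode described above.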

More PyMailGUI Ideas

From the semi-related-topics department: additional PyMailGUI changes to support POP over SSL, SMTP over SSL/TLS, and POP servers that limit logins by time (thereby perhaps requiring a single persistent login instead of one login per transaction) are in progress, but are also a suggested exercise. Accounts on outlook.com are one motivation for some of these mods. The first two (SSL/TLS) are now supported in Python's libs, but not yet in PyMailGUI; see Python manuals for usage details, and this related note on this page.
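As a starting point for the SSL/TLS exercise, here is a minimal sketch using the standard library's poplib and smtplib. The helper names, default ports, and timeout are assumptions for illustration, not PyMailGUI or mailtools code, and server requirements vary by provider.

```python
import poplib
import smtplib
import ssl

def open_pop_ssl(server, user, password, port=995, timeout=20):
    """Connect and log in to a POP server over SSL (hypothetical helper)."""
    conn = poplib.POP3_SSL(server, port, timeout=timeout)
    conn.user(user)
    conn.pass_(password)
    return conn                      # caller should call conn.quit() when done

def open_smtp_secure(server, user, password, use_ssl=False, port=None, timeout=20):
    """Connect and log in to an SMTP server, either over SSL from the start
    or by upgrading a plain connection with STARTTLS (hypothetical helper)."""
    if use_ssl:
        conn = smtplib.SMTP_SSL(server, port or 465, timeout=timeout)
    else:
        conn = smtplib.SMTP(server, port or 587, timeout=timeout)
        conn.starttls(context=ssl.create_default_context())
    conn.login(user, password)
    return conn                      # caller should call conn.quit() when done
```

Neither function is called here because both need a real account. In a program like PyMailGUI, the choice between the two SMTP modes would naturally come from a configuration setting.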

Update, 2016 For much more on the Tk Unicode limitation, see its later description in the Frigcal docs. That program confronts the same issue, in the context of calendars and events. This is most egregious to people who abuse emojis (hey—you know who you are!).

Update, 2017 The Tk Unicode limitation in PyMailGUI—and the PyEdit component it uses—was eventually addressed more globally in its standalone release available here. To fix, all non-BMP Unicode characters are now replaced with the Unicode replacement character � for display (until Tk supports more of Unicode, including emojis). There's more on the issue at large in standalone PyEdit's user guide; search for its "About emojis" notebox.
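The replacement idea is simple enough to sketch in a few lines. This is an illustration of the technique only, not the code in the standalone release; the function name is hypothetical.

```python
def fix_for_tk(text, replacement='\ufffd'):
    """Replace every character outside the BMP (code point above U+FFFF)
    with the Unicode replacement character, for display purposes only."""
    return ''.join(ch if ord(ch) <= 0xFFFF else replacement for ch in text)

shown = fix_for_tk('see no evil \U0001F64A, hear no evil')
```

Because the substitution is applied only to the text handed to the GUI, the original message content can be kept intact elsewhere.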

Update, 2018 Support for email servers that use or require SSL/TLS also eventually found its way into PyMailGUI's standalone release; see its change log and mailconfig files for details.

[Back to Index]


A Patch for Running PyMailGUI on Python 3.3 (and Later)

[Sep-15-13] Per the description elsewhere on this page, a Python 3.3 standard library change broke some email address displays of non-ASCII names in PyMailGUI, the largest example in the book. In short, the Python 3.3 email package's formataddr utility function now applies a new automatic MIME encoding for names, which it did not in the past—a curious and undocumented incompatible change, which did not account for display-oriented use cases, and can break code that worked well under Pythons 3.0 through 3.2 (including some in this book). Luckily, this is fairly easy to repair.

To apply and use the patch for this Python change, simply fetch the following two files, and copy them to the PP4E\Internet\Email\PyMailGui folder in your book examples tree, per the more detailed instructions in the first of these:

  1. py33patch.py—the patch to import, with self-test and docs
  2. SharedNames.py—replacement for this file with required patch import
The first file updates the email package to be backward compatible for the duration of the PyMailGUI program's run only. The second file, an existing part of PyMailGUI, is simply augmented to import the first. The first file also has additional documentation on the issue and its patch—see its comments for more details.

(And yes, this is a module-level example of what's called "monkey patching" today, though applying a new label to an old technique doesn't necessarily make it any more palatable...)
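For readers new to the technique, a module-level monkey patch looks roughly like the following. This is a simplified sketch, not the contents of py33patch.py: the replacement function is a hypothetical one that formats names without MIME encoding, in the style of the older behavior.

```python
import re
import email.utils

_specials = re.compile(r'[\[\]\\()<>@,:;".]')    # characters that force quoting
_escapes = re.compile(r'[\\"]')                   # characters to backslash-escape

def formataddr_plain(pair, charset='utf-8'):
    """Format (name, address) for display, leaving non-ASCII names as text.
    The charset argument is accepted for call compatibility but ignored."""
    name, address = pair
    if not name:
        return address
    quotes = '"' if _specials.search(name) else ''
    name = _escapes.sub(r'\\\g<0>', name)
    return '%s%s%s <%s>' % (quotes, name, quotes, address)

# Result of the library's own function on a non-ASCII name (MIME-encoded in 3.3+)
encoded = email.utils.formataddr(('Jürgen', 'j@example.com'))

# The patch itself: rebind the name in the library module for this process only
email.utils.formataddr = formataddr_plain
plain = email.utils.formataddr(('Jürgen', 'j@example.com'))
```

Note that a rebinding like this affects only code that looks the function up through the email.utils module at call time. Any module that already ran "from email.utils import formataddr" keeps its original reference.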

Update, Oct-15-13 There is a new 1.4 release of the book's examples package which incorporates the small Python 3.3 patch described above. Get the new examples release here, and read about its 3.3 (and later) changes here.

Update, Nov-26-13 It has now been verified that this patch and the 1.4 examples release also suffice to make the book examples described here work under Python 3.4, per its beta releases.

Update, Oct-1-15 It's now also been verified that this patch and the 1.4 examples release suffice to make the book examples mentioned here work under Python 3.5, per its final 3.5.0 release.

Update, 2018 Naturally, this patch was also present in PyMailGUI's later standalone release. This release used Python 3.5 in 2017 and still does; later 3.Xs are not compelling enough to offset revalidation and redistribution costs (despite the PR).

[Back to Index]


Default Port Numbers in Preview's webserver.py

[Aug-22-13] Because it is now officially an FAQ, this post includes the important bits from a dialog with a reader who was having trouble running the web examples in the preview chapter of the book. In short, on some machines you may need to change the hardcoded port number used in this script to something other than "80," and list it in the URL explicitly (and read ahead in the book itself to the full coverage of this subject in later sections).

> > -----Original Message-----
> > To: lutz@rmi.net
> > Subject: Programming Python 4th Ed
> > Date: Wed, 21 Aug 2013 13:49:19 +0100
> > 
> > Dear Sir
> > 
> > Programming Python 4th Edition.
> > I'm stuck on page 53. Example 1-30 runs ok but I can't get example 
> > 1-31 to reply. When I run the html script it lists the contents of 
> > 1-31, How do I get it to execute 1-31?
> > I am using Python 3.3 on Windows 7.
>
>  
> -----Original Message-----
> From: Mark Lutz [mailto:lutz@rmi.net] 
> Sent: 21 August 2013 17:14
> Subject: Re: Programming Python 4th Ed
> 
> There are too many things that can go wrong in the Web realm to offer advice
> based on your email (including but not limited to running the web server
> shown a page or two ahead).  My advice is to read ahead to the server side
> scripting chapter of the book for the full story on the Web/CGI domain.
>
>
> -----Original Message-----
> To: 'Mark Lutz' 
> Subject: RE: Programming Python 4th Ed
> Date: Thu, 22 Aug 2013 11:39:13 +0100
> 
> Hi Mark
> Thanks for your prompt reply. I followed your advice to read further.
> When I run Example 1-32 Pg56 webserver.py I get the following output:-
> 
> Python 3.3.0 (v3.3.0:bd8afb90ebf2, Sep 29 2012, 10:55:48) [MSC v.1600 32 bit (Intel)] on win32
> Type "copyright", "credits" or "license()" for more information.
> >>> ================================ RESTART================================
> >>> 
> Traceback (most recent call last):
>   File "C:\Users\...\Documents\AAAPROJECTS\COMPUTERSCIENCE\PYTHON\PROGPY33\C01\webserver.py", line 15, in 
>     srvrobj  = HTTPServer(srvraddr, CGIHTTPRequestHandler)
>   File "C:\Python33\lib\socketserver.py", line 430, in __init__
>     self.server_bind()
>   File "C:\Python33\lib\http\server.py", line 135, in server_bind
>     socketserver.TCPServer.server_bind(self)
>   File "C:\Python33\lib\socketserver.py", line 441, in server_bind
>     self.socket.bind(self.server_address)
> OSError: [WinError 10013] An attempt was made to access a socket in a way forbidden by its access permissions
> >>>
> 
> Could you please help me with "socket access permissions" as I am new to web
> programming. This is the reason I purchased your book to extend my Python
> Programming into web applications.
> 
> I am running on my own Dell desktop as administrator using Windows 7
> Internet Explorer 10.
> 

The webserver script works fine for me on Python 3.3 and Windows 7.
Running the server in a Command Prompt window:

  c:\PP4E\Examples\PP4E\Preview> py -3.3 webserver.py
  127.0.0.1 - - [22/Aug/2013 09:04:08] code 404, message File not found
  127.0.0.1 - - [22/Aug/2013 09:04:08] "GET /favicon.ico HTTP/1.1" 404 -
  127.0.0.1 - - [22/Aug/2013 09:04:17] "POST /cgi-bin/cgi101.py HTTP/1.1" 200 -
  127.0.0.1 - - [22/Aug/2013 09:04:17] command: C:\Python33\python.exe -u 
  c:\PP4E\Examples\PP4E\Preview\cgi-bin\cgi101.py ""
  127.0.0.1 - - [22/Aug/2013 09:04:17] CGI script exited OK

And responding to this URL typed in a web browser window:

  http://localhost/cgi101.html

Probably, you cannot run a server on port #80 (the script's default)
on your machine, because it is locked down by something else (e.g., 
virus software?).  Try changing the port# in the webserver script,
and then name the port# in the URL explicitly: 

  port = 8080   # default http://localhost/, else use http://localhost:xxxx/

  c:\PP4E\Examples\PP4E\Preview> py -3.3 webserver.py

  http://localhost:8080/cgi101.html

Or, pass the port# in on the command line to the expanded version 
of this script that appears later in the book, and run the examples
in that later section's directory: 

  c:\PP4E\Examples\PP4E\Internet\Web> py -3.3 webserver.py . 8080
  webdir ".", port 8080
  ...server log...

  http://localhost:8080/languages.html

This is all explained in detail later in the book in the 
server-side scripting chapter.  It's also mentioned in the 
preview chapter you're reading; quoting from page 57:

"""
One pragmatic note here: you may need administrator privileges 
in order to run a server on the script’s default port 80 on some
platforms: either find out how to run this way or try running on 
a different port.  To run this server on a different port, change
the port number in the script and name it explicitly in the URL 
(e.g., http://localhost:8888/).  We’ll learn more about this 
convention later in this book.
"""

If changing port #s doesn't suffice, I'm afraid there's
nothing more I can offer; server setup is widely variable,
and may require some supplemental exploration.

Best wishes,
--Mark Lutz  (http://learning-python.com, http://rmi.net/~lutz)
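The port advice in the reply above can also be checked in code before launching the server. The following is a minimal sketch, not part of the book's webserver.py; the function names and the candidate port list are assumptions for illustration, and it tests the local loopback address only.

```python
import socket

def can_bind(port, host='127.0.0.1'):
    """Return True if a TCP socket can be bound to (host, port) right now."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
    except OSError:              # permission denied, address in use, and so on
        return False
    else:
        return True
    finally:
        sock.close()

def pick_port(candidates=(80, 8080, 8888)):
    """Return the first candidate port that can be bound, or None."""
    for port in candidates:
        if can_bind(port):
            return port
    return None
```

If pick_port() returns something other than 80, use that number for the port variable in the script, and name it explicitly in the URL, as described above.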

[Back to Index]


PyMailGUI: Enhancements Summary, Screenshots

Short story: this section includes a now-dated summary of post-publication changes to PyMailGUI, followed by updates on that program's later evolution. Be sure to also see this section's updates ahead for more recent information.

PyMailGUI Post-Publication Changes Summary

[Oct-1-11] After using the book's PyMailGUI email client for just over a year, I've collected a list of additional enhancements beyond those already described in the book (see the original enhancements list at the end of PyMailGUI's Chapter 14). I am the entire testing department and user base for this program, so some issues have taken longer to shake out than others. The following is a list of all these additional PyMailGUI enhancements discovered and applied after the book was published, for completeness; their write-ups are located elsewhere on this page:

1 Feb-01-11 Using POP and SMTP timeout parameters (patched in 1.2, and book) write-up
2 Jan-10-11 Closing temporary output files for HTML-only emails (patched in 1.2, and book) write-up
3 Aug-08-11 Decoding and encoding non-ASCII attachment filenames (patched in 1.3, and book) write-up
4 Oct-01-11 Improved sent-time display in list windows (patched in 1.3) write-up
5 Sep-29-11 Delete and Save timing issue, rare bug (patched in 1.3) write-up
6 Jul-29-11 Using authenticating SMTP servers for sends in mailconfig (patched in 1.3) write-up

Interestingly, two of these changes, #1 and #3, are also inherited by the less functional PyMailCGI webmail example of Chapter 16, because they were applied in the common mailtools package. There were a handful of additional changes made in the examples package and their book listings (e.g., a focus fix in the PyEdit component used by PyMailGUI); see the change log as well as the changes' write-ups on this page for more details.

Screenshots

To sample the effect of changes #3 and #4 above, see the following PyMailGUI screenshots:

The support for non-ASCII attachment filenames and local-relative time in these is new in the 1.3 example package, but the rest is original behavior. See the book for more on PyMailGUI's i18n and Unicode support in other headers and mail content.

Update: New Examples Release 1.3

[Oct-19-11] I've posted a new release of the book examples package, version 1.3, which has patches for all 6 of the PyMailGUI updates listed in this section above. The first two of these were already patched in release 1.2, but the rest are new in 1.3. Get the change log and the complete new examples zip file at O'Reilly's site, or fetch just the files changed within it in this zip file. See the book for details on running PyMailGUI in the examples package (via auto launchers, command lines, etc.). I don't distribute this program standalone, partly because it uses many other files in the book examples tree, and partly because the book is its documentation (but see the later update ahead).

One admin note: Some of the changes made in version 1.3 of the examples package are too large to find their way into reprints of the book itself, but I recommend using the new version in general, and studying the files changed to see what was involved; it's a fair example of code maintenance in action. For changes too big to merge into the book, versions of the changed source files which mirror the code in the book are retained in the examples package with a "BOOK-" name prefix. For details, see the change log which is also file changes\CHANGES.txt in the examples package.

Update: Patch for Running on Python 3.3

[Sep-15-13] A simple patch for an address-display issue introduced by a change in Python 3.3's email package is included in version 1.4 of the examples package. For details, see the bug, as well as its fix.

Update: PyMailGUI (and PyEdit) Standalone Release

[2018] PyMailGUI eventually was released as a standalone product (apart from the book) in 2017, with numerous enhancements not noted here, and ports to Linux and Mac OS. You can read a summary of all its post-publication changes here, and trace its evolution at its home page and user guide. The related PyEdit program was similarly upgraded, ported, and packaged at the same time; its post-publication changes history is chronicled here. Book readers are encouraged to begin by studying the base versions of these programs in the book, and move on to explore the new standalone versions' code later. The newer versions are more polished for general use and portability, but retain the original versions' core ideas.

[Back to Index]


Python 3.2.0 Breaks Scripts Using input() on Windows (LP4E)

[Jan-4-12] If a book example which uses the input() built-in seems to be failing, and you are using Python 3.2.0 in a Windows console window, see this post on Learning Python 4E's update pages.

This built-in was apparently broken temporarily in 3.2.0 (3.2) in Windows console mode, but has been fixed in later Python releases. The quickest fix is to upgrade to 3.2.1 or later, or try a different environment; the book examples work fine in all other Pythons and most other contexts such as IDLE. Scripts in both books may be impacted by this regression.

[Back to Index]


Python 3.2 Removes struct.pack Functionality for str (LP4E)

[Jan-11-12] Another cross-post from the Learning Python update pages about a Python change which impacts examples in Programming Python too—see this note for details on Python 3.2's decision to drop support of str strings for the "s" type code in struct.pack.

This impacts a variety of examples in this book. The simplest fix is to manually encode str Unicode strings to bytes byte strings when passing to struct.pack, per the referenced note. You can also run these examples in 3.1 or earlier if that's an option, though newer Pythons are generally better Pythons.
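The manual-encoding fix can be sketched as follows. The format string and values are illustrative only, not taken from a specific book example.

```python
import struct

name = 'spam'

# In 3.2 and later, passing a str for the "s" type code is rejected
try:
    struct.pack('>i4s', 7, name)
    str_accepted = True
except struct.error:
    str_accepted = False

# The fix: encode the str to bytes before packing, and decode after unpacking
packed = struct.pack('>i4s', 7, name.encode('ascii'))
number, raw = struct.unpack('>i4s', packed)
text = raw.decode('ascii')
```

Choose the encoding to suit your data. ASCII is assumed here only because the sample string contains nothing else.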

[Back to Index]


Python 3.3, and Its Impacts on Book Examples

Short story: this section summarizes changes in Python 3.3, and relates them to the book's content. As usual, code that aims to illustrate the bleeding edge is generally among its first casualties.

An Overview of Python 3.3

[Oct-2012] I've started testing the book's examples under Python 3.3, the latest release which features:

Plus more changes I won't list or repeat here. See my earlier 3.3 preview, on the Learning Python update pages, as well as the official 3.3 What's New document at python.org for additional details on 3.3 changes.

More on the New Windows Launcher: Off Page

Of these, the last 3.3 enhancement listed above will probably have the broadest impact (in fact, it affects every Python 3.3+ user on Windows, which, for better or worse, is a huge audience), and merits a few more words. Those words have grown too large for this page, however, so I've moved them to this separate article:

The New Windows Launcher in Python 3.3

The very short story on the launcher is that it registers new executables which are installed on your system path normally; attempts to parse #! Unix-style lines at the top of scripts to determine which version of Python to run; and supports command-line arguments that give Python version numbers. The net effect is to better support multiple Pythons coexisting on the same machine, by allowing Python version numbers to be specified on both a per-file and per-command-line basis, and in both full and partial form. It's quite a useful trick, though not without the pitfalls described below. For much more on the launcher in general, see the link above, or the new appendix on the subject in 2013's Learning Python, 5th Edition.

Python 3.3's Impacts on Book Examples

PP4E's book examples were initially developed on 3.1, but tested successfully on 3.2 alpha before publication. In general, most examples tested so far appear to work well on 3.3 and as shown in the book. As expected, though, the evolution in the 3.X line has impacted some behaviors. Among the most notable 3.3 changes that affect book examples:

More here on 3.3 in general as testing continues. You can also read about earlier Python 3.2 changes here, and later 3.4+ changes here.

[Back to Index]


Running PIL Examples on Python 3.2 and Later: Pillow

Short story: PIL became the largely compatible Pillow, which is still actively supported, and freely available at the standard PyPI website. Fetch and install the Pillow drop-in replacement to run the book's PIL examples, per this section's updates.

The Issue

[Apr-21-12] This book uses the PIL (Python Imaging Library) extension for some image-based examples, both to render thumbnail images, and to display additional image file types in tkinter GUIs. Because PIL was not yet ported to 3.X, the book employed a custom installer provided by PIL's creator, and included this installer in its examples package as a temporary measure pending an official 3.X port.

A reader wrote recently to note that the PIL installer in the book's examples package works only under Python 3.1, and not for 3.2. I don't track PIL's progress, but it has much more utility than the book leverages, and I suspect that this has held up the 3.X port (naturally, this is a non-issue for 2.X readers, for whom PIL installers are available). Since this is a general issue which other readers have asked about too, the relevant portion of my reply follows:

About a PIL installer for 3.2: an official 3.X PIL port has 
yet to materialize; it was considered imminent two years ago.
The stop-gap installer I was given by PIL's creator and shipped
in the book examples package is an executable for 3.1 only, 
which I unfortunately have no way to update.

I recommend contacting PIL's creator, F. Lundh, about
this, and/or browsing the archives of and posting your query 
to PIL's email list to see what may be possible today.  

PIL's creator's last known email address (two years ago):

    (please search pythonware.com)

and the image-sig email list for PIL lives here:
 
    http://mail.python.org/mailman/listinfo/image-sig

If you get a resolution on this and can spare the time, I'd
appreciate a copy on what you find; other readers have run 
into the same issue.  If I'm able to uncover anything myself,
I'll follow-up. 

In the worst case, you can always install 3.1 alongside 3.2 
to experiment with PIL examples, or take the examples' code 
as demonstrative if not runnable. 

Update, May 2012 A web search turns up unofficial PIL installers for Python 3.2 and 3.3, including those at this site, though I have yet to test their operation with book examples.

Update, July 2012 It's now been verified that the "unofficial" PIL ports for Python 3.X described in the prior update do work correctly, at least on Windows under Python 3.2 and 3.3 and for the PIL subset used by the book's examples—tkinter image display, thumbnail generation, and resize operations. Specifically:

Presumably, the 32-bit installers and those for other versions at this site also work as expected. Thanks to both those who took time to port and post PIL, as well as readers who asked me about this issue.

Update, May 2013 It looks like the PIL ports at the site described above are now an official fork named Pillow, and are now also available at the PyPI site. Despite the name and location changes, this package is still imported as module PIL, and is fully compatible with PIL for the book's examples, and others. I've also used it successfully to extract EXIF metadata tags from photos in this script (tagpix.py).

[Back to Index]


More on pickle Module Constraints: Bound Methods (LP4E)

[Oct-15-11] I posted a note about pickling and bound methods on the clarifications page of the book Learning Python 4th Edition which provides some additional background on how and why bound methods cannot be pickled.

Since that note also pertains to the coverage of pickling in this book—in Chapter 1's quick tour, Chapter 5's multiprocessing module section, and Chapter 17's in-depth database material—I'm posting a cross reference to it here too: read this related note here.

[Back to Index]


More on Printing Non-ASCII Filenames: PYTHONIOENCODING

Short story: Be sure to set your PYTHONIOENCODING environment variable to support Unicode output on platforms that require this (and in Python scripts that ignore this).

The Issue

[Oct-28-11] On pages 279-282, the book discusses in substantial depth how print calls can fail for some Unicode filenames in Python 3.X, and uses a try statement to catch such failures in the tree walker script on page 276. As described in the book, scripts should generally print filenames with care in 3.X if those names might ever be non-ASCII, by either catching print call exceptions or changing the PYTHONIOENCODING environment variable to allow for specific Unicode encodings in the standard streams. For simplicity, though, some of the scripts in the book do not take this advice: they simply print filenames blindly, and assume that you understand the issue given the general description starting at page 279, and will configure your environment if/as needed.

Filename Print Failures

As an illustrative example, I recently noticed a print failure in the diffall.py directory comparison script on pages 311-313 after I added a few files with non-ASCII names to the examples tree (tests for PyMailGUI enhancements described elsewhere on this page). As is and without environment configurations, the book's version of the script dies with an exception on Windows when printing non-ASCII filenames, even if the output is redirected to a file (which sometimes avoids such errors):

c:\...\PP4E\System\Filetools>diffall.py e: C:\SD-card-xfer-oct2711 > temp
Traceback (most recent call last):
  File "C:\...\PP4E\System\Filetools\diffall.py", line 80, in <module>
    comparetrees(dir1, dir2, diffs, True)      # changes diffs in-place
  File "C:\...\PP4E\System\Filetools\diffall.py", line 69, in comparetrees
    comparetrees(path1, path2, diffs, verbose)
  ...
  File "C:\...\PP4E\System\Filetools\diffall.py", line 56, in comparetrees
    if verbose: print(name, 'matches')
  File "C:\Python31\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode characters in position 0-6: character maps to <undefined>

The end of the temp output file reflects the location of the failure:

--------------------
Comparing e:Books\4E\PP4E\examples-official\1.3\unpacked\PP4E-Examples-1.3\changes\detailed-diffs\1.3\patched-files-13\i18n-filenames-tests to [...]
Directory lists are identical
Comparing contents
Mail-saved-after-sent--OpenMeInGUI.txt matches

Fix 1: Setting PYTHONIOENCODING

Now, as suggested in the book's footnote on page 282, this script works as is without errors if you simply set your PYTHONIOENCODING environment variable to UTF-8 Unicode encoding for the standard streams. No code changes are needed, and the script winds up printing Unicode text to the output file. On Windows (where you can also set this once and for all via the System icon in your Control Panel):

c:\...\PP4E\System\Filetools>set PYTHONIOENCODING=utf-8
c:\...\PP4E\System\Filetools>diffall.py e: C:\SD-card-xfer-oct2711 > temp
c:\...\PP4E\System\Filetools>notepad temp

Here is part of the temp output file, at the place where the prints failed before the environment setting:

--------------------
Comparing e:Books\4E\PP4E\examples-official\1.3\unpacked\PP4E-Examples-1.3\changes\detailed-diffs\1.3\patched-files-13\i18n-filenames-tests to [...]
Directory lists are identical
Comparing contents
Mail-saved-after-sent--OpenMeInGUI.txt matches
Поворот IMG_1412.txt matches
从~技~术~走~向~管~理.xls matches
金牌销售2天一夜实战训练.xls matches
--------------------
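
To see why the setting matters, the following minimal sketch mimics what print does when a stream's encoding cannot represent a filename. It is an illustration only, not code from the book: the try_write helper is a hypothetical name, and the in-memory stream stands in for a redirected standard output stream.

```python
import io

def try_write(text, encoding):
    """
    Write text to an in-memory text stream that encodes the way a
    redirected stdout would; return True on success, False if the
    text cannot be encoded per the stream's encoding.
    """
    stream = io.TextIOWrapper(io.BytesIO(), encoding=encoding)
    try:
        stream.write(text + '\n')
        stream.flush()
    except UnicodeEncodeError:
        return False
    return True

name = 'Поворот IMG_1412.txt'
print(try_write(name, 'cp1252'))    # False: Cyrillic is not in cp1252
print(try_write(name, 'utf-8'))     # True: UTF-8 can encode any filename text
```

Setting PYTHONIOENCODING to utf-8 has the effect of moving the real standard streams from the first case to the second.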

Fix 2: Catching Encoding Exceptions

Alternatively, it might be a bit more robust and convenient in some scenarios to catch the exception in the script itself and print as raw bytes, instead of printing Unicode text that must be encodable per the print setting. A modified version of this script that does this and works without environment setting changes is available here: diffall-SAFE.py. Here are the parts added and modified in diffall.py:

def tryprint(*args):
    """
    Added Oct-27-11, post publication and post examples 1.3:
    Don't fail with an exception for unprintable filenames;
    See pages 279-282, and the similar tryprint on page 276;
    Started failing for non-ASCII filenames in email test dirs;
    In general, any filename printers in 3.X may require this,
    unless PYTHONIOENCODING is set as needed (e.g., to utf-8);
    """
    try:
        print(*args)                 # filenames might fail to encode
    except UnicodeEncodeError:
        print('--UNPRINTABLE FILE NAME--', *(arg.encode() for arg in args))

def comparetrees(dir1, dir2, diffs, verbose=False):
    ...
                if (not bytes1) and (not bytes2):
                    if verbose: tryprint(name, 'matches')
                    ...
                if bytes1 != bytes2:
                    diffs.append('files differ at %s - %s' % (path1, path2))
                    tryprint(name, 'DIFFERS')
    ...
    for name in missed:
        diffs.append('files missed at %s - %s: %s' % (dir1, dir2, name))
        tryprint(name, 'DIFFERS')

if __name__ == '__main__':
...
        for diff in diffs: tryprint('-', diff)

This changed script still makes some assumptions (it must be able to encode the filename with the default encoding of str.encode, which is UTF-8 in Python 3.X) and may need further honing on some platforms. When run, though, it avoids print failures explicitly; here's what it does with non-ASCII filenames in the printed output in the temp file:

--------------------
Comparing e:Books\4E\PP4E\examples-official\1.3\unpacked\PP4E-Examples-1.3\changes\detailed-diffs\1.3\patched-files-13\i18n-filenames-tests to [...]
Directory lists are identical
Comparing contents
Mail-saved-after-sent--OpenMeInGUI.txt matches
--UNPRINTABLE FILE NAME-- b'\xd0\x9f\xd0\xbe\xd0\xb2\xd0\xbe\xd1\x80\xd0\xbe\xd1\x82 IMG_1412.txt' b'matches'
--UNPRINTABLE FILE NAME-- b'\xe4\xbb\x8e~\xe6\x8a\x80~\xe6\x9c\xaf~\xe8\xb5\xb0~\xe5\x90\x91~\xe7\xae\xa1~\xe7\x90\x86.xls' b'matches'
--UNPRINTABLE FILE NAME-- b'\xe9\x87\x91\xe7\x89\x8c\xe9\x94\x80\xe5\x94\xae2\xe5\xa4\xa9\xe4\xb8\x80\xe5\xa4\x9c\xe5\xae\x9e[...]\x83.xls' b'matches'
--------------------

This works, but in many cases it might be simpler and will require much less code to set your PYTHONIOENCODING as needed, rather than trying to safeguard all your filename prints with try statements on the off chance that they may someday fail. Even in the diffall.py case, other modules that this script uses could potentially fail on filename prints too, and may require additional error trapping code unless we use the simpler and broader PYTHONIOENCODING scheme.

I'm not going to mark this for changing in reprints or new example releases, partly because this is a general issue which is already well documented in the book; partly because this is not the only book example that takes a loose approach to printing filenames; and mostly because such scripts will work without changes if you set your PYTHONIOENCODING as needed. In other words, one can make a very strong case that this is more an operational issue than a program bug, and so does not merit a code change. As usual, if your scripts fail on filename prints in Python 3.X, fix as prescribed.

Update: New Stream Defaults, ascii(), Mac OS

[2017] As a postscript, the most recent Pythons have changed the default output stream encodings to use UTF-8 on Windows, so you may not need to set your PYTHONIOENCODING if you're using one of the latest releases. See Python's What's New documents for details.

Additionally, the built-in ascii(x) function can be used to escape non-ASCII text for printing to streams if needed—an alternative to the to-bytes encode() used above, and utilized on some platforms by the later Mergeall system, which was the descendant of the book's diffall and cpall scripts. See help(ascii) for details.
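
As a rough sketch of the ascii() alternative, the fallback printer below escapes arguments instead of encoding them to bytes. The function name tryprint_ascii and its marker text are assumptions made for illustration; this is not the code used in diffall-SAFE.py or Mergeall.

```python
def tryprint_ascii(*args):
    """
    Print args normally if the stream can encode them; otherwise
    print ASCII-only escaped forms, which any stream encoding that
    is a superset of ASCII can handle.
    """
    try:
        print(*args)
    except UnicodeEncodeError:
        print('--ESCAPED FILE NAME--', *(ascii(arg) for arg in args))

# ascii() works like repr(), but escapes non-ASCII characters
escaped = ascii('Поворот IMG_1412.txt')
print(escaped)     # '\u041f\u043e\u0432\u043e\u0440\u043e\u0442 IMG_1412.txt'
```

Unlike the bytes display, the escaped text keeps its str form, with one \uNNNN escape per non-ASCII character.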

It's also worth noting that this issue seems most grievous on Windows. Mac OS, for example, happily and naturally supports Unicode output in its Terminal app, no environment changes required. But the world keeps using Windows anyhow...

[Back to Index]


Bonus Example: Website Admin, with HTML and URL Parsing, FTP

cleansite is a book-related example originally posted here in 2010 and updated in 2014. It recursively walks a website's links to locate unused files, by parsing the HTML files accessible from one or more root pages, and noting the files of any type they reference. Its goal is to help locate unused files that may be removed, while leveraging multiple topics covered in the book.
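
The core idea can be sketched with the standard library's HTML and URL parsers. The code below is a simplified illustration under stated assumptions, not the cleansite script itself: the names LinkCollector, local_files, and unused_files are hypothetical, pages are passed in as strings, and only href and src attributes are examined.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse, unquote
import posixpath

class LinkCollector(HTMLParser):
    """Collect the values of all href and src attributes in a page."""
    def __init__(self):
        super().__init__()
        self.links = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ('href', 'src') and value:
                self.links.add(value)

def local_files(html):
    """Return the base names of same-site files referenced by a page."""
    parser = LinkCollector()
    parser.feed(html)
    names = set()
    for link in parser.links:
        parts = urlparse(link)
        if parts.scheme or parts.netloc:       # skip remote and mailto: links
            continue
        if parts.path:                          # skip pure #fragment links
            names.add(posixpath.basename(unquote(parts.path)))
    return names

def unused_files(all_files, pages):
    """Return the files in all_files that no page references."""
    used = set()
    for html in pages:
        used |= local_files(html)
    return set(all_files) - used
```

For example, unused_files(['a.html', 'old.gif'], ['<a href="a.html">A</a>']) returns {'old.gif'}. A real tool must also follow links from page to page, which this sketch omits.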

Latest Version: Broader Usage Roles

[Feb-19-14] The latest version of this script, 1.1, is available in this zipfile: cleansite11.zip. The updated script, cleansite.py, now handles parsing of non-ASCII Unicode HTML files better; catches unused local files that have the same name as a remote site's version named in a link; and is a bit more coherent on parameter selection.

The zipfile also has new example run logs, and updated versions of the book's downloadflat_modular.py and uploadflat_modular.py FTP utility examples, updated to skip local and remote directories (now present in their author's sites). Typical usage: run the download script to copy the site's top level; run cleansite to move likely unused files to a subdirectory; manually inspect and adjust as needed (e.g., restore favicon.ico); and run the upload script to clear the server site and upload the used files.

These scripts were used to purge old, unreferenced files from my websites—some now nearly two decades old—by adding pages to ignore until old material was discarded as desired. For the target site, cleansite caught 115 unused files among 317 total, and shaved 3G off its 10G total size. You may need to augment this with multiple-file searches (see the Grep tool in the book's PyEdit GUI), and this is still a bit user-specific, but serves its purpose for my use cases.

Original Version and Description

[Dec-28-10] As a sort of reward for stumbling onto this page, I've uploaded an extra example script here which would have appeared with the book's HTML parsing coverage in Chapter 19, had this book project enjoyed unlimited time and space. This script uses Python's HTML and URL parsing tools to try to isolate all the unused files in a website's directory.

I use this script for my training website, as well as my book support site (the latter after fixing some HTML errors that rendered the script inaccurate when Python's strict HTML parser failed and caused some used files to be missed). This script also includes code to delete the unused files from a remote site by FTP if you wish to enable it (pending a resolution on the parser failures issue), and includes suggestions for parsing with pattern matching instead of the HTML parser.

Download: The script itself lives here: cleansite.py. To see what it does, read its docstring, and see two sample runs provided in a zip file here: testruns.zip. See also the newer, enhanced version of this script at the start of this section.

[Back to Index]


Bonus Example: Book lottery, with POP, SMTP, email, and CGI

pylotto is a book-related script used to give away books to randomly selected students in some of the live classes I teach (sorry, its lotteries are not open to the web at large!). It ties together a number of the book's topics and techniques, and has been posted here in both original and updated forms.

Latest Version: More Usage Modes

[Feb-22-11] In light of security constraints in a recent class, I've completely rewritten the pylotto script for worst-case scenarios. It can now select from both emails, and a names file created manually or via web form submits, and can be run in both console and remote CGI modes. In pathological cases, it can be run locally to select from a local names file—the need for which was underscored by a recent class somewhere in the wilds of California. I've also updated to port the single script to work on both Python 3.X and 2.X, and to properly escape student names in the reply HTML.

Here's the new version's code—use the "view source" option to view the form's HTML:

Read the script's docstring for all the details. Among other things, it demonstrates how to implement simple state retention in CGI scripts, using flat files and locks for possibly concurrent updates to the players file. You can also view the sign-up instructions files: lotto-howto-email.txt and lotto-howto-web.txt, as well as a last-resort script which became a class exercise and gave rise to (and is subsumed by) this new generalized Pylotto: simple-pylotto.py.

The rest of this section describes the original, now defunct version, but also gives the back story. The new version has the same goals, but supports web form and local file sign-up modes in addition to emails.

Original Version and Description

[Jan-11-11] Here's another supplemental example that might have appeared in the book, if not for time and its conceptual dependencies. I wrote this script, pylotto, in order to give away free books in some of the classes I teach. O'Reilly always sends a batch of free copies to authors, and if I kept a dozen copies of every one of the 12 Python books I've written, some of which are not exactly small, I'd probably need a bigger house.

To enter the book lottery, students send an email message to the book's account, with "PYLOTTO" in the subject line. At the end of the class, the script scans, parses, and deletes these emails, and selects a set of their "From" addresses at random as winners. It's not Vegas, but it's fair, and serves as a nice example of practical Python programming that ties together a number of tools presented in the book. This script also has a test mode that sends test emails, and an as-CGI mode for running on a Web server if the training site doesn't allow POP email access or SSH (many don't).
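
The selection step might be sketched as follows. This is a hypothetical reduction for illustration, not the pylotto code: it assumes the messages have already been fetched as text, and the pick_winners name and its parameters are invented here.

```python
import random
from email import message_from_string
from email.utils import parseaddr

def pick_winners(raw_messages, count, rng=random):
    """
    Select up to count distinct "From" addresses at random from the
    messages whose Subject line contains PYLOTTO.
    """
    entrants = set()
    for raw in raw_messages:
        msg = message_from_string(raw)
        if 'PYLOTTO' in msg.get('Subject', '').upper():
            address = parseaddr(msg.get('From', ''))[1].lower()
            if address:
                entrants.add(address)          # one entry per address
    return rng.sample(sorted(entrants), min(count, len(entrants)))

mails = ['From: Ann <ann@example.com>\nSubject: PYLOTTO\n\nme!\n',
         'From: bob@example.com\nSubject: pylotto entry\n\nme too\n',
         'From: spam@example.com\nSubject: buy now\n\n...\n']
print(pick_winners(mails, 1))
```

Using a set makes duplicate entries from the same address count only once, which keeps the drawing fair.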

Download: Fetch the script here: pylotto-orig.py. To see what it does, read its docstring, and see the text file that traces its outputs in its various modes here: pylotto-orig-run.txt.

New: See also pylotto-orig-24.py, a version modified to run remotely as a CGI script on a Python 2.4 web server (that's the latest Python available on godaddy.com, as of January 2011!).

Naturally, I don't give away books in every class I teach (this basically depends on how many freebies O'Reilly has given to me, and how willing I am to lug around a big, giant book in my checked luggage). Even so, scripts such as this one and others in the book which address real, practical needs can go far to help illustrate Python applications in action once students or readers have mastered Python language fundamentals. As stated in the book, Python tends to become a sort of enabling technology for most people who've learned to use it well.

[Back to Index]


Bonus Example: Extracting Music Files from an iTunes Folder Tree

flatten-itunes is a book-related example that copies all music files in a folder tree to a single, flat folder for easier access and transfer, and puts some of the book's topics to work on a realistic task. It's been released here in both initial and revised forms.

For reference, this section references the following code:

Latest Version: Resolving Disparate Collections

[Nov-5-11] I've written a new and very different version of this script to be used to isolate differences in iTunes collections on different laptops or archives, rather than just flatten directories and detect protected files. Grab the new 2.X/3.X version here: flatten-itunes-2.py. Though it targets iTunes trees, it can be used to normalize (i.e., flatten) any folder tree containing music files (e.g., unzipped Amazon MP3 downloads).

I wrote this version to help resolve differences between iTunes collections on multiple machines that have fallen out of sync over time. It's easy to get in this state if you buy songs on whatever laptop you have at the moment and don't synchronize religiously (cloud storage addresses this in theory, though not without potential downsides of its own). Because iTunes may accumulate different files and directory structures on different machines, it's nearly impossible to synchronize unless you normalize its files into a simpler, uniform structure.

This new version addresses this by sorting all music files in an entire directory tree into 4 flat directories (playable, protected, irrelevant, and other), for each collection it is run against. This makes it simpler both to run later comparisons to spot differences (e.g., using the book's dirdiff.py or more in-depth diffall.py scripts of Chapter 6), and to copy merged collections from device to device. This version also retains all files in the tree; renames duplicates with a numeric suffix; produces a richer report; and was updated to run on both Python 2.X and 3.X. It's safe to run against an iTunes location just to experiment, because it only copies files from there, and does not modify the iTunes tree itself in any way.
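
The copy-and-rename idea can be sketched as below. This is an illustrative reduction, not flatten-itunes-2.py: the function name flatten_tree, the extension set, and the "--N" suffix format are all assumptions, and it handles a single category only. It also assumes the destination folder lies outside the source tree.

```python
import os
import shutil

MUSIC_EXTS = {'.mp3', '.m4a'}       # assumed set of playable file types

def flatten_tree(src_root, dest_dir, exts=MUSIC_EXTS):
    """
    Copy every music file below src_root into the single flat folder
    dest_dir, adding a numeric suffix to duplicate names; the source
    tree is only read, never changed.
    """
    os.makedirs(dest_dir, exist_ok=True)
    copied = []
    for dirpath, dirnames, filenames in os.walk(src_root):
        for name in filenames:
            base, ext = os.path.splitext(name)
            if ext.lower() not in exts:
                continue
            target = os.path.join(dest_dir, name)
            count = 1
            while os.path.exists(target):       # same name seen earlier
                target = os.path.join(dest_dir, '%s--%d%s' % (base, count, ext))
                count += 1
            shutil.copy2(os.path.join(dirpath, name), target)
            copied.append(target)
    return copied
```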

After collecting files into flat directories with this script, I later merge the playable files directories with manual drag-and-drop operations, and also run a simpler script, rename-itunes.py, on the result to strip the leading track/disk numbers at the front of some filenames, making it easier to compare and sort, and isolate more duplicates (e.g., "02 xxx.mp3" and "xxx.mp3"). The end result is a single, flat directory of song files to use on multiple devices.
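
The prefix-stripping step could look something like this sketch. The pattern is an assumption about how such prefixes are formed (an optional disk number and dash, a track number, then whitespace), and is not taken from rename-itunes.py.

```python
import re

# optional "disk-" part, then track digits, then whitespace, at the front
TRACK_PREFIX = re.compile(r'^(\d+-)?\d+\s+')

def strip_track_number(filename):
    """Return filename without a leading track or disk-track number."""
    return TRACK_PREFIX.sub('', filename, count=1)

print(strip_track_number('02 xxx.mp3'))      # xxx.mp3
print(strip_track_number('1-02 xxx.mp3'))    # xxx.mp3
print(strip_track_number('xxx.mp3'))         # xxx.mp3
```

A pattern this simple also strips numbers that are really part of a title, so results are best reviewed before any files are renamed.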

Note: this is only an iTunes (or other music library) utility, for analyzing your collections' content or moving your music files. It's not a player or iTunes replacement, and although its result directories retain all iTunes music files, they may not retain all iTunes information. For example, due to the way the script flattens music directory trees, its result directories may lose some associations between music files and their album artwork images, at least for images not embedded in music files themselves (e.g., via ID3v2 frames for MP3 files). Writing a full replacement for iTunes in Python (PyTunes?) would be an interesting project, and others have already made progress along these lines—check out:

and lots more in the results of any web search for "Python MP3 player," "Python iTunes," or the like. It's a rich domain, but I've already spent more time on this script than I meant to.

Original Version and Description

[Feb-3-11] Here's something similarly practical, but a bit simpler than the prior two sections' programs: a Python script which walks all the folders and subfolders in an iTunes directory tree, to copy all music files in the tree to a single flat directory. I use this to create a single directory of all my music on a memory stick, so it can be used conveniently on the hard drive in a vehicle I drive. iTunes seems fond of nested directories, and the player in the vehicle in question doesn't do well in their presence. This example might have appeared in the larger systems examples chapter, if not for time, space, and the fact that it's not too much different from tree-walkers already in that chapter.

Download: Fetch the script here: flatten-itunes.py. To see what it does, read its code and docstring, and see the text file that traces its outputs here: flatten-itunes.out.txt.

Related Script: tagpix

For another directory-walker media tool that similarly organizes media content spread across folder trees, see also tagpix, which extracts EXIF metadata tags from photos with PIL, and is referenced elsewhere on this page.

[Back to Index]


Changes in Python 3.2

Short story: this section deals with prominent changes in Python 3.2, with a focus on email-processing tools that impact the book's examples.

An Overview of Python 3.2

[Dec-31-10] As described in its Preface, this book was written under Python 3.1, and its major examples were retested and verified to work under Python 3.2 alpha just before publication. Because of that, this book is technically based on both 3.1 and 3.2, though it addresses the entire 3.X line in general.

That said, you will find some discussion of 3.1 library issues in the book that have changed or improved in the upcoming 3.2 version, which is due to be released roughly two months after this book's release date (3.2 final is currently scheduled for mid-February 2011). Some of the issues in 3.1's email package which the book must work around, for instance, have been improved or repaired in 3.2.

In fact, many or most of the issues of the 3.1 email package described in Chapter 13 are fixed in 3.2. The email workarounds coded in that chapter still work under 3.2 (and were verified and even enhanced to do so before publication), but some are no longer required with 3.2. Notably, the email package in 3.2 now supports parsing the raw bytes returned by the POP module (poplib), thereby eliminating the need for the partially heuristic and potentially error prone pre-parse decoding to str that the book's 3.1-based examples must perform. The next section explains how this works in 3.2.

email and bytes in 3.2: the Surrogates Replacement Trick

As a prominent example of email's improvements, 3.2's What's New document states that the 3.2 email package's "new functions message_from_bytes() and message_from_binary_file(), and new classes BytesFeedParser and BytesParser allow binary message data to be parsed into model objects." Interestingly, the 3.2 email parser still does not parse bytes internally. Instead, these extensions work their magic by decoding raw binary bytes data to Unicode str text prior to parsing, using the ASCII encoding and passing string "surrogateescape" for the decoding call's errors flag.
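
As a brief illustration of the new bytes interface, the following sketch parses a raw message without any prior decoding to str. The message content is made up for the example, and parse_raw is a hypothetical wrapper name; the body deliberately contains one byte that is not ASCII.

```python
import email

def parse_raw(raw_bytes):
    """Parse the raw bytes of a mail message (3.2 and later)."""
    return email.message_from_bytes(raw_bytes)

raw = b'From: ann@example.com\nSubject: test\n\ncaf\xe9\n'
msg = parse_raw(raw)

print(msg['Subject'])                  # test
print(msg.get_payload(decode=True))    # b'caf\xe9\n': original bytes recovered
```

Here get_payload(decode=True) hands back the body's original bytes, leaving the choice of Unicode encoding to the caller.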

In short, the surrogateescape error-replacement scheme translates undecodable bytes to Unicode codepoint escape sequences, which allow the bytes' original values to be recovered when the text is encoded back again to bytes by compatible software. When parsed message parts are later fetched through the Message API, re-encoding back to binary form with the same errors replacement scheme is expected to restore the original data. At least potentially, this arguably clever trick could resolve the initial decode-to-str issue for parsing email messages in Chapter 13.
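
The round trip can be shown in isolation with plain str and bytes methods. This is a minimal sketch of the mechanism only, with hypothetical helper names; it does not reproduce the email package's internal code.

```python
def decode_for_parsing(raw_bytes):
    """Decode bytes as ASCII, mapping undecodable bytes to surrogates."""
    return raw_bytes.decode('ascii', errors='surrogateescape')

def encode_after_parsing(text):
    """Invert decode_for_parsing, restoring the original bytes."""
    return text.encode('ascii', errors='surrogateescape')

raw = b'Subject: caf\xe9\n\nbody \xff\xfe\n'
text = decode_for_parsing(raw)

print(ascii(text))                          # byte 0xE9 appears as '\udce9'
print(encode_after_parsing(text) == raw)    # True: nothing was lost

try:
    text.encode('utf-8')                    # software unaware of the scheme
except UnicodeEncodeError as exc:
    print('strict encode fails:', exc.reason)
```

The last step shows the downside noted in this section: text carrying these escapes fails in code that does not use the same error handler.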

On the downside, because this scheme assumes that message data is both decoded to Unicode text and re-encoded to bytes later using the surrogateescape error handler for both steps, this trick works for data passed through Python's APIs which follow this translation protocol, but can fail for data which is not. Moreover, this scheme also assumes that any data mangled by the surrogates-replacement step is not significant to the parser's analysis, as it might not match expected characters in the stream—a non-issue for binary data or encoded text parts, but potentially significant for some forms of full-message raw text (though non-ASCII bytes are unlikely to mean much to a message parser in any form).

Also note that while this change may be a first step towards addressing the related CGI uploads issue described in Chapters 15 and 16, that issue still exists in Python 3.2. As described in the book, CGI uploads are somewhat broken in 3.X today because Python's CGI module uses the email parser, but its uploaded data can be arbitrary combinations of both binary data and text of a variety of Unicode encodings, with or without MIME encodings and content type headers. Such data cannot be decoded to str in 3.1 as required by its email parser. Unfortunately, the CGI module in Python 3.2 still uses the str-based email parsing API, not the new bytes-based API, so this CGI uploads limitation appears to still be present in 3.2. I verified that this is the case in 3.2 final: cgi does not use email's new bytes parser interface, but still performs a pre-parse decoding from bytes to str per UTF-8, which may fail for some data streams. A resolution to this appears to await a future Python.

For email, though, 3.2's library fixes represent a significant improvement over 3.1: the decode-to-str preparse issue for email, as well as other Chapter 13 email package workarounds, may have been rendered superfluous in 3.2. On the other hand, the book's 3.1 workarounds code is harmless under 3.2, and is representative of the sorts of dilemmas faced by real-world development in general—a major theme of this book. Unless you're lucky enough to use the same version of software for all time, change is probably an inevitable part of your job description too.

Other 3.2 Changes

For more on the Python 3.2 release, including its new __pycache__ subdirectory bytecode storage model, please see its note on this site in the Learning Python 4E updates page (a book less impacted by 3.2, since 3.2 was supposed to change only libraries, not core language—and nearly succeeded).

Not covered in that note is the very late 3.2 addition of its concurrent.futures library. This library, based upon a Java package, provides yet another way to generalize the notion of multitasking with threads and processes, in addition to the existing threading and multiprocessing modules which are covered in this book. This new library is also a bit of a work in progress, intended for future expansion. For more details, see 3.2 release details and manuals. (Python 3.X's later evolution continued to explore parallel programming but focused on asynchronous coroutines, which many viewed as narrow, complex, and controversial; see this essay.)
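
For a quick taste of the library's interface, here is a minimal thread-pool sketch. The square function and worker count are arbitrary choices for illustration; ProcessPoolExecutor offers the same interface for process-based pools.

```python
from concurrent.futures import ThreadPoolExecutor

def square(n):
    """Stand-in for a task worth running in parallel."""
    return n * n

# the pool runs the calls in worker threads; map keeps argument order
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(square, range(5)))

print(results)      # [0, 1, 4, 9, 16]
```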

While you're at the Learning Python site, see also its preview of mid-2012's expected Python 3.3.

Update, Mar-2011 After additional research, it's now clear that the 3.2 email changes do indeed bear specifically on the Chapter 13 discussion of Unicode and email that starts on page 926. Most notably, the preparse decode from raw bytes to Unicode str needed in 3.1 is no longer required (but is harmless) in 3.2, because the email package can now parse bytes data directly, using the errors-replacement scheme described above. I'd mark this as an insert for future reprints, but the book can't possibly track all future Python changes as easily as this page (especially with a new and possibly incompatible email package under development!).

Update, 2018 Be sure to also see the newer Python Changes 2014+ page for later Python changes not mentioned here. That page covers Pythons 3.4+, but its preamble also has links for Python changes that cropped up between PP4E's publication and 2014 (post LP5E), including the later 3.3 coverage here.

[Back to Index]


Adjusting to Local Time in PyMailGUI List Windows (Patched in 1.3)

Short story: the original version of PyMailGUI in the book didn't scale email times to local (display) time, but this was added to the version in the book's examples package early on using coding techniques described here.

The Issue

[Oct-1-11] As originally coded, message-sent time is not displayed very usefully in PyMailGUI's list windows. The GUI blindly displays the full, raw text of the Date header field. Worse, the time portion of this header is truncated by the display such that the "+NNNN" field which denotes what the sent time really is relative to GMT is not shown. The net effect is that you can't tell in the GUI when a message was sent or which messages were sent before others without looking at the raw text and deciphering the Date time string manually. The GUI lists emails in order received at the POP mail server only.

The Fix

To fix, the time in the Date field of list windows should be shown in full, and be shown relative to either the local time zone or GMT uniformly for all emails received. See the Python email package for pointers and possible tools; this doesn't seem crucial enough to detail the code fix here. As a hint, though, formatting dates for use in new mails can be either relative to GMT or the local time zone:

>>> from email.utils import formatdate      # in Python 3.2

>>> formatdate()
'Thu, 29 Sep 2011 17:27:53 -0000'           # relative to gmt

>>> formatdate(localtime=True)
'Thu, 29 Sep 2011 13:27:55 -0400'           # relative to local (us eastern, -4 hours) 

>>> formatdate(usegmt=True)
'Thu, 29 Sep 2011 17:27:59 GMT'             # explicit gmt relative for http

Applying the corresponding technique for adjusting a received date/time string to the local time zone in PyMailGUI's code is officially delegated to a suggested exercise, but the following might help. To convert a GMT-based date/time string to a date/time string in the local US Eastern time zone, try this (it's 5:45 PM GMT and 1:45 PM locally):

>>> from email.utils import formatdate
>>> from email._parseaddr import parsedate_tz, mktime_tz

>>> now = formatdate()                               # gmt-based => local (eastern)
>>> now                                         
'Thu, 29 Sep 2011 17:45:26 -0000'
>>> parsedate_tz(now)                                # time string => time tuple
(2011, 9, 29, 17, 45, 26, 0, 1, -1, 0)
>>> mktime_tz(parsedate_tz(now))                     # time tuple => to utc timestamp
1317318326.0

>>> formatdate(mktime_tz(parsedate_tz(now)))         # utc timestamp => time string
'Thu, 29 Sep 2011 17:45:26 -0000'
>>> formatdate(mktime_tz(parsedate_tz(now)), localtime=True)
'Thu, 29 Sep 2011 13:45:26 -0400'

Using this scheme to convert from local to GMT, or local to local could proceed as follows:

>>> here = formatdate(localtime=True)                # eastern => gmt or local (eastern)
>>> here
'Thu, 29 Sep 2011 13:45:47 -0400'
>>> formatdate(mktime_tz(parsedate_tz(here)))
'Thu, 29 Sep 2011 17:45:47 -0000'
>>> formatdate(mktime_tz(parsedate_tz(here)), localtime=True)
'Thu, 29 Sep 2011 13:45:47 -0400'

And finally, converting a date/time string from the US Pacific time zone to either GMT or the local US Eastern time zone might be done this way (it's now 10:46 AM Pacific, 5:46 PM GMT, and 1:46 Eastern/local):

>>> there = 'Thu, 29 Sep 2011 10:46:32 -0700'        # pacific => gmt or local (eastern)
>>> formatdate(mktime_tz(parsedate_tz(there)))
'Thu, 29 Sep 2011 17:46:32 -0000'
>>> formatdate(mktime_tz(parsedate_tz(there)), localtime=True)
'Thu, 29 Sep 2011 13:46:32 -0400'
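
The interactive steps above might be packaged as helper functions along the following lines. This is a sketch, not the code shipped in PyMailGUI: the names to_gmt and to_local are hypothetical, it imports the parsing tools from the public email.utils module, and it returns unparseable text unchanged as one possible policy.

```python
from email.utils import formatdate, parsedate_tz, mktime_tz

def to_gmt(datestr):
    """Convert a Date header string in any time zone to GMT."""
    parsed = parsedate_tz(datestr)              # None if text is unparseable
    if parsed is None:
        return datestr
    return formatdate(mktime_tz(parsed))

def to_local(datestr):
    """Convert a Date header string to this machine's local time zone."""
    parsed = parsedate_tz(datestr)
    if parsed is None:
        return datestr
    return formatdate(mktime_tz(parsed), localtime=True)

there = 'Thu, 29 Sep 2011 10:46:32 -0700'
print(to_gmt(there))        # Thu, 29 Sep 2011 17:46:32 -0000
print(to_local(there))      # result depends on the local time zone
```

Because all dates are mapped to the same zone, the converted strings can be compared or sorted uniformly in a list display.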

Update, Oct-19-11 Because the patch was minor, I wound up fixing this in the examples package after all: see the release description above for more on version 1.3 of the book examples package in which this fix first appears. As usual, the fix was propagated to all later PyMailGUIs. The code edits were too large to add to the book itself, unfortunately, so fetch the examples package for more details.

[Back to Index]


Rare Delete and Save Timing Issue in PyMailGUI (Patched in 1.3)

Short story: Though unlikely, it's possible that the original PyMailGUI's delete (the GUI's Delete) may delete the wrong message if you request a new delete while one is already running. Either don't do this; read on to see how to fix this potential issue in code yourself; or get the latest book examples package or PyMailGUI release which incorporated fixes for the issue long ago.

The Issue

[Sep-29-11] PyMailGUI's initial version included an obscure timing issue related to delete operations. As coded in the book, it's not impossible that a deletion thread's exit action may be allowed to run and clear the delete-in-progress flag, before a new delete request has a chance to check this flag. This can occur apparently because the new delete's confirmation dialog popup releases control to the GUI event stream, which can then run a prior delete's exit action from an after() timer callback event. Unfortunately, the new delete issues the confirmation dialog before checking the delete-in-progress flag, and after fetching message numbers to be deleted from the GUI.

The net result is that a new delete might overlap with one in progress, and incorrectly delete the wrong POP message numbers made invalid in the GUI by the prior delete—a scenario the book and code both explicitly state must be avoided. Note that the system does go to great lengths to compare mail headers so as to ensure that each message being deleted in the GUI matches the message being deleted on the server once deletions begin, in case the server's inbox changes before a selected message is deleted (see method deleteMessagesSafely in mailFetcher.py). That doesn't help in this scenario, though, because the GUI client's mail list has been updated after selected message numbers were fetched, such that the prior selections no longer match the GUI or the server.

This behavior is timing dependent, rare, and can occur only if you issue a new delete request while one is already in progress, and then only if you're unlucky enough to have the prior request's exit action run exactly after the time you press Delete for the new request and before you're able to click "OK" in the delete confirmation popup. However, this is also a classic and even illustrative timing bug; it reflects both the lack of broader testing for a book's examples, and a misconception of the confirmation dialog's modality: the program assumed this dialog was truly modal (blocking), such that all the code from the start of a new delete callback handler through its delete-in-progress test ran atomically. This must not be the case, as I've seen a few incorrect mails deleted when running many deletes in parallel.

The Fix

To avoid this potential entirely, don't run overlapping delete requests. Better still, fix the code to avoid it in all cases by running the delete-in-progress test immediately in the delete callback handler, and before the confirmation dialog is issued. Because the delete-in-progress test logic differs between the server and file list windows, this can be done by either moving the confirmation dialog call into each subclass's code, or adding a subclass-specific okay-to-delete method called from the superclass delete callback code.

For more details, see onDeleteMail in the ListWindow class on page 1074, as well as its two subclasses' doDelete methods on pages 1079 and 1083. I may patch this in the examples package eventually, and in a book reprint if possible; for now, consider it a maintenance exercise, as well as a lesson on both the need for rigorous testing and the complexity of code that may overlap in time.
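
The ordering change described above can be sketched without any GUI code. Everything named here is hypothetical and is not the book's implementation: DeleteGuard stands in for a list window, confirm stands in for the confirmation popup, and on_delete_exit stands in for a delete thread's exit action. The point is only that the busy test runs before the dialog gets a chance to yield control to the event loop.

```python
class DeleteGuard:
    """Hypothetical stand-in for a list window's delete handling."""

    def __init__(self, confirm):
        self.confirm = confirm              # callable: True if the user clicks OK
        self.delete_in_progress = False
        self.started = []                   # message-number lists handed to "threads"

    def on_delete(self, selected):
        # Test the flag first: once past this line no other delete is running,
        # so no exit action can fire and invalidate 'selected' during the popup.
        if self.delete_in_progress:
            return 'busy'
        if not self.confirm():              # dialog may run other GUI events
            return 'cancelled'
        self.delete_in_progress = True
        self.started.append(list(selected))
        return 'started'

    def on_delete_exit(self):
        # Run when the delete finishes (a thread exit action in a real client).
        self.delete_in_progress = False

guard = DeleteGuard(confirm=lambda: True)
print(guard.on_delete([3, 4]))              # first request proceeds
print(guard.on_delete([5]))                 # overlapping request is refused up front
guard.on_delete_exit()
print(guard.on_delete([5]))                 # allowed again after the exit action
```

In a subclass-based design like the book's, the first test in on_delete would become the subclass-specific okay-to-delete method mentioned above.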

Update, Oct-19-11 Given its potential severity, I wound up patching and fixing this issue in version 1.3 of the book examples package. The fix, present in all later PyMailGUIs, prohibits delete overlaps via the code changes described above; see the release description above. There appeared to also be a similar potential for timing issues for Saves due to their file-selection dialog, which was also patched to check for blocking state before the dialog instead of after, though this seemed much less likely or harmful. Regrettably, these fixes were too large to add to the book itself.

[Back to Index]


More on Encoded Email-Attachment Filenames (Patched in 1.3)

Short story: this section describes a limitation in the original version of PyMailGUI in the book, which was later lifted in both subsequent releases of the book's examples package, and standalone PyMailGUI packages. Its resolution story told here, though, is illustrative of both Unicode issues and code maintenance at large.

The Issue

[Aug-8-11] After presenting the PyMailGUI client, Chapter 14 discusses a variety of suggested improvements to this system. Among them, its Unicode enhancements section on page 1123 mentions that the filenames of attached parts might also be in i18n encoded form in some rare cases, and require the same MIME and Unicode decoding steps that are already applied to other primary email headers such as Subject, From, and To. In the book's PyMailGUI the latter of these are properly decoded for display and encoded for sends, but attachment filenames are currently not.

Encoded attachment filenames weren't present in the test cases used to develop this version of PyMailGUI, so they received only a brief mention in the improvements list. Moreover, some minor Unicode issues were intentionally given limited attention, partly because this book's size and time constraints limit its scope, but also because this edition uses the Python 3.1 email package which has well-known Unicode issues and limitations described in the text.

Lately, however, I've noticed that encoded filenames are becoming more common. This doesn't seem like a great feature of email in general—Unicode filenames that are encoded in a way not supported by the receiving platform's filesystem won't work and would have to be renamed automatically (e.g., sending a Russian or Chinese filename may fail when saved on an ASCII-only filesystem). Because of the increasing prevalence of such emails, though, I want to elaborate on ways to address them here.

GUI Workaround

As is, the only GUI-based way to handle parts with encoded filenames in the client is to save the full enclosing email in a mail save file (list window: select, and Save); edit the mail's text in its save file to rename the attachment file name in its mail header line; and then reopen the mail from its save file (list window: Open save file, select in popup file list window, and View). This works, but obviously isn't a very user-friendly procedure.

Coding Fix

To do better, it might be simple to augment the partName method in Chapter 13's MailParser class to route the raw filename fetched from headers, to the decodeHeader header-text decoding method already present in this class. This would apply the required email, MIME, and Unicode decodings to such filenames, and yield a decoded Unicode filename string. Since the result might use a character set that doesn't work on the underlying platform's filesystem, though, this code would also need to try to encode it per the local platform's filesystem encoding type, and come up with a different name if the encode fails; it could follow the same naming pattern used for attachments that don't have a filename- or name-header present ("partNNN.xxx").

For the adventurous, the MailParser class, used by PyMailGUI and other clients, appears on page 976; its partName method on page 978; and the required decodeHeader method on page 980. The platform's filesystem encoding is described elsewhere in the book. It's too late to add this enhancement in the book, of course, but changing this would make a nice exercise in code maintenance (or see the next section, unless you want to try this on your own!).
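
The second half of the coding fix, checking a decoded name against the local filesystem encoding, might look something like the following. This is a sketch under stated assumptions: filesystem_safe_name is a hypothetical function, the three-digit form of the "partNNN.xxx" fallback is a guess at the pattern, and the fsencoding parameter exists only so the check can be tried with encodings other than this machine's. Passing the encoding test does not guarantee the platform will accept the name, since reserved characters and length limits are not checked.

```python
import sys

def filesystem_safe_name(decoded, partnum, ext='.bin', fsencoding=None):
    """Return decoded if it encodes per the filesystem encoding, else a generated name.

    Hypothetical helper: 'decoded' is an already-decoded Unicode filename.
    """
    fsencoding = fsencoding or sys.getfilesystemencoding()
    try:
        decoded.encode(fsencoding)                 # probe only; result is discarded
    except UnicodeEncodeError:
        return 'part%03d%s' % (partnum, ext)       # assumed fallback naming pattern
    return decoded

print(filesystem_safe_name('report.txt', 1))
print(filesystem_safe_name('Поворот IMG_1412.JPG', 2, '.jpg', fsencoding='ascii'))
```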

Update: Partial Fix Details

[Aug-15-11] To help you get started, it appears that the first part of the coding fix, decoding i18n filenames (but not also testing for their correctness on the receiving platform), is simply a matter of changing the very last line of the partName method in the MailParser class from the first of the following to the second:

        return (filename, contype)

        return (self.decodeHeader(filename), contype) # aug 2011: decode fname

At least per minimal testing so far, this does the trick—Chinese and Russian encoded attachment filenames are properly decoded, just like other encoded text in primary email headers. These decoded filenames also happen to work unchanged on the Windows operating system (per UTF-8) in their decoded forms, though they may not work on some platforms, and their original i18n undecoded string forms always fail as filenames on Windows too and generate error popups in the GUI.

For instance, a Russian email's JPEG image attachment whose i18n filename was given in its part headers with a non-ASCII character set ("KOI8-R"), base64 translation ("B"), and the encoded text ("8M/..."):

Content-Type: image/jpeg; name="=?KOI8-R?B?8M/Xz9LP1CBJTUdfMTQxMi5KUEc=?="
Content-Disposition: attachment; filename="=?KOI8-R?B?8M/Xz9LP1CBJTUdfMTQxMi5KUA==?=
	=?KOI8-R?B?Rw==?="
Content-Transfer-Encoding: base64

with the patch is now correctly decoded to filename: Поворот IMG_1412.JPG. The original encoded filename text in the headers does not work as is. Similarly, a Chinese spam email attachment's headers using UTF-8, base64 encoded values:

Content-Type: application/vnd.ms-excel;
	name="=?utf-8?B?6YeR54mM6ZSA5ZSuMuWkqeS4gOWknOWunuaImOiuree7gy54bHM=?="
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
	filename="=?utf-8?B?6YeR54mM6ZSA5ZSuMuWkqeS4gOWknOWunuaImOiuree7gy54bHM=?="

now decodes to filename: 金牌销售2天一夜实战训练.xls (which will look right in this web page if your browser understands UTF-8 text). With this patch, these decoded filenames now appear correctly in the GUI's part buttons and its Parts list pop-up, and when saved to files and opened in other applications. Non-encoded filenames pass through the decoder unchanged as expected, so this should not break any cases that worked previously.
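
The header values shown above can also be checked outside the GUI with the standard library alone. The sketch below is not the book's decodeHeader method; decode_filename is a hypothetical function built on email.header's decode_header and make_header, and it makes no attempt to handle malformed input or unknown character set names.

```python
from email.header import decode_header, make_header

def decode_filename(raw):
    """Decode MIME encoded-words in a raw filename; plain names pass through."""
    return str(make_header(decode_header(raw)))

russian = '=?KOI8-R?B?8M/Xz9LP1CBJTUdfMTQxMi5KUEc=?='
chinese = '=?utf-8?B?6YeR54mM6ZSA5ZSuMuWkqeS4gOWknOWunuaImOiuree7gy54bHM=?='

print(decode_filename(russian))
print(decode_filename(chinese))
print(decode_filename('plain.txt'))     # no encoded-words: returned unchanged
```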

Because this is a simple change that doesn't alter line numbers, and because encoded filenames are becoming more common, I may mark this as a future reprints update, but be sure to patch your code copy this way if it does not include the enhancement and you start seeing encoded filenames in emails which you care to view.

TBD: Filesystem Limitations, Encoding for Sends Too?

Also keep in mind that the prior section's fix is partial: it still does not address filenames which won't work on the underlying platform's file system and may need to be renamed. Perhaps just as glaring, the fix handles decoding for display of fetched emails only; there is no special logic for encoding non-ASCII attachment filenames per i18n standards for sends—something which might have to be very similarly addressed near the end of the MailSender class's addAttachments method by passing filenames through that class's encodeHeader method, which modifies non-ASCII text only. As is, on sends non-ASCII attachment filenames cause smtplib to raise ASCII encoding exceptions for the full mail text in which the filename is embedded. Given that such encoding is not supported by Python's email package automatically, the change may be as simple as this—on page 964:

            basename = os.path.basename(filename)

            basename = self.encodeHeader(os.path.basename(filename))   # aug 2011

This should suffice: filenames that encode as ASCII are left intact and others are encoded per the UTF-8 default, though it would ideally also apply the mailconfig module's headersEncodeTo setting if present. But I'll leave this in the to-be-tested column; I don't send such emails, and doing so seems a bit dubious in any event on the portability grounds mentioned above. In other words, there is still plenty of room for improvement. Enhance as desired.
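
For readers who want to experiment with the send side in isolation, here is a rough standalone analogue of the behavior described for encodeHeader: ASCII text is left alone and anything else is MIME-encoded. It is a sketch, not the book's method; encode_if_needed is a hypothetical name, and the UTF-8 default is an assumption taken from the description above.

```python
from email.header import Header, decode_header, make_header

def encode_if_needed(text, charset='utf-8'):
    """Return text unchanged if it is pure ASCII, else as MIME encoded-words."""
    try:
        text.encode('ascii')
    except UnicodeEncodeError:
        return Header(text, charset).encode()      # e.g. '=?utf-8?b?...?='
    return text

encoded = encode_if_needed('Поворот.jpg')
print(encode_if_needed('report.txt'))              # ASCII: unchanged
print(encoded)                                     # ASCII-only encoded form
print(str(make_header(decode_header(encoded))))    # decodes back to the original
```

Because the encoded form is pure ASCII, it can be embedded in the full mail text without triggering the ASCII encoding exceptions described above.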

The Standard Futurisms Caveat

Ideally, Python's email package would decode filename header contents automatically and provide an alternative call for the rare cases when the raw text is required, and encode filenames for sends automatically in non-ASCII cases. Unfortunately, a future version of the email package may either decode and encode filenames this way or not, so it's impossible to predict the optimal resolution to this in future Pythons (as stated in the book, this was a primary reason some email issues were not addressed as completely as they might have been). As usual for software dependent on external libraries, be sure to watch for changes on this front.

Update, Oct-19-11 This issue was patched and fixed in full for both sends and receives in release 1.3 of the book examples package, and is no longer present in any later PyMailGUIs. See the release description which includes screenshots of the fix, and the formal patch description below. This fix was small enough that it will appear in the book itself in a future reprint. Note that because the fix was applied in common mailtools package code, it is inherited by PyMailGUI, as well as the less functional PyMailCGI webmail example in Chapter 16, though for the latter you may need to set your browser's encoding to UTF-8 to view the non-ASCII filenames embedded in the HTML reply stream.

[Back to Index]


Using SMTP Email Servers with Logins and Ports (Patched in 1.3)

Short story: some ISPs' email servers allow or require use of ports with password-validated access and SSL/TLS secure transfers. Port numbers are automatically supported in the book's PyMailGUI, and illustrated here; user-configurable SSL/TLS support was added in the standalone release.

The Issue

[Jul-29-11] As stated in the book, you may need to change your SMTP mail server configurations to use the book's email clients to send email directly. This is especially true when you try to send mail by SMTP on public networks. Some of this is ISP-specific, but here are a few pointers.

This recently resurfaced for me because my email sends stopped working after my broadband provider changed their network. In short, they locked down the standard SMTP send port 25 in order to prevent spam, but this prevented direct sends using the server configurations I had been using. In the mailconfig.py file used by the PyMailGUI example, my SMTP server configurations were originally the following—a simple non-authenticating server at one of my third-party ISPs (Godaddy), which worked fine on my home network and some others:

smtpuser       = None                        # per your ISP, None = no login
smtppasswdfile = ''                          # if login, set to '' to be asked
smtpservername = 'smtpout.secureserver.net'  # if port 25 open for SMTP on network

This no longer worked after the SMTP port lockdown. To send emails, I had to change my SMTP server details to use an authenticating SMTP server at another third-party ISP, running on a non-standard port number (this server runs on EarthLink, which is different from my broadband provider):

smtpuser       = 'lutz@rmi.net'                         # login, authenticate
smtppasswdfile = ''                                     # ask for password in GUI
smtpservername = 'smtpauth.hosting.earthlink.net:587'   # 3rd party ISP, port 587

When so configured, I have to log in to my email account at this ISP on the first email send in a session (PyMailGUI pops up a prompt for password input), but Python's smtplib module automatically parses off and converses over the custom SMTP port number included at the end of the server-name string. This all just works in both PyMailGUI and smtplib, with no code changes required.
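
To make the server-name handling concrete, the following sketch mimics the host and port split in user code, and shows how such a name might be handed to smtplib along with a login. Both function names are hypothetical; split_host_port only approximates what smtplib does internally and is shown for illustration, and send_mail is a bare outline that is defined but not run here, since it needs a live server and real account details.

```python
import smtplib

def split_host_port(servername, default_port=25):
    """Split 'host:port' into (host, port); names without a port get the default."""
    host, sep, port = servername.rpartition(':')
    if sep and port.isdigit():
        return host, int(port)
    return servername, default_port

def send_mail(servername, user, password, sender, recipients, text):
    """Outline of an authenticated send; smtplib parses 'host:port' itself."""
    server = smtplib.SMTP(servername)       # e.g. 'smtp.example.com:587'
    try:
        if user:                            # None or '' means no login
            server.login(user, password)
        server.sendmail(sender, recipients, text)
    finally:
        server.quit()

print(split_host_port('smtpauth.hosting.earthlink.net:587'))
print(split_host_port('smtpout.secureserver.net'))
```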

With these authenticated SMTP settings, email sends seem to work on many more networks than before, and I avoid having to resort to webmail with all its annoying advertising. As an alternative, I could have routed email sends through the broadband provider's SMTP server directly, but this scheme can sometimes be tagged as spam, and may require a direct connection to the provider's network which may not always be possible:

smtpuser       = None                     # per your ISP
smtppasswdfile = ''                       # set to '' to be asked
smtpservername = 'mail.mailmt.com'        # your isp or local network's server

See your ISP for more on your server settings, and the comments in mailconfig.py and the book for more on client configuration settings.

Postscript: I recently had to change servers yet again to use a broadband provider account when EarthLink's SMTP server started timing out—the following works in all contexts now for me, but your mileage will certainly vary (and possibly, very often!):

smtpuser       = 'myloginname'            # nonblank=authenticated
smtppasswdfile = ''                       # ''=ask in GUI once
smtpservername = 'smtp.comcast.net:587'

Update, Oct-19-11 Per the release description above, there is now a demo of the ideas presented here in release 1.3 (and later) of the book examples package. The changes were too large to add to the book itself.

Update, 2018 There is also more complete support for SSL/TLS secure email servers in PyMailGUI's later standalone release. Such servers run on dedicated ports, use password-based authentication, and are now commonly required by many ISPs. See the latest PyMailGUI's change log and mailconfig files for more information. The sharp-eyed linear reader may also notice that this issue was touched on as an unrelated idea in this note elsewhere on this page (alas, long-running journals tend towards redundancy).

[Back to Index]


Explicitly Close Temporary File in PyMailGUI (Patched in 1.2)

Short story: the original version of PyMailGUI in the book omitted a file close call, which on some machines caused HTML viewers to misbehave. This was repaired in version 1.2 of the book's examples, in the text of the book itself, and in all later PyMailGUIs.

The Issue

[Jan-10-11] I recently spotted something unusual in Chapter 14's PyMailGUI, in module ListWindows.py, method PyMailCommon.contViewFmt (after about a year, you get to be your own code reviewer). This method's latter part opens an HTML-only email in a web browser after displaying its extracted plain text in a PyEdit frame, and has worked well for my email ever since it was coded. To be robust and explicit, though, it should probably run a tmp.close() after its tmp.write(asbytes), to ensure that the output file's buffers are flushed to disk before the browser opens it.

This isn't a bug and the code works as is, but apparently only because of fortunate timing: the browser started by the webbrowser module doesn't get around to opening the temporary file until well after the PyMailGUI method exits—which deletes its local variables, thereby reclaiming the temporary file object and automatically closing and hence flushing it in the process. This works, but seems too implicit in retrospect, and may not be the best coding pattern to emulate in any event. In general, you should run an output file's close or flush method (or use the with statement or unbuffered open modes) to flush output buffers to disk if you expect to be able to read the file in the same program.
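
As a minimal sketch of the safer pattern, the code below writes the bytes inside a with statement, so the file is closed and its buffers flushed before anything else tries to read it. The function names and the temporary path are hypothetical, not the book's code; open_in_browser is defined but not called here, since it would launch a real browser.

```python
import os
import tempfile

def write_then_close(path, asbytes):
    """Write bytes and guarantee the file is closed before returning."""
    with open(path, 'wb') as tmp:           # close (and flush) happens at block exit
        tmp.write(asbytes)
    return path

def open_in_browser(path):
    """Hand an already-closed file to the default web browser."""
    import webbrowser
    webbrowser.open_new('file://' + os.path.abspath(path))

with tempfile.TemporaryDirectory() as folder:
    saved = write_then_close(os.path.join(folder, 'temp.html'), b'hello html')
    with open(saved, 'rb') as check:        # same program can read it back at once
        print(check.read())
```

An explicit tmp.close() after tmp.write(asbytes) achieves the same effect as the with statement.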

See Chapter 4 for much more on output file closes (including rules of thumb which the book failed to heed in this context), especially pages 137-141 and 145. The prior edition strung the open and write calls together: open(tempname, 'w').write(content), which, though still somewhat implicit, made the automatic close of the temporary file on collection more apparent and immediate, and not as dependent on timing.

Later Findings: Some Machines Require Explicit Close

After seeing this issue manifest itself for very small HTML-only emails on a much faster machine running a different operating system, I'm reclassifying this as a correction to be patched in reprints and next example package version (1.2); please see the patch below. When and where this problem occurs, an HTML-only email may open as a blank web page, because its temporary file has not been flushed to disk. Oddly, on the faster machine, the web browser somehow opens and reads the temporary file before the GUI method has a chance to close it on exit. Moreover, this happens only for very small emails (< 10K), suggesting a buffer size role. Adding an explicit close() ensures proper behavior for all platforms and emails.

This issue is platform- and timing-specific; impacts just one minor aspect of a very large program; and, even when it does, can always be worked around in the user interface in three different ways—by pressing the web browser's refresh/reload, by viewing the extracted plain-text in the GUI's main window, and by clicking the sole HTML's part button in the main GUI window to open it on demand. Still, it's annoying and simple enough to merit a patch.

For detail-minded readers, the faster machine with the incorrect behavior was a multicore Windows Vista machine; the book test machine where the code works as is was a single-core Atom Windows 7 machine (a beefy netbook). Since the Python method in question returns immediately after this code, on Vista a web browser apparently starts and opens a local web page faster than a Python method function can exit. This may reflect differences in the implementations of start commands in the two systems. I'd be surprised if this occurred on other platforms which spawn processes to open browsers; on the other hand, unpredictable timing has a way of being unpredictable.

As described in the book, there are other issues related to the handling of HTML-only emails which I've left up to interested readers to address. As is, in lieu of an HTML-enabled text viewer, display of such emails employs a somewhat temporary scheme which was tested and used by just a single user on a single platform. For example, its single temporary-file model means only the last such email viewed is ever stored at any one point in time, regardless of how many are opened. Such oddness does not occur for HTML parts opened on demand, because their files are resaved and closed correctly in mailtools each time an open is requested. As for much of this system, improve as desired.

Update, Oct-19-11 Per the release description above, this issue was patched and fixed in release 1.2 of the examples package, and the patch is present in all later PyMailGUI releases. This patch was also applied in reprints of the book itself. See the formal book patch description ahead.

[Back to Index]


Using Timeouts for POP and SMTP Server Connections (Patched in 1.2)

Short story: PyMailGUI must use timeouts on server connections to avoid hanging while waiting for interminable transfers. This support was added in early versions of the book's examples package and in the book's text itself, and is present in standalone PyMailGUI. This support has, however, been the scene of multiple revisions and Python changes, and makes for a convoluted and evolving story.

The Issue

[Feb-1-11] The email-based code in Chapter 13 (and its clients in Chapters 14 and 16) should probably pass timeout arguments to the poplib and smtplib connection calls, to avoid waiting indefinitely. I've recently seen the book's PyMailGUI desktop client get stuck waiting for a poplib connection call that never returns, for example. This is rare, but without a timeout argument, your only recourse is to wait seemingly forever, or kill and restart the GUI. See Python's library manuals for more about passing timeout arguments in these modules.

Update: Timeouts Patch for Next Reprint

[Feb-22-11] I've now added the timeout arguments as patches to be made in the next reprint and examples package version (1.2): see the patch below. This became a priority when I started seeing the email server at my ISP suddenly failing to respond to connect requests on a very regular basis. This may have been a temporary problem at the ISP, but killing and restarting the email client's process manually was much more painful than patching to use timeouts for server connections.

Update: More on Timeout Settings

[Mar-17-11] If you wish to use PyMailGUI for real email work, you may need to increase its email transfer timeout values to accommodate slower Internet access speeds. As patched and shipped in the 1.2 examples package, both POP and SMTP timeouts are set to 20 seconds, which suffices in most cases. Especially for sending large emails over slow connections or slow servers, though, a higher SMTP timeout value such as 60 or more seconds may be required. See the patch ahead for the location of this value in the example code.

Discussion: This is a surprisingly subtle issue, not covered in the book. POP and SMTP timeout values were added after the book's publication to avoid transfer threads hanging indefinitely—a scenario which may require manually killing the email client's process in worst cases. For instance, failure to contact the server on mail Load requests can render the GUI largely inoperative (loads preclude most other operations, including Quit). This is despite the fact that the load is run in a thread; the GUI itself remains active, but cannot perform most server-based email processing in this state.

Passing timeouts when connected to email servers avoids this by triggering exceptions when server transactions hang, thereby terminating the server transfer thread and producing an error pop-up in the GUI. However, Python's library currently applies these timeouts naively to every server interaction step performed on the socket created for an email transfer—not just the initial connection calls, but also later data sends and receives. This makes the timeout setting sensitive to the speeds of email servers, the speed of your own Internet connection on the client, and message size in general. This is true for both fetches and sends (POP and SMTP), though it is more crucial for sends, which transfer a message's text all at once, than for fetches, which read messages line by line.
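
The effect of a timeout on an unresponsive server can be simulated locally. The sketch below is an illustration only, not the book's patch: it opens a listening socket on the loopback interface that never sends a POP greeting, then connects to it with poplib and a short timeout. The function names are hypothetical, and the demo assumes the loopback interface is available. The same timeout keyword is accepted by smtplib.SMTP.

```python
import poplib
import socket

def connect_pop(host, port, timeout):
    """Connect with a timeout that covers the connect and later socket reads."""
    return poplib.POP3(host, port, timeout=timeout)

def silent_server_demo(timeout=0.5):
    """Connect to a local socket that accepts connections but never replies."""
    listener = socket.socket()
    listener.bind(('127.0.0.1', 0))         # port 0: let the system pick a free port
    listener.listen(1)                      # queued connects succeed without accept()
    port = listener.getsockname()[1]
    try:
        connect_pop('127.0.0.1', port, timeout)
    except socket.timeout:                  # raised while waiting for the greeting
        return 'timed out'
    finally:
        listener.close()
    return 'connected'

print(silent_server_demo(0.5))
```

Without the timeout argument, the connect call in this demo would wait indefinitely for a greeting, which is the hang described above.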

Unfortunately, this seems a bit of a Catch-22—larger timeout values allow larger mails to be transferred on slow connections, but also mean that email transfer threads will be hung longer when servers are truly inactive. Moreover, the timeouts can't generally be omitted altogether, or the transfer threads may hang interminably—as mentioned, in worst cases this may block other user operations in the GUI, and require the email client's process to be killed manually when email servers become unresponsive (this was the original motivation for the timeout patch). Really, Python's email library modules should probably support different timeout settings for different operations; one for all doesn't quite make sense—sending a large email requires very different timeout treatment than initial connection—but that's the API that exists today.

Although it's possible to implement more custom transfer timeouts manually instead of relying on the existing library module support (e.g., add top-level timer code around initial connect calls only), this would require too many post-publication code changes, and would not suffice if later operations hang. In principle, it's also possible to selectively enable and disable timeouts for the socket embedded in and used by the smtplib object (e.g., disabling them with server.sock.settimeout(None) for sends), but this is less than ideal from a software perspective, as it makes smtplib module clients too dependent on the module's internals: the module's API is intended to encapsulate and hide the underlying socket object used. This would also fail to address servers which become unresponsive during sends.

For sends, timeouts also trigger what looks like a bug in Python 3.1's standard library—the mail send appears to fail after the mail's text has been fully sent and while trying to read the server's truncated reply to it, but only because the socket sendall() call 4 levels down from the book's code seems to simply stop sending data and truncate it when the timeout expires, without correctly raising an exception to signal this error (on a Windows Vista client, talking to my ISP's server, at least). This leads later to a confusing SMTPServerDisconnected exception with the error text "Connection unexpectedly closed" reported in the GUI and the console, even though the true cause was the earlier data send's timeout. (This is complex, and there is neither time nor space to explore it fully here; for full fidelity, see the Python source code that raises this error, including its socket module's C code). Increasing the timeout allows sendall() to finish sending the data without truncation, and is required in some contexts in any event, but it is also effectively a workaround for this erroneous behavior in Python itself.

Summary: Because of all this, in one of my own contexts I had to bump up the SMTP timeout to 60 seconds to allow for sending a large email on a slow network; otherwise, the send failed with an exception and an error pop-up. Change likewise if and as needed for your context. Ideally, the mail server timeouts would be configurable in the mailconfig module instead of in executable source code files, but they were added after publication when larger-scale changes in book code listings were no longer possible (examples in books are ultimately meant to be demonstrative, and don't enjoy, and may not even warrant, the level of update flexibility common to software at large). Also ideally, Python's email library modules' APIs would support different timeout settings for different operations as mentioned above, but this will have to await a core developer's attention. For better and worse, software dependencies make your software, well, dependent.
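
If you want to experiment with configurable timeouts yourself, one simple approach is to look up optional settings on the configuration module and fall back to a default. This is a sketch only: the setting names smtptimeout and poptimeout are hypothetical and do not exist in the book's mailconfig, and a SimpleNamespace stands in for the imported module.

```python
from types import SimpleNamespace

DEFAULT_TIMEOUT = 20                        # seconds, as in the 1.2 patch

def server_timeout(config, name, default=DEFAULT_TIMEOUT):
    """Fetch an optional timeout setting, tolerating configs that lack it."""
    return getattr(config, name, default)

mailconfig = SimpleNamespace(smtptimeout=60)        # stand-in for 'import mailconfig'
print(server_timeout(mailconfig, 'smtptimeout'))    # set explicitly by the user
print(server_timeout(mailconfig, 'poptimeout'))     # absent: default applies
```

The values fetched this way would then be passed as the timeout arguments of the POP and SMTP connect calls.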

Update, Oct-19-11 Per the release description, this was eventually patched and fixed in release 1.2 of the examples package, as well as in reprints of the book itself. See the formal patch description ahead for changes made; timeout values are passed to POP and SMTP server objects at their connect calls. Naturally, all later PyMailGUIs inherit the fix.

Update, 2018 This story changed yet again with 2015's Python 3.5. In brief, timeouts in Python 3.5's socket module's sendall() now apply to the entire call, not to each individual transfer it runs, and thus must be set higher in general. Because the smtplib module uses this call, this impacts email sends. For details, see the change descriptions for both Python and PyMailGUI. The latter modified its timeout settings in its standalone release and made them user-configurable values (in its editable mailconfig files).

[Back to Index]


PyEdit: Minor Usability Tweaks and Suggestions (1 Patched in 1.2)

[Jan-9-11] Chapter 11's PyEdit text editor is a large program with lots of functionality, which I use nearly every day in one context or another. Besides text edits, it's also a key component in the book's PyMailGUI email client. Still, like most non-trivial user interfaces, there are plenty of ways its interaction might be customized or improved, depending on its users' preferences. After using it recently with the more critical eye afforded by the passage of time, five items seem prime candidates for improvement; all are minor nits and not bugs, but would be simple to improve, and might make nice coding exercises:

Respond to Enter in some dialogs

It might be nice if some of the nonmodal popup dialogs (Change, Grep, and so on) would respond to Enter key presses too, instead of requiring button clicks for activation as they do now. This should be easy to add, by binding a keyboard event on these dialogs; see earlier in the GUI part's chapters and examples for pointers on bind method events.
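
A binding along these lines might be all that is needed. This is a sketch, not PyEdit's code: bind_enter and demo are hypothetical names, and demo is defined but not called here because it needs a display and blocks in the GUI's event loop.

```python
def bind_enter(dialog, action):
    """Make the Enter key run the same action as the dialog's button."""
    dialog.bind('<Return>', lambda event: action())

def demo():
    """Pop up a tiny dialog whose button and Enter key do the same thing."""
    from tkinter import Tk, Toplevel, Button
    root = Tk()
    popup = Toplevel(root)
    search = lambda: print('searching...')
    Button(popup, text='Find', command=search).pack()
    bind_enter(popup, search)
    popup.focus_set()                       # key events go to the focused window
    root.mainloop()
```

Binding on the dialog window rather than on a single entry field means Enter works wherever focus lands inside the popup, which may or may not be what you want.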

Show CWD in Grep dialog

In the new Grep dialog, a threaded and Unicode-aware file and directory searcher, the root directory is preset to ".", meaning the current working directory (CWD). In retrospect, this might be better set to the full os.getcwd() path, as it's not always obvious where the program was started ("." may not mean much if you don't know which program opened the editor interface). The downside here is that the absolute path might be a bit long for this dialog to display well. Of course, allowing a full regular expression pattern for the search key would be nicer too, but this potential upgrade is noted in the book already.

Disabling Unicode prompt popups

As shipped, the textConfig module configures PyEdit to always ask for a Unicode encoding on Opens and Saves, when the encoding type is unknown (and prefills with the platform default as a suggestion). It was shipped this way because of PyEdit's role in Chapter 14's PyMailGUI, where the Unicode encoding of some text items in email may be unknown; the best the system can do is ask. Still, for most standalone use cases, it's a bit of a bother to have to OK the Unicode encoding prompt dialog for every Open and new Save, when the platform default will apply most (or all) of the time. To disable the Unicode encoding prompt popups and apply the platform's encoding default to your text files, simply change the values of variables opensAskUser and savesAskUser from True to False in file textConfig.py. See that file's comments and the book for details, and this related note on encoding defaults on standalone PyEdit's support page.
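In code form, the change amounts to the following two assignments in textConfig.py. Only the two variable names and their values come from the description above; the comments are mine, and the rest of the file is left as shipped.

```python
# textConfig.py: skip the encoding prompts and apply the platform default
opensAskUser = False      # was True: prompt on Opens when encoding is unknown
savesAskUser = False      # was True: prompt on new Saves
```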

Text loses focus after Unicode prompt popups (patched in 1.2).

As coded, the Unicode prompt popups issued on Opens and Saves may cause focus to be lost: you may need to click once in the text area before you can start typing text or navigating with arrow keys, though not if you next use scrollbars or view. This is a minor annoyance at worst, but seems to stem from a buglet in the standard dialogs used for the popups (they should ideally save and restore focus, as other standard dialogs do). It's also a non-issue if you disable these Unicode popups altogether per the prior note's suggestion. To force the issue for all cases, though, it's trivial to add a call to self.text.focus() just after the dialog calls in the Open and Save callback handler methods; see the Find and Goto dialog handlers for more hints. This is assuming that focus is a desired feature, of course; in some cases, users might proceed to scrollbars instead of arrow keys after an Open, and view rather than edit. (Feb-1-11: this was patched in book reprints and the book example package release 1.2; see the patch details below.)

Clone might be better in the File menu

The Clone action, which creates a new, independent edit window, is currently located in the Tools menu. To some, it might seem more typical and prominent in the File menu instead. Moving it to File has a subtle downside, though: because the File menu is removed (or disabled) when PyEdit is used in attached component mode, Clone would be unavailable. For instance, in Chapter 14's PyMailGUI email client, the PyEdit component has a Clone in Tools which currently works fine—it creates a new PyEdit in a new Toplevel window, with the Frame-based menus used in component mode. This can be useful for saving bits of text cut from the mail message, and so on. Clone works the same in Chapter 11's PyView, though its role there seems less compelling. Still, moving Clone implies tradeoffs.

Because PyEdit is a book example, written and provided in easily scriptable Python code, except as noted I'll leave applying these items as suggested exercises for readers, along with anything else you might care to tweak. This is Python, after all.

Update, Oct-19-11 Per the release description above, one of the items listed here (focus loss) was patched in release 1.2 of the examples package, as well as in reprints of the book itself. See the formal patch description ahead.

Update, Feb-2018 PyEdit has undergone major development in recent years, which addressed usability issues and limitations far in excess of those listed here. Its full set of post-publication changes is documented here. To get the latest PyEdit release with all the new enhancements, visit its web page. It's available as standalone executables for Mac, Windows, and Linux, plus a source-code package that's suggested follow-up study (if far too large to include in a book). While you're browsing, see also the standalone release of PyMailGUI and other programs, which have similarly evolved since their book appearances.

[Back to Index]


Shelve close() Calls Not Required for Non-Update Scripts

[Feb-1-11] A reader posted an errata report for this book on O'Reilly's site, which claimed that a db.close() call is required to avoid file corruption at the end of a Chapter 1 script that displays but does not update a shelve. This report was about Example 1-19, but pertains to many others in the book. Per the report, the shelve file triggers errors later, after it is updated by the next script and then displayed again. Here is the post's text:

Type: Minor technical mistake
Description: There are no page numbers when using the Kindle version of 
the text.

In the discussion of building dictionaries using classes, Example 1-19 
(dump_db_classes.py) requires closing the db at the end of the script.  
The final line of the code example should be:

db.close()

If the db is not closed, the subsequent updating (Example 1-20) and then 
re-printing of the db (using dump_db_classes.py again) will fail, giving 
an error code:

Traceback (most recent call last):
  File "C:\Python31\dump_db_classes.py", line 6, in <module>
    print(key, '=>\n', db[key].name, db[key].pay)
  File "C:\Python31\lib\shelve.py", line 113, in __getitem__
    value = Unpickler(f).load()
EOFError

I have not been able to reproduce this error, and suspect that the poster's Kindle cut-and-paste simply dropped the close() call that appears at the end of the update script in Example 1-20. In my testing, the scripts in question work without error and as shown, both in Python 3.1 (using the "dumb" DBM file interface default in 3.X), as well as in Python 2.7 (using the bundled bsddb file interface default in 2.X). Moreover, these examples worked fine under Python 2.5 for the prior edition of this book, and similar scripts have been used successfully in earlier editions dating back to 1995. In general, shelves should not require close() calls unless the shelve has been updated. By this rule, a close() is not required in Example 1-19 (though it wouldn't hurt if added); however, all shelve update scripts in the book do call close() before exiting as required.
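The rule can be sketched in a few lines. This is not the book's code: it stores plain dictionaries instead of the class instances used in Examples 1-18 through 1-20, the function names are invented, and the shelve lives in a temporary directory. It also assumes CPython, which reclaims the unclosed shelve as soon as the display function returns.

```python
import os, shelve, tempfile

def create(path):
    db = shelve.open(path)
    db['bob'] = {'name': 'Bob Smith', 'pay': 30000}
    db['sue'] = {'name': 'Sue Jones', 'pay': 40000}
    db.close()                       # shelve was changed: close to flush

def dump(path):
    db = shelve.open(path)
    # display only, nothing written: no close() call here
    return [(key, db[key]['name'], db[key]['pay']) for key in sorted(db)]

def give_raise(path, key, amount):
    db = shelve.open(path)
    record = db[key]                 # fetch, change in memory, store back
    record['pay'] += amount
    db[key] = record
    db.close()                       # shelve was changed: close to flush

path = os.path.join(tempfile.mkdtemp(), 'class-shelve')
create(path)
before = dump(path)
give_raise(path, 'sue', 5000)
after = dump(path)
print(before)
print(after)
```

The display step runs both before and after the update without ever calling close(), which mirrors the sequence in the reader's report.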

Because I can't reproduce this issue, I'm posting this as a clarification instead of a correction for now, barring a more detailed reader report. If you do see the same error, please email me with full context; the platform, Python version, and underlying DBM interface in use may factor into behavior too, and there are too many possible combinations for me to test exhaustively. In the poster's defense, shelves are notoriously error-prone; it's not impossible that file paths may have differed between scripts (the current working directory can sometimes change unexpectedly in IDLE), or that a bad cut-and-paste dropped the close() call in either the creation script of Example 1-18 or the update script of Example 1-20.

Update Feb-2011 The reader who posted this note was later unable to reproduce the error in question by repeating the steps which led to it, but believes it existed initially. That may close the case, though there are too many variables related to shelves to be certain.

[Back to Index]


General Book Marketing Question Replies

[Jan-10-11] O'Reilly Media (the book's publisher) recently asked me for written replies on a few questions related to this book's scope, audience, and goals. Since this might help both current and prospective readers understand the book in general, I've cut and pasted the replies on this page.

[Back to Index]


tkinter: Linux/Mac Portability and Other Notes

[Nov-15-15] Recent development on the Frigcal calendar GUI has underscored a number of portability issues and other usage notes regarding Python's tkinter GUI library module used heavily in the book.

Linux Portability

In sum, for programs run on Linux:

As the book notes, its many tkinter examples were developed and demonstrated on Windows 7, and may require minor adjustments including the above for optimal use on Linux systems. The vast majority of tkinter does work across platforms with no code changes, but top-level window manager interfaces in particular are notoriously nonportable.

Other tkinter Notes

In addition to Linux use, the Frigcal program unearthed a number of noteworthy updates for tkinter in general:

For pointers on any of the above, see Frigcal, and its release history and code, and search the web for additional resources.

Update, Nov-2016 Per later porting, Mac OS X also poses a nontrivial set of Tk portability issues of its own; see this note elsewhere on this page for more details and Mac resource links.

[Back to Index]


tkinter after() Callbacks Can "Hang" on Time Changes

[Nov-14-15] Python's tkinter GUI library module, used extensively in the book and its examples, works largely as expected, and well enough to power tools like email clients and calendars that I use on a daily basis. Recently, though, I noticed that on rare occasions the book's PyMailGUI email client GUI can appear to hang, and not update its display for email server transactions run in threads. This most often manifests itself as non-modal message transfer "Busy" dialogs that are never erased.

After much puzzling, it was eventually determined that this occurs only when the system time has been changed: for instance, on daylight saving time adjustments, or other automatic or manual changes. And it turns out that this reflects a known issue in the Tcl/Tk library underlying Python's tkinter, which underlies PyMailGUI. The issue may be known, but not widely so, and underscores the tradeoffs inherent in software stack reliance.

In short, the widget.after() method arranges for an event to occur after a fixed duration, by scheduling it to occur at an absolute time in the future, based upon the current system time. If the system time changes, the event may never fire—or, more accurately, it will be postponed until its absolute scheduled time is eventually reached with respect to the new system time. To quote a Tcl/Tk document:

Tcl depends on the system time (converted to seconds from the start of the Unix 
epoch) increasing fairly close to monotonically for the correct behaviour of a 
number of things, but most particularly anything in the after command. 
Internally, after computes the absolute time that an event should happen and 
only triggers things once that time is reached, so that things being triggered 
early (which can happen because of various OS events) don't cause problems. 
If you set the system time back a long way, Tcl will wait until the absolute 
time is reached anyway, which will look a lot like a hang.

In PyMailGUI, this looks like a hang, but it's not. Really, the thread-queue checker's widget.after() is postponed indefinitely, such that finished and queued email server transactions are never noticed in the main GUI thread. The GUI itself remains active, and the email and thread logic works as planned; the widget.after() timer loop simply stalls due to the clock change, and no longer pops and dispatches callbacks that have been added to the thread queue. Hence, the email operations succeed, but the GUI isn't updated.
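To make the effect concrete, here is a small simulation of the scheduling scheme the Tcl/Tk quote describes. It is a model, not Tcl/Tk's or tkinter's code: FakeClock and AbsoluteTimer are invented names, and the clock is a settable number rather than the real system time.

```python
class FakeClock:
    """A settable stand-in for the system clock, in seconds."""
    def __init__(self, now):
        self.now = now
    def __call__(self):
        return self.now

class AbsoluteTimer:
    """Schedules by absolute due time, as the quote says after does."""
    def __init__(self, clock):
        self.clock = clock
        self.due = None
    def after(self, delay):
        self.due = self.clock() + delay     # fixed when scheduled
    def ready(self):
        return self.clock() >= self.due

clock = FakeClock(now=10000.0)
timer = AbsoluteTimer(clock)

timer.after(0.25)                # like a queue check 250 msecs ahead
clock.now += 0.25                # time passes normally
normal = timer.ready()           # the event fires on schedule

timer.after(0.25)                # schedule the next check
clock.now -= 3600                # system time is set back one hour
clock.now += 0.25                # the same 250 msecs pass
stalled = not timer.ready()      # due time is now about an hour away

clock.now += 3600                # the clock eventually catches up
recovered = timer.ready()        # the postponed event finally fires
print(normal, stalled, recovered)
```

In the stalled state nothing is broken; the timer is simply waiting for a due time that the clock change pushed far into the future.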

There is no known fix for this. If your PyMailGUI appears to not be processing email transactions, simply restore your system clock, or restart PyMailGUI. There is a good chance that a future Tcl/Tk release may address this by using a "monotonic" clock: one that has no reference point but can never go backwards, and so isn't susceptible to such change. In fact, Python now has such an interface: see time.monotonic(), new in 3.3 and always available as of 3.5. A Tcl/Tk fix on this front would be inherited by both tkinter and PyMailGUI, as well as all other GUIs with timer loops based on after().
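For comparison, the following sketch times a short wait with Python's time.monotonic(). It does nothing to change how after() behaves; it only illustrates the kind of clock a fix might rely on. The function name and the polling interval are arbitrary choices for the example.

```python
import time

def wait_with_monotonic(delay):
    """Block for about delay seconds, timed by a clock that never goes back."""
    start = time.monotonic()
    deadline = start + delay           # relative to an arbitrary reference point
    while time.monotonic() < deadline:
        time.sleep(0.01)               # poll; system clock changes don't matter
    return time.monotonic() - start

elapsed = wait_with_monotonic(0.05)
print('waited %.3f seconds' % elapsed)
```

Because the monotonic clock's values are meaningful only as differences, a deadline computed from it cannot be pushed into the future by resetting the system time.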

[Back to Index]


You Must Use Your Own FTP and Email Servers/Accounts

[Nov-14-15] I've gotten numerous emails from readers who have trouble using some of the book's FTP and email Internet examples as coded in the book. In short, you must change the configuration settings in these scripts to use your own FTP server and email accounts. The CGI server-side scripting examples require no such change, as they run entirely locally—including their web server. I thought this was stated clearly in the book, but it's come up often enough to warrant pasting a reader query and reply on the subject here.

> -----Original Message-----
> From: ....
> To: lutz@learning-python.com
> Subject: chapter 13 example
> Date: Fri, 13 Nov 2015 16:35:41 -0800
> 
> I have a question on the getone.py example.  When I run it the ftp.rmi. 
> net doesn't seem to work. I see you have changed  to the learning python 
> site but when I use the new site the same problem occurs with the 
> password input. I've tried (), anonymous, and my email address. ? what 
> to do to connect.
>      I've been reading your books now for about 13mths. I'm 
> approaching this like learning a foreign language and enjoying it.  I 
> haven't moved to 3.5 as the 3.4 v seems to work well. Thanks for any 
> help you can give. 
>

Thanks for your note.  Unfortunately, I do not maintain an
FTP account for use by readers of the book.  With many thousands
of readers around the world, this would be too large a task and
security risk.  Instead, the assumption is that you will replace 
the site and login details with those of servers to which you 
have access, if you wish to run such code live.

This also applies to the later email (and other) examples: please
change configuration files to use your own account, as described 
in the book.  The server-side examples are more immune to this,
as they use a Python-coded server running locally on your machine. 

Best wishes on Python and the book,
--Mark Lutz, http://learning-python.com
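In practice, the change looks something like the following sketch, which is loosely modeled on an FTP download script rather than copied from the book's getone.py. Every setting shown (ftp.example.com, somefile.txt, yourname) is a made-up placeholder to replace with your own server and account; the ftplib and getpass calls themselves are standard library.

```python
from ftplib import FTP
from getpass import getpass

# hypothetical placeholders: use a server and account you have access to
sitename = 'ftp.example.com'
dirname = '.'
filename = 'somefile.txt'
username = 'yourname'

def fetch(sitename, dirname, filename, username, password):
    connection = FTP(sitename)               # connect to your own FTP site
    connection.login(username, password)     # your account, not the book's
    connection.cwd(dirname)                  # directory holding the file
    with open(filename, 'wb') as localfile:  # save under the same name locally
        connection.retrbinary('RETR ' + filename, localfile.write, 1024)
    connection.quit()

def main():
    # ask for the password at run time instead of storing it in the script
    password = getpass('Password for %s on %s: ' % (username, sitename))
    fetch(sitename, dirname, filename, username, password)
```

Nothing runs until main() is called, so the settings can be edited first. The same idea applies to the email examples: their server names and account details live in configuration files that must be changed to match your own accounts.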

[Back to Index]


Book Corrections

In a book this large there are bound to be a significant number of typos, and I don't plan on listing them all here. Instead, this section will collect those typos that at latest update time seemed most grievous to me on purely subjective grounds (of course, your subjective grounds may vary). The items here reflect patches made to the book itself in reprints; code patches too large for the book appear in the example package only. See O'Reilly's web page and its formal errata list for this book for the full list of typos collected and patched in reprints over time.



  1. Page xxviii, line 3 from page top: two typos in same sentence
    This text's "larger and more compete example" should be "larger and more complete examples".



  2. Page 678 in Chapter 11, line 3 of last paragraph on page, figure description off
    The text misstates Figure 11-4's content here: it does not show a Clone window (the original version of this screenshot did, but was retaken very late in the project to show Grep dialogs with different Unicode encodings). To fix, change this line's "a window and its clone" to read "a main window".



  3. Page 702 and 704, PyEdit: add text.focus() calls after askstring() Unicode popups
    For convenience, and per the detailed description above, we should add a call to reset focus back to the text widget after the Unicode encoding prompt popups which may be issued on Open and Save/SaveAs requests (depending on textConfig settings). As is, the code works, but requires the user to click in the text area if they wish to resume editing it immediately after the Unicode popup is dismissed; this standard popup itself should probably restore focus, but does not. To fix, add focus calls in two places. First, on page 702, at code line 21 at roughly mid page, change:
                if askuser:
                    try:
                        text = open(file, 'r', encoding=askuser).read()
    
    to the following, adding the new first line (the rest of this code is unchanged):
                self.text.focus() # else must click
                if askuser:
                    try:
                        text = open(file, 'r', encoding=askuser).read()
    
    Second, on page 704, at code line 8 near top of page, similarly change:
                if askuser:
                    try:
                        text.encode(askuser)
    
    to the following, again just adding the new first line:
                self.text.focus() # else must click
                if askuser:
                    try:
                        text.encode(askuser)
    
    Reprints: please let me know if there is not enough space for the inserts; I'd rather avoid altering page breaks in the process. This patch will also be applied to future versions of the book's examples package; in the package, the code in question is in file PP4E\Gui\TextEditor\textEditor.py, at lines 298 and 393.

    Update, Feb-24-11 Patched in version 1.2 of the book examples package (PP4E-Examples-1.2.zip).



  4. Page 963 line 9, and page 970 line 4: add timeout arguments to email server connect calls
    For robustness, and per the detailed description above, add "timeout=15" arguments to the POP and SMTP connect calls, so that email clients don't hang when email servers fail to respond. In the book, change code line 9 on page 963 from the first of the following to the second:
            server = smtplib.SMTP(self.smtpServerName)           # this may fail too
            server = smtplib.SMTP(self.smtpServerName, timeout=15)  # this may fail too
    
    Similarly, change code line 4 on page 970 from the first of the following to the second:
            server = poplib.POP3(self.popServer)
            server = poplib.POP3(self.popServer, timeout=15)
    
    In the book examples package, these changes would be applied to line 153 of mailSender.py, and line 34 of file mailFetcher.py, both of which reside in directory PP4E\Internet\Email\mailtools. They'll be patched in a future examples package version.

    Update, Feb-24-11 Patched in version 1.2 of the book examples package (PP4E-Examples-1.2.zip). I made the timeout 20 seconds in the examples package, to allow for slower email servers; 15 is more than enough to detect a problem with mine, but tweak this as desired.

    Update, Mar-17-11 Because timeout settings are used for every server interaction step, you may need to use a bigger timeout value (e.g., 60 or more seconds) in some contexts, especially when sending a large email over a slow client-side connection. See the update at the detailed description above for more on this.



  5. Page 1072, code line 10 from top of page, PyMailGUI: add a close() for HTML mail files
    For portability, and per the detailed description above, we should add an explicit close() call to flush the temporary file of an HTML-only email before starting a web browser to view it, so that this code works in all contexts. As is, it works on the test platform used for the book, and likely works on most others, because the method in question exits and thus reclaims, closes, and flushes the file before the spawned web browser gets around to reading it. However, this is timing and platform dependent, and may fail on some machines that start browsers more quickly; it's been seen to fail on a fast Vista machine. To fix in the book, change the middle line of the following three current code lines:
                            tmp = open(tempname, 'wb')      # already encoded
                            tmp.write(asbytes)
                            webbrowser.open_new('file://' + tempname)
    
    to read as follows, adding the text that starts with the semicolon (I'm combining statements to avoid altering page breaks):
                            tmp = open(tempname, 'wb')      # already encoded
                            tmp.write(asbytes); tmp.close() # flush output now
                            webbrowser.open_new('file://' + tempname)
    
    In the book's examples package, this code is located at line 209 in file ListWindows.py at PP4E\Internet\Email\PyMailGUI; it will be patched there too in a future examples package release (version 1.2, date TBD).

    Update, Feb-24-11 Patched in version 1.2 of the book examples package (PP4E-Examples-1.2.zip).



  6. Page 1226, two filename typos in same sidebar
    This will probably be obvious to most readers who inspect the external example files referenced here, but in this sidebar: "test-cgiu-uploads-bug*" should read "test-cgi-uploads-bug*", and the bullet item text "test-cgi-uploads-bug.html/py saves the input stream" should read "test-cgi-uploads-bug2.html/py saves the input stream".



  7. Page 1555, top of page, quotes are misplaced in heading line
    A typo inherited from the prior edition: the quotes and question mark in the heading line at the very top of this page are slightly off. Change the heading line: So What's "Python: The Sequel"? to read as: "So What's Python?": The Sequel. Quotes are angled in the original and revision. This header refers back to the sidebar in the Preface titled "So What's Python?". Arguably trivial, as this sidebar was 1500 pages (and perhaps a few months) ago by this point in the book, but it would be better to get this right. This header was broken by a copyedit change on the prior edition, and fell through the cracks on this one.



  8. Pages 978 and 964, encode and decode i18n attachment filenames for display, save, send
    Per the detailed description above, the following two changes will support both receipt and send of encoded i18n attachment filenames, assuming that such non-ASCII filenames are valid on the underlying platform (Windows is very liberal in this regard). First, on page 978, change the very last line of the partName method def statement from the first of these to the second (this is mid page at code line 26, in file mailParser.py at PP4E\Internet\Email\mailtools):
            return (filename, contype)
    
            return (self.decodeHeader(filename), contype) # oct 2011: decode i18n fnames
    
    Second, on page 964, change the 5th and 4th last lines of the addAttachments method def statement from the first of these to the second (this is mid page line -22, in file mailSender.py at PP4E\Internet\Email\mailtools):
                # set filename and attach to container
                basename = os.path.basename(filename)
    
                # set filename (ascii or utf8/mime encoded) and attach to container
                basename = self.encodeHeader(os.path.basename(filename))   # oct 2011
    
    Update, Oct-19-11 Patched in version 1.3 of the book examples package (PP4E-Examples-1.3.zip), and scheduled to be applied in future reprints of the book itself.

[Back to Index]



[Home] Books Programs Blog Python Author Training Search Email ©M.Lutz