Frankenthon!   

Python Changes 2014+


Introduction: Why This Page?

Major Changes in Python 3.6 (Dec 2016?)

  1. Yet another string formatting scheme: f'...'
  2. Yet another string formatting scheme: i'...'?
  3. Windows launcher hokey pokey: defaults
  4. Tk 8.6 comes to Mac OS X Python?
  5. Coding underscores in numbers
  6. Etcetera: the parade marches on

Major Changes in Python 3.5 (Sep 2015)

  1. Matrix multiplication: "@"
  2. Bytes string formatting: "%"
  3. Unpacking "*" generalizations
  4. Type hints standardization [section]
  5. Coroutines: "async" and "await" [section]
  6. Faster directory scans with "os.scandir()"?
  7. Dropping ".pyo" bytecode files
  8. Windows install path changes
  9. tkinter and Tk 8.6: PNGs, dialogs, colors
  10. Tk 8.6 regression: PyEdit Grep crashes
  11. File loads seem faster (TBD)
  12. Docs broken on Windows, incomplete
  13. Socket sendall() and smtplib: timeouts
  14. Windows installer drops support for XP

Major Changes in Python 3.4 (Mar 2014)

  1. New package installation model: pip
  2. Unpacking "*" generalizations? (to 3.5)
  3. Enumerated type as a library module
  4. Import-related library module changes
  5. More Windows launcher changes, caveats
  6. Etc: statistics, file descriptors, asyncio,...
 

Last revised: August 06, 2017 (restyled Oct-2017)

This page documents and critiques changes made to Python in 2014 and later, after the release of the book Learning Python, 5th Edition. As that book was updated for Pythons 3.3 and 2.7, and the 2.X line is effectively frozen, this page covers Pythons 3.4 and later. Earlier changes are documented in the book, but for brief summaries see its Appendix C, or this site's pages for Python 3.3, 3.2, and 2.7. About this page's author: here and here.

Introduction: Why This Page?

The 5th Edition of Learning Python, published in mid-2013, has been updated to be current with Pythons 3.3 and 2.7. Especially given its language foundations tutorial role, this book should address the needs of all Python 3.X and 2.X newcomers for many years to come.

Nevertheless, the inevitable parade of changes that seems inherent in open source projects continues unabated in each new Python release. Many such changes are trivial—and often optional—extensions which will likely see limited use, and may be safely ignored by newcomers until they become familiar with fundamentals that span all Pythons.

The Downside of Change

But not all changes are so benign; in fact, parades can be downright annoying when they disrupt your day. Those downstream from developer cabals have legitimate concerns. To some, many recent Python extensions seem features in search of use cases—new features considered clever by their advocates, but which have little clear relevance to real-world Python programs, and complicate the language unnecessarily. To others, recent Python changes are just plain rude—mutations that break working code with no more justification than personal preference or ego.

This is a substantial downside of Python's dynamic, community-driven development model, which is most glaring to those on the leading edge of new releases, and which the book addresses head-on, especially in its introduction and conclusion (Chapters 1 and 41). As told in the book, apart from the lucky few who are able to stick with a single version for all time, Python extensions and changes have a massive impact on the language's users and ecosystem: they must be learned by current users, covered by books and documentation, and accommodated by existing code and tools.

While the language is still usable for many a task, Python's rapid evolution adds management work to programmers' already-full plates, and often without clear cause.

Perhaps worst of all, newcomers face the full force of accumulated flux and growth in the latest and greatest release at the time of their induction. Today, the syllabus for new learners includes two disparate lines, with incompatibilities even among the releases of a single line; multiple programming paradigms, with tools advanced enough to challenge experts; and a torrent of feature redundancy, with 4 or 5 ways to achieve some goals—all fruits of Python's shifting story thus far.

In short, Python's constant change has created a software Tower of Babel, in which the very meaning of the language varies per release. This leaves its users with an ongoing task: even after you've mastered the language, new Python mutations become required reading for you if they show up in code you encounter or use, and can become a factor whenever you upgrade to Python versions in which they appear.

Consequently, this page briefly chronicles changes that appeared in Python after the 5th Edition's June 2013 release, as a sort of virtual appendix to the book. Hopefully, this and other resources named here will help readers follow the route of Python change—wherever the parade may march next.

An Editorial Note Upfront

Because changing a tool used by many comes with accountability, this page also critiques while documenting. Its assessments are grounded in technical merit, but many are also subjective and to some readers may not seem nice. The critiques are fair, however, and reflect the perspective of someone who has watched Python evolve and been one of its foremost proponents since 1992, and who still wishes the best for its future.

That said, this domain is innately controversial, and you'll have to weigh for yourself the potential benefits of each language change against the expansion of Python's knowledge requirements which it implies. In the end, though, we can probably all agree that critical thinking on this front is in Python's best interest. The line between thrashing and evolution may be subjective, but drawing it carefully is as crucial to the language's future as any shiny new feature can be.

Wherever you may stand on a given item below, this much is certain: a bloated system that is in a perpetual state of change will eventually be of more interest to its changers than its prospective users. If this page encourages its readers to think more deeply about such things while learning more about Python, it will have discharged its role in full.

Major Changes in Python 3.6 (December 2016)

[See the top of this page for an index to items in this section.]

Update: as of May 2016, Python 3.6 is now in alpha release, with a mid-December 2016 target for its final version, and still-emerging documentation here and here. The list of changes below is being updated as 3.6 solidifies and time allows.

At this writing in October 2015, Python 3.5 is less than one month old, yet a PEP for Python 3.6 is already in production with both a schedule and an initial changes list. So far, just one significant change is accepted—a proposal to add a fourth and painfully redundant string formatting scheme—but others will surely follow. It's impossible to predict what the final release will entail, of course, so please watch the PEP or this spot for more details as the 3.6 story unfolds.

1. Yet another string formatting scheme: f'...'

Python 3.6 plans to add a fourth string formatting scheme, using new f'...' string literal syntax. Provisionally known as f-strings, this extension will perform string interpolation—replacing variables named in expressions nested in the new literal with their runtime values. Technically, the nested expressions may be arbitrarily (and perhaps overly) complex; are enclosed in "{}" braces with an optional format specifier following a ":"; and are evaluated where the literal occurs in code. For instance:

f'we get {spam} alot.'                  # uses variable 'spam' in this scope

f'size of items: {len(items)}'          # ditto, but 'items' and an expression

f'result = {intvalue:#06x} in hex'      # formatting syntax is allowed here too

f'a manual dict: { {k:v for (k, v) in (("a", 1), ("b", 2))} }'       # hmm...

You can read more about this new scheme at its PEP. It will be provided in addition to existing formatting tools, yielding a set of four with broadly overlapping scopes: the original "%" formatting expression, the str.format() method, the string.Template class, and the new f'...' literal.
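For a side-by-side sense of the redundancy, here is one interpolation rendered by each of the four tools. This is a sketch only: the variable names are illustrative, and the last line assumes a 3.6+ interpreter for its f-string.

```python
import string

name, age = 'Sue', 53

r1 = '%(name)s is %(age)s' % dict(name=name, age=age)                  # "%" expression (original)
r2 = '{name} is {age}'.format(name=name, age=age)                      # str.format method
r3 = string.Template('$name is $age').substitute(name=name, age=age)   # string.Template class
r4 = f'{name} is {age}'                                                # f-string literal (3.6+)

print(r1 == r2 == r3 == r4)   # → True: four spellings, one result
```

All four produce 'Sue is 53'—which is rather the point of the critique that follows.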

As usual, this new scheme is imagined to be simpler than those that preceded it, and is justified in part on grounds of similar tools in other programming languages—arguments so common to each new formatting tool that reading the new proposal's PEP is prone to elicit strong déjà vu.

In a tool as widely used as Python, neither special case nor personal preference should suffice to justify redundant extension. This proposal almost completely duplicates existing functionality, especially for Python programmers who know about vars()—a built-in which allows variables to be named by dictionary key in both the original formatting expression and the later formatting method, and suffices for the vast majority of interpolation-style use cases:

>>> name = 'Sue'
>>> age  = 53                                            # keys/values in vars()
>>> jobs = ['dev', 'mgr']                                

>>> '%(name)s is %(age)s and does %(jobs)s' % vars()     # expression: original
"Sue is 53 and does ['dev', 'mgr']"

>>> '{name} is {age} and does {jobs}'.format(**vars())   # method: later addition
"Sue is 53 and does ['dev', 'mgr']"

Though rationalized on grounds of other languages and obscure use cases, in truth the new f'...' scheme simply provides roughly equivalent functionality with roughly equivalent complexity, and is largely just another minor variation on the theme. Moreover, this proposal seems a red herring in general: realistic programs build data in larger structures—not individual variables—and are unlikely to rely on direct variable substitution in the first place. F-strings are at most a redundant solution for limited roles and artificial use cases.

Worst of all, the net effect of this proposal is to saddle Python users with four formatting techniques, when just one would suffice. The new approach adds more heft to the language without clear cause; increases the language's learning requirements for newcomers; and expands the size of the knowledge base needed to reuse or maintain 3.X code—even if you don't use it, you can't prevent others from doing so. Frankly, the str.format() method was already redundant; adding yet another alternative seems to be crossing over into the realm of the reckless and ridiculous.

If you're of like opinion, this page's author suggests registering a complaint with Python's core developers before 3.6 gels too fully to make this a moot point. The pace of change in the 3.X line need be only as absurd as its users allow.

2. Yet another string formatting scheme: i'...'?

Actually, the prior item's story gets worse. Since the f-string note above was written, a new Python 3.6 PEP has been hatched to add yet another special-case string form—the i-string, described as "general purpose string interpolation", and coded with a leading "i" (e.g., i"Message with {data}"); which is almost like the already accepted f-string above, new in 3.6 and coded with a leading "f" (e.g., f"Message with {data}"); but not exactly.

This page won't lend credence to this proposal by covering it further here; please see its PEP for details—and be sure to note its C#-based justification. It follows the sadly now-established Python tradition of bloating the language with new syntax for limited use cases which could be easily addressed by existing tools and a modest amount of programming knowledge. In this case, Python 3.6 is already expanding on itself in utero, sprouting new special-case tool atop new special-case tool.

This PEP is not yet accepted for 3.6 (and may not make the cut in the end), but if ever made official will bring the string formatting options count to a spectacularly redundant five. One might chalk this up to a bad April Fools' Day joke, but it's still February...

3. Windows launcher hokey pokey: defaults

Per its early documentation, Python 3.6 will change its "py" Windows launcher to default to an installed Python 3.X instead of a 2.X when no specific version is specified, in some contexts. For background and discussion on the change, see here and here. Here's how 3.6's What's New describes the change at this writing:

The py.exe launcher, when used interactively, no longer prefers Python 2
over Python 3 when the user doesn’t specify a version (via command line 
arguments or a config file).  Handling of shebang lines remains unchanged
- “python” refers to Python 2 in that case.

The launcher's prior policy of defaulting to 2.X—in place since 2012's Python 3.3—made little sense, given that the launcher was shipped with 3.X only. As this author pointed out 4 years ago in this article (and later in the book), users installing a Python 3.X almost certainly expected it to be the default version used by a launcher that comes with the 3.X install. Choosing 2.X meant that, by default, many 3.X scripts would fail immediately after a 3.X was installed. The remedy of setting an environment variable (or other) to force 3.X to be selected was less than ideal, and arguably no better than the case with no launcher at all.

The new 3.6 behavior improves on this in principle. Unfortunately, though, it seems both too little and too late—this is a backward-incompatible change that will complicate matters by imposing launcher defaults that vary per 3.X version. Worse, the default policy is unchanged for "#!" (a.k.a. "shebang") lines that name no specific version, leaving users with three rules to remember instead of one:

  1. In Python 3.3 through 3.5, non-"#!" version-agnostic launches prefer a 2.X
  2. In Python 3.6 and later, non-"#!" version-agnostic launches prefer a 3.X
  3. In all Python 3.X, "#!" version-agnostic launches prefer a 2.X

The former single rule—always preferring a 2.X if installed—may have been subpar, but it was certainly simpler to remember, and has become an expected norm widely used for the last 4 years. The 3.6 change's net result is to complicate the story for 3.X users; triple the work of 3.X documenters; and frustrate others tasked with supporting Python program launches on Windows across the 3.X line.

There was a time when convolution was an explicit anti-goal in the Python world. Alas, the methodology of perpetual change in Python 3.X today seems something more akin to a development hokey pokey (insert audio clip here).

4. Tk 8.6 comes to Mac OS X Python?

The Mac OS X version of Python from python.org may finally support version 8.6 of the Tk GUI library used by Python's tkinter module. This is welcome news, given the confused and tenuous state of tkinter on that platform in recent years. As it stands today, tkinter programs largely work on Macs, but require careful installation of an older Tk 8.5 from a commercially-oriented vendor, and exhibit minor but unfortunate defects that don't exist on Windows or Linux and can be addressed only by heroic workarounds when addressable at all—not exactly ideal for Python's standard portable GUI toolkit.

Hopefully, a Tk 8.6 port will address these concerns. With any luck, the Python 3.6 installer will also include Tk 8.6 on the Mac as it does on Windows, to finally resolve most version jumble issues. There is also a rumor that Tk 8.7 will support the full UTF-16 range of Unicode characters—including those beyond Tk's current UCS-2 BMP range—but this is a story for 2017 (or later) to tell. For now, Tk requires data sanitization if non-BMP characters may be present (scroll down to PyEdit's About emojis notebox for a prime example).
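Until then, such sanitization can be as simple as filtering out code points beyond the BMP before handing text to Tk. The following strip_non_bmp helper is a hypothetical sketch, not part of tkinter; real programs may prefer other replacement policies.

```python
def strip_non_bmp(text, replace='\ufffd'):
    # Replace code points above U+FFFF (outside Tk's UCS-2 BMP range)
    # with a substitute character before display in a Tk widget
    return ''.join(ch if ord(ch) <= 0xFFFF else replace for ch in text)

print(strip_non_bmp('spam\U0001F600eggs'))   # emoji swapped for U+FFFD
print(strip_non_bmp('plain ascii'))          # BMP-only text passes through
```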

Update: the Tk 8.6 update didn't happen—python.org's Mac Python 3.6 still links to and requires Tk 8.5, unfortunately. Maybe in 3.7? Till then, Homebrew Python might be an option for accessing more recent Tks... except that Homebrew Python+Tk is currently broken as this update is being written in June 2017. The Mac could really use more attention from open-source projects, especially given the increasingly-dark agendas in Windows.

5. Coding underscores in numbers

As a minor but useful extension, numeric literals in 3.6 will allow embedded underscores, to clarify digit grouping. For example, a "9999999999" can now be coded as "9_999_999_999" to help the reader parse the magnitude. Clever, perhaps, though use of this extension will naturally make your code incompatible with—and unable to run on—any other version of Python released in the last quarter century.
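To illustrate, the following sketch (variable names are this page's own) runs only on 3.6 and later; the underscores are ignored by the parser and don't change the values coded:

```python
# Underscore separators group digits without changing values (3.6+ only)
big   = 9_999_999_999      # same int as 9999999999
mask  = 0x_FF_FF           # allowed after base prefixes too
ratio = 1_000.000_1        # and in floating-point literals

print(big == 9999999999, mask == 0xFFFF, ratio == 1000.0001)   # → True True True
```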

You can read more about this change at its PEP. While you're there, be sure to note its prior-art rationale—yet another case of "other languages do it too" reasoning. This tired argument is both non-sequitur and bunk; other languages also have brace-delimited blocks, private declarations, common blocks, goto statements, and other oddments that Python seems highly unlikely to incorporate.

Wherever you might weigh in on this particular change, Python developers do seem to prefer following to leading today much more often than they should. Why the rush to make all languages the same? Python is Python, a tool which has been sufficiently interesting by itself to attract a broad user base; stop morphing it to mimic other tools!

6. Etcetera: the parade marches on

Regrettably, it now appears that Python 3.6 will go even further down the rabbit holes of asynchronous programming and type declarations begun in Python 3.5, with new obscure syntax for both asynchronous generators and comprehensions, and generalized variable type declarations. You can read about these changes in 3.6's What's New document, and their associated PEPs named there.

To be blunt: these explicitly-provisional extensions have the feel of student research projects, with terse documentation, absurd complexity, and highly limited audience. This is now the norm in 3.X—each new release sprouts wildly arcane "features" that reflect the whims of an inner circle of core developers, but make the language less approachable to everyone else. This pattern has grown tedious, and this writer is disinclined to document its latest byproduct cruft any further than their earlier 3.5 notes on this page here and here.

Instead, this page will close its 3.6 coverage by simply reiterating that Python 3.X is still both remarkably useful and arguably fun, if programmers are shrewd enough to stick with its large subset that does not add needless intellectual baggage to the software development task. As chronicled here, the leading edge of Python now sadly entails so much thrashing that its role as a rational basis for software projects is fairly debatable. Indeed, today's Python ironically stands charged with the same unwarranted complexity that its advocates once criticized in other tools.

But you don't have to use the new stuff. As always, keep it simple—both for yourself, and for others who will have to understand your programs in the future. In the end, your code's readability is still yours to decide.

As for Python's own tortured evolution, though... Frankenthon Lives!

Major Changes in Python 3.5 (September 2015)

[See the top of this page for an index to items in this section.]

Update: Python 3.5 is now officially released, and all the items previewed in this section wound up being added as described. The tense here should probably be changed from future to present, but the past is the past...

Python's next release, version 3.5, has been scheduled for mid-September 2015. It's shaping up to be a major set of language extensions—some of which are not backward compatible even within the 3.X line, and many of which cater to a narrow audience. This note is a work in progress and its most recent update reflects 3.5's beta preview releases as of August 2015, so take it with the usual grain of salt.

The official plans for 3.5 live here. In short, though, the major anticipated 3.5 language changes include the items in the following list. Among these, many are not without the usual controversy, most add to language heft, and three (#4, #5, and #7) break backward compatibility within the 3.X line itself.

1. Matrix multiplication operator: "@"

This Python will add a new "@" binary operator, which will perform matrix multiplication, formerly the realm of numeric libraries such as NumPy. This operator also comes with a "@=" augmented assignment form, and a new operator overloading method named "__matmul__" (along with the normal "r" and "i" method variants). By its detractors, the new "@" matrix multiplication has been called an overly niche tool that expands Python's complexity and learning curve needlessly, and may be too underpowered to be useful when applied to Python's native object types. You can read more about the proposal in its PEP.

Curiously, though, the "@" matrix multiplication operator won't be implemented by any built-in object types such as lists or tuples in 3.5. Instead, it is being added entirely for use by external, third-party libraries like NumPy. The latest revision of its PEP discusses this limitation, but here's the short story in 3.5.0 final:

C:\Code> py -3.5
>>> [1, 2] @ [3, 4]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for @: 'list' and 'list'

>>> [1, 2] @ 3
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for @: 'list' and 'int'

>>> [1, 2] * 3
[1, 2, 1, 2, 1, 2]     # this is still repetition, not multiplication

In other words, the Python core language has been extended with a new operator that is completely unused by the Python core language. There is nothing else quite like this in Python 3.X; it's as if syntax were being added solely for use by animation libraries or web toolkits which are not part of Python itself. While the new operator can be put to use for application-specific roles with operator overloading, it's pointless syntax and serves no purpose as shipped (except, perhaps, in unreasonably-cruel job interview questions).

Although numeric programming is clearly an important Python domain, the 3.5 "@" operator seems an excursion up the very slippery slope of application-specific language extensions—and leaves most Python users with an oddball expression in their language which means absolutely nothing.
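For completeness, here is how a library might actually put the hook to work: a class need only define "__matmul__" (plus the usual "r" and "i" variants if desired) to give "@" a meaning of its own. The Matrix class and its list-of-rows layout below are this page's own invention for illustration, not anything shipped with Python:

```python
class Matrix:
    # Minimal "@" overload: __matmul__ runs for Matrix @ Matrix (3.5+ only)
    def __init__(self, rows):
        self.rows = rows                       # list of row lists

    def __matmul__(self, other):
        # Classic row-by-column product; zip(*rows) yields columns
        return Matrix([[sum(a * b for a, b in zip(row, col))
                        for col in zip(*other.rows)]
                       for row in self.rows])

x = Matrix([[1, 2], [3, 4]])
y = Matrix([[5, 6], [7, 8]])
print((x @ y).rows)        # → [[19, 22], [43, 50]]
```

This is essentially what NumPy does on a grander scale—which is also the critique: only such third-party overloads give the operator any meaning at all.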


Update: as a fine point for language lawyers, it can be argued that ellipsis ("...") comes close in this department, both in heritage and pointlessness. That may be so in 2.X, but not in 3.X, the subject of this page. In Python 3.X, this term has been generalized to serve perfectly valid roles for all language users—as a placeholder object (like None) and a placeholder statement (like pass):

C:\Code> py -3
>>> tbdlist = [...] * 100
>>> def tbdfunc():
        ...
>>>
>>> tbdlist[-1]        # it's a placeholder object in 3.X
Ellipsis
>>> tbdfunc()          # it's a no-op statement in 3.X (see p390 in LP5E)
>>>

That is, ellipsis is no longer syntax used only by third-party libraries. Even so, this misses the whole point; one bad idea surely does not justify another!

2. Bytes string formatting: "%"

[This section was rewritten in full in July 2016]

Python 3.5 extends the "%" binary operator to perform text-string formatting for bytes objects—an operation formerly limited to str objects. It's suggested that this will aid migration of 2.X code, and in byte-oriented domains be a simpler alternative to existing tools such as concatenation and bytearray processing, or conversions to and from str.

On the other hand, extending "%" string formatting to bytes has been questioned on grounds of fundamental incompatibility of text and bytes in Python 3.X's type model: str objects represent decoded Unicode text as code points, while bytes objects represent raw binary byte values whose text encoding, if any, is unknown.

Because of these core differences, bytes and str cannot be mixed in most Python 3.X operations. Extending text-oriented formatting to bytes in 3.5 can be fairly described as a break with this dichotomy, and a throwback to 2.X's very different and ASCII-focused string model. This extension's motivation also seems on shaky ground: grafting 2.X's string semantics onto 3.X in the name of 2.X porting ease is akin to adding "private" declarations to simplify the translation of C++ programs. The combination dilutes and muddles 3.X's own semantics.

Let's see what this means in terms of code. In brief, "%" formatting is defined for str (i.e., text) strings in all 3.X, but only for str prior to 3.5—which makes sense, given that text in bytes is still encoded per any Unicode encoding, and text characters in str objects may map to multiple bytes in memory both when decoded and encoded:

C:\Code> py -3.3
>>> 'a %s parrot' % 'dead'             # for str in all 3.X: decoded Unicode text
'a dead parrot'                        # but not for bytes: text encoding unknown 

>>> b'a %s parrot' % 'dead'
TypeError: unsupported operand type(s) for %: 'bytes' and 'str'

>>> b'a %s parrot' % b'dead'
TypeError: unsupported operand type(s) for %: 'bytes' and 'bytes'
In 3.5 and later, "%" works on bytes strings too, but its behavior and requirements are type-specific, and subtly different for bytes than for str. For example, the general "%s" formatting code is retained by bytes for porting 2.X code, but it's just a synonym for a "%b" bytes substitution which always expects a bytes string and inserts its bytes—whether they are ASCII-encoded text or other:

C:\Code> py -3.5
>>> b'a %s parrot' % b'dead'                  # new in 3.5: bytes, %s == %b
b'a dead parrot'

>>> b'a %s parrot' % bytes([0xFF, 0xFE])      # works for non-ASCII bytes too
b'a \xff\xfe parrot'

>>> b'a %s parrot' % 'dead'                   # but %s (%b) allows bytes only!
TypeError: %b requires bytes,... not 'str'

>>> b'a %s parrot' % 'dead'.encode('ascii')   # manually encode str to bytes
b'a dead parrot'

Really, "%" isn't defined for bytes at all through 3.4 regardless of types and conversion codes—"%" requires and runs method "__mod__", which is missing in bytes before 3.5 (Python 2.X has "%" for both bytes and str, but only because its bytes is its str; "%" is also supported for its unicode type which maps to 3.X's str):

C:\Code> py -3.3
>>> '__mod__' in dir(str), '__mod__' in dir(bytes)    # never works through 3.4     
(True, False)

C:\Code> py -3.5
>>> '__mod__' in dir(str), '__mod__' in dir(bytes)    # sometimes works in 3.5+
(True, True)

The 3.5 extension seems to build on the preexisting but arguably confusing rule that allows bytes objects to be made from a plain text string: as long as a bytes literal contains only ASCII characters inside its quotes, a bytes object is created with an implicit encoding of the characters to their ASCII byte values. Without this all-ASCII constraint, there would be no way to map characters to single byte values. Bytes is still just bytes (a sequence of 8-bit values), but allows ASCII text and converts it to bytes in this special-case context only:

C:\Code> py -3.5
>>> b'a %s parrot' % 'dead'                # str characters don't map to bytes
TypeError: %b requires bytes,... not 'str'

>>> b'a %s parrot' % b'dead'               # but ASCII character bytes ok here?
b'a dead parrot'

>>> ('a %s parrot'.encode('ascii') %       # it's really doing this implicitly
...               'dead'.encode('ascii'))  # but ASCII seems too narrow in 3.X 
b'a dead parrot'

>>> b'a %b parrot' % bytes([0xFF])         # ditto for binary byte values (%b=%s)
b'a \xff parrot'

>>> 'a %b parrot'.encode('ascii') % bytes([0xFF])
b'a \xff parrot'

Surprisingly, numeric values can be inserted as either binary-value bytes or ASCII-encoded digit strings—the latter of which seems at odds with both bytes-based data and the much broader Unicode model of text. In the following, "%c" inserts a number's binary byte value from an int or 1-item bytes, but numeric codes like "%d" and "%X" expect a number and insert its ASCII digit string instead. In fact, numeric codes work for bytes as they do for str, but with an extra and implicit ASCII encoding for the result, as in the last example here:

C:\Code> py -3.5
>>> (b'a %c parrot' % 255), (b'a %c parrot' % b'\xFF')      # inserts byte values
(b'a \xff parrot', b'a \xff parrot')

>>> (b'a %d parrot' % 255), (b'a %d parrot' % b'\xFF'[0])   # inserts ASCII digits!
(b'a 255 parrot', b'a 255 parrot')

>>> (b'a %04X parrot' % 255), ('a %04X parrot' % 255).encode('ascii')   # ditto
(b'a 00FF parrot', b'a 00FF parrot')

While using ASCII for "%d" and "%X" may reflect some use cases, ASCII is an arbitrary choice in this context, and may be invalid for byte strings containing text encoded per other Unicode schemes. Bytes objects with UTF16-encoded text, for example, may require manual steps instead of ASCII-digits insertion. Still, this may be a moot point: it's impossible for the 3.5 bytes "%" operation to even recognize an embedded "%d" or any other format code unless the bytes object's content is ASCII-compatible in the first place:

C:\Code> py -3.5
>>> 'a %d parrot'.encode('ascii') % 255     # only an ASCII "%<code>" works!
b'a 255 parrot'

>>> 'a %d parrot'.encode('utf8') % 255      # utf8 is compatible; utf16 is not!
b'a 255 parrot'

>>> 'a %d parrot'.encode('utf16') % 255
ValueError: unsupported format character ' ' (0x0) at index 7

Because bytes don't carry information about text encoding, there is no way to detect any substitution format code such as "%d" unless it is in ASCII form. Hence: 3.5's bytes string "%" formatting works only for bytes objects containing ASCII-compatible text. This is where the extension seems to break down in full: in 3.X's Unicode world, encoded text must always be qualified with an encoding type, and ASCII is far too narrow an assumption. Trying to emulate 2.X's ASCII constraints in 3.X doesn't quite work, and leaves us with a semantic black hole:

C:\Code> py -3.5
>>> 'a %b parrot'.encode('latin1') % b'dead'     # ASCII-compatible text only!
b'a dead parrot'

>>> 'a %b parrot'.encode('utf16') % b'dead'
ValueError: unsupported format character ' ' (0x0) at index 7

>>> 'a %d parrot'.encode('utf16')
b'\xff\xfea\x00 \x00%\x00d\x00 \x00p\x00a\x00r\x00r\x00o\x00t\x00'

In fact, this extension's all-ASCII and 2.X-like assumptions can yield nonsensical results when applied in the context of 3.X's more general Unicode text paradigm. In the first part of the following, the ASCII-format-code and numeric-digit-insertion rules conspire to cause ASCII-encoded text to be inserted in UTF16-encoded text; in the second part, we wind up with UTF16 in ASCII, both implicitly and explicitly—the former of which seems especially error-prone, and all of which underscores the problems inherent in processing still-encoded text as text:

C:\Code> py -3.5
>>> s = ('a '.encode('utf16') + b'%d' + ' parrot'.encode('utf16')) % 255
>>> s
b'\xff\xfea\x00 \x00255\xff\xfe \x00p\x00a\x00r\x00r\x00o\x00t\x00'

>>> s.decode('utf16')
UnicodeDecodeError: 'utf-16-le' codec can't decode byte 0x00 in position 24:...

>>> b'a %s parrot' % 'dead'.encode('utf16')
b'a \xff\xfed\x00e\x00a\x00d\x00 parrot'

>>> 'a %b parrot'.encode('ascii') % 'dead'.encode('utf16')
b'a \xff\xfed\x00e\x00a\x00d\x00 parrot'

In the process of stretching and weakening 3.X's string model, this extension also manages to yield new special-case rules that seem sure to trip up programmers. Among them are the same-type requirements shown earlier (bytes requires bytes), and the very different behavior of the "%s" string substitution code in bytes and str—it inserts byte values for bytes but a print string for str, making string formatting an operation whose meaning now varies per string type:

C:\Code> py -3.5
>>> b'a %s parrot' % b'dead'          # %s inserts byte values for bytes
b'a dead parrot'

>>> 'a %s parrot' % b'dead'           # but a print string for str!
"a b'dead' parrot"

>>> b'a %s parrot' % bytes([0xFF])    # ditto for non-ASCII bytes
b'a \xff parrot'

>>> 'a %s parrot' % bytes([0xFF])     # % is now a type-specific operation!
"a b'\\xff' parrot"

In sum, 3.5's bytes string formatting has a strong ASCII orientation: it assumes ASCII in the subject bytes object's content; produces ASCII in digit strings for some conversion codes; and builds on the already-implicit ASCII encoding in bytes literals. To be sure, the only way text formatting can work for bytes at all is by limiting it to text encoded in trivial 8-bit schemes. The net result of this constraint, however, confuses 3.X's richer Unicode world with 2.X's ASCII-focused world, and adds new special-case rules in the bargain.

In all 3.X, text string formatting is intrinsically better suited to str objects—already-decoded Unicode text, whose original source encoding is no longer present, and whose content may include any characters in the Unicode universe. Formatting fails for bytes for the simple reason that its text is still encoded: there's no way to process encoded text correctly without allowing for its encoding, and restricting bytes to ASCII is dated, artificial, and extreme in Python's Unicode-aware line (run the following in IDLE if your Äs don't Ä):

C:\Code> py -3 
>>> spam = 'sp\xc4\u00c4\U000000c4m'   # text formatting is for text: decoded str
>>> spam                               # original Unicode encoding is irrelevant
'spÄÄÄm'
>>> 'ham, %s, and eggs' % spam
'ham, spÄÄÄm, and eggs'
 
>>> code = '%s'.encode('utf16')        # format codes: decoded Unicode text
>>> code                               # ASCII requirements don't apply to str
b'\xff\xfe%\x00s\x00'
>>> ('ham, ' + code.decode('utf16') + ', and eggs') % spam
'ham, spÄÄÄm, and eggs'

>>> 'Ä %d parrot' % 255                # digits: Unicode characters (code points)
'Ä 255 parrot'                         # not ASCII-encoded text: this is 3.X!
>>> 'Ä %04X parrot' % 255
'Ä 00FF parrot'
In the end, text code points are not bytes, and encoded text is not text; treating these as the same works only in a limited ASCII-based world, which no longer exists either in Python 3.X or the software field at large:
C:\Code> py -3 
>>> s = 'Ä %d parrot \U000003A3 ᛯ \u3494' % 255 
>>> s
'Ä 255 parrot Σ ᛯ 㒔'

>>> s.encode('utf8')  
b'\xc3\x84 255 parrot \xce\xa3 \xe1\x9b\xaf \xe3\x92\x94'
And if you're still unconvinced (and readers new to Unicode may be), consider this: even in the very rare cases where bytes formatting might be useful, all that this extension really saves is two essentially no-op method calls to decode from and encode to ASCII around a str formatting step—hardly a justification for its paradigm splitting:
C:\Code> py -3.5 
>>> b = b'the %s side of %04X'      # with the extension: ASCII implicit
>>> b % (b'bright', 255)
b'the bright side of 00FF'

>>> s = b.decode('ascii')           # without the extension: ASCII explicit
>>> s = s % ('bright', 255)         # just decode + use str % + encode
>>> s.encode('ascii')               # and this form works in all 3.X!
b'the bright side of 00FF'
Or simply use simpler tools: formatting is never strictly required, and part concatenation, substring replacement, and bytearray processing usually provide alternatives that—like the preceding example—make the ASCII assumption explicit (see EIBTI); do not complicate the language when existing tools suffice (see KISS); and are portable across all Python 3.X releases (see the last 8 years):
C:\Code> py -3 
>>> p1 = b'bright'                  # or KISS: these work in all 3.X too!
>>> p2 = '%04X' % 255

>>> b'the ' + p1 + b' side of ' + p2.encode('ascii')
b'the bright side of 00FF'

>>> b'the $1 side of $2'.replace(b'$1', p1).replace(b'$2', p2.encode('ascii'))
b'the bright side of 00FF'

>>> b = bytearray(b'the  side of ')
>>> b[4:4] = p1
>>> b.extend(p2.encode('ascii'))
>>> b
bytearray(b'the bright side of 00FF')

See the 3.5 formatting change's PEP for the full story on its behavior and rationale, which we'll cut short here. There may be valid use cases for binary data formatting (e.g., the PEP mentions byte-and-ASCII data streams like email and FTP), but it remains to be seen whether their prevalence justifies a change that blurs the text/binary dichotomy that is one of Python 3.X's hallmarks.

What is clear, though, is that this change comes with constraints and exceptions that seem complex enough to qualify as still-valid counter arguments—especially for an extension whose results can be easily produced with existing tools. Unfortunately, Python 3.X has a growing history of welcoming special-case solutions to tasks that could be solved with general programming techniques. While such solutions may appeal to a subset of Python's user base, they come at the expense of language learning curve at large.

3. Unpacking "*" generalizations

As covered in Learning Python, in Python 3.4 and earlier, the special *X and **X star syntax forms can appear in 3 places:

  1. In assignments, where a *X in the recipient collects unmatched items in a new list (3.X sequence assignments)
  2. In function headers, where the two forms collect unmatched positional and keyword arguments in a tuple and dict
  3. In function calls, where the two forms unpack iterables and dictionaries into individual items (arguments)

In Python 3.5, this star syntax will be generalized to also be usable within data structure literals—where it will unpack collections into individual items, much like its original use in function calls (#3 above). Specifically, the unpacking star syntax will be allowed to appear in the literals of lists, tuples, sets, and dictionaries, where it will unpack or "flatten" another object's contents in-place. For example, the following contexts will all unpack starred iterables or dictionaries:

[x, *iter]         # list:  unpack iter's items
(x, *iter, y)      # tuple: ditto (parentheses or not)
{*iter, x}         # set:   ditto (values unordered and unique)
{x:y, **dict}      # dict:  unpack dict's keys/values (rightmost duplicate key wins)
These are in addition to the star's original 3 roles in assignments and function headers and calls. Here is the new behavior in Python 3.5 and later:
C:\code> py -3.5
>>> x, y = [1, 2], (3, 4)
>>> z = [*x, 0, *y, *x]                    # unpack iterables
>>> z
[1, 2, 0, 3, 4, 1, 2]

>>> m = {'a': 1}
>>> n = {'b': 2, **m}                      # unpack dictionary
>>> n
{'a': 1, 'b': 2}
   
>>> n = {'b': 2, **{'b': 3}, **{'b': 4}}   # rightmost duplicate key wins
>>> n
{'b': 4}
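For completeness, the two literal forms not exercised in the session above—tuples and sets—behave the same way; a quick check:

```python
# Tuple and set literal unpacking in 3.5+: tuples keep position,
# sets drop duplicates and lose order, as usual.
x, y = [1, 2], (3, 4)

t = (0, *x, *y)        # tuple literal: parentheses optional
u = {*x, *y, 2}        # set literal: values unordered and unique

print(t)               # (0, 1, 2, 3, 4)
print(u)               # e.g. {1, 2, 3, 4}
```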
This change is imagined as a way of flattening structures that requires less code than traditional tools such as concatenation and method calls, and yields coding possibilities that some may consider clever. It remains to be seen, though, whether Python programmers perceive this as an academic curiosity with a still-limited and special-case scope, or adopt it as a broadly applicable tool.

As usual with such extensions, it's straightforward to achieve the same effects with other tools that have long been a standard part of the language, and are available to users of all recent Python versions. And also as usual, the new star syntax expands an already large set of redundancy in the language for the sake of stylistic preferences of a handful of proponents:

>>> x, y = [1, 2], (3, 4)
>>> z = x + [0] + list(y) + x              # unpack iterables -- without "*"
>>> z
[1, 2, 0, 3, 4, 1, 2]

>>> m = {'a': 1}
>>> n = {'b': 2}
>>> n.update(m)                            # unpack dictionary -- without '**'
>>> n
{'a': 1, 'b': 2}

>>> n = {'b': 2}
>>> n.update({'b': 3, 'b': 4})             # ditto
>>> n
{'b': 4}

The original proposal for this change also called for adding it to comprehensions:

[*iter for iter in x]     # unpacking in comprehensions: abandoned in 3.5
But this was dropped in 3.5 due to readability concerns (though the change in general may still raise an eyebrow or two). The change's proposed relaxation of ordering rules in function calls was also abandoned in the end due to lack of support. Python 3.5 does, however, allow multiple star unpackings in a single function call—syntax that is an error in 3.4 and earlier:
>>> print(1, *['spam'], *[4, 'U'], '!')
1 spam 4 U !
This proposal has been debated since 2008, was originally scheduled for Python 3.4 and later bumped to 3.5, and may yield more changes in the future. For more details, see Python 3.5's What's New document, or the change's PEP document.

On the upside, this is an extension to the core language itself, but not a change that is likely to break existing code. Still, its multiple-unpackings in function calls may have consequences for some function-processing tools. More fundamentally, this change overall seems to trade a minor bit of general code for obscure new syntax, in support of a very rare operation—a regrettably recurring theme in Python 3.X.

All opinions aside, such a change inevitably sacrifices language simplicity for special-case tools. While the jury is still out on this change, its consequences for both beginners and veterans should be a primary concern. To put that another way: unless you're willing to try explaining a new feature to people learning the language, you just shouldn't add it. Tickets to this one being put to that test would be well worth their price.

4. Type hints standardization [ahead]

The Python language may also adopt a standard syntax for type declarations in 3.5, using—and limiting—function annotations. This is so potentially major and controversial a development that it merits its own section ahead.

5. Coroutines: "async" and "await" [ahead]

The Python language may also adopt coroutines with "async" and "await" syntax in 3.5, for just one concurrent coding paradigm of limited scope. Like the prior item, this change is sufficiently broad and contentious to warrant its own section ahead.

6. Faster directory scans with "os.scandir()"?

[Spoiler: this story has evolved. Per the updates ahead, the os.scandir() gain initially noted here is platform-specific, and is actually a loss on Macs. Even where os.scandir() helps, its speedup can be fully matched—and perhaps beaten—by simply using os.stat()/lstat() directly instead of os.path.*() calls. Given that both schemes require similar changes to os.path.*()-based code, the os.stat()/lstat() solution seems better.]

Though not a language change per se, there will be a new "os.scandir()" call in the standard library, which is reported to be substantially faster than the longstanding and still-supported "os.listdir()", and will speed Python's "os.walk()" directory walker client by proxy. In a nutshell, the new call replaces name lists with an object-based API that retains listing state, thereby eliminating some system calls for attributes such as type, size, and modtime. For example, the traditional way to process directories is by names:

#!/usr/bin/python3.5
import os, sys
dirname = sys.argv[1]                      # command-line arg

for name in os.listdir(dirname):            # use name strings
    path = os.path.join(dirname, name)      # type, name, path, size, modtime
    if os.path.isfile(path):
        print(name, path, os.path.getsize(path), os.path.getmtime(path))
The new alternative in 3.5 produces the same results, but may cache information gleaned from initial listing and other system calls on a result object to save time (caution: "is_file" in the following requires parentheses—if used without them as though it were a property, you'll simply reference the method object, which is always true!):
#!/usr/bin/python3.5
import os, sys
dirname = sys.argv[1]                      # command-line arg

for dirent in os.scandir(dirname):          # use dirent objects
    if dirent.is_file():                    # type, name, path, size, modtime
        stat = dirent.stat()
        print(dirent.name, dirent.path, stat.st_size, stat.st_mtime)
This change adds useful functionality, rather than deprecating any, and seems a clear win—it claims to make "os.walk()" 8 to 9 times faster on Windows, and 2 to 3 times quicker on POSIX systems. Recoding some "os.listdir()" calls to use "os.scandir()" directly as above can yield similar speed improvements. Given that directory walks and listings pervade many programs, this change's benefits may be widespread. See the new call's PEP, benchmarks, and documentation.


Update 1: As an example use case, testing shows that the comparison phase of the mergeall directory tree synchronizer runs 5 to 10 times faster on Windows 7 and 10 with "os.scandir()". The savings is especially significant for large archives—runtime for a 78G target use case's comparison of 50k files in 3k folders fell from 40 to 7 seconds on a fast USB stick (6x), and from 112 to 16 seconds on a slower stick (7x). Also note that the "scandir()" call is standard in the "os" module in 3.5, but it can also be had for older Python releases, including 2.7 and older 3.X, via a PyPI package; mergeall uses either form if present, and falls back on the original "os.listdir()" scheme as a last resort for older Pythons. All of which seems proof that language improvement and backward compatibility are not necessarily mutually exclusive.


Update 2: Or not!—per mergeall 3.0's Feb-2017 release notes, Python 3.5's os.scandir() does indeed run faster than os.listdir() on both Windows (5X to 10X) and Linux (2X), but runs 2 to 3 times slower on Mac OS X, as the call is used by the mergeall program:

/Admin-Mergeall/kingston-savagex256g/feb-2-17$ diff \
        noopt1--mergeall-date170202-time091326.txt \ 
        opt2--mergeall-date170202-time092217.txt 
0a1
> Using Python 3.5+ os.scandir() optimized variant.
4053c4054
< Phase runtime: 5.286043012980372
---
> Phase runtime: 10.12333482701797

Hence, this call is an anti-optimization on Macs, and should generally not be used there, subject to your code's usage patterns. Alas, one platform's improvement may be another's regression!


Update 3: A final twist: in support of symbolic links, the non-scandir() version of mergeall's comparison-phase code was ultimately changed to use os.lstat() and the stat objects it returns, instead of os.path.*() calls. As a side effect, this made it as fast or faster than the scandir() variant on Windows too. The non-scandir() variant remained 2X quicker on Macs, and in fact improved slightly. Final numbers for mergeall 3.0's comparison phase on a 60k-file archive appear in that project's release notes.

Consequently, mergeall was able to drop the redundant and now-superfluous scandir()-based variant altogether, as it was both anti-optimization on Mac, and bested by stat-based code on Windows. This eliminated a major maintenance and testing overhead of prior releases.

In the end, scandir() now seems an extraneous tool. It can indeed speed programs that formerly used multiple os.path.*() calls on some platforms, but requires program changes no less extreme than os.stat()/lstat(). Moreover, it performs worse on Mac OS X, and elsewhere does no better and perhaps worse than programs coded to use stat objects directly. Given that programs must be changed or coded specially to use either scandir() or os.stat()/lstat(), the latter seems the more effective way to optimize cross-platform code.
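To make the comparison concrete, here is a minimal sketch of the stat-based alternative described above (a hypothetical listfiles() helper, not mergeall's actual code): each name gets a single os.lstat() call, whose result supplies type, size, and modtime together, instead of separate os.path.isfile(), getsize(), and getmtime() calls per name:

```python
# Stat-based directory scan: one os.lstat() system call per name,
# with type, size, and modtime all read from the cached stat object.
import os, stat

def listfiles(dirname):
    results = []
    for name in os.listdir(dirname):
        path = os.path.join(dirname, name)
        info = os.lstat(path)                  # one system call per name
        if stat.S_ISREG(info.st_mode):         # regular file? (not dir/link)
            results.append((name, info.st_size, info.st_mtime))
    return results

for name, size, mtime in listfiles('.'):
    print(name, size, mtime)
```

Unlike os.scandir(), this runs the same way on every platform and in every Python 3.X (and 2.X), which is the crux of the argument above.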

The scandir() call's internal use by os.walk() seems its only remaining justification—though os.walk() also could have simply used stat objects to achieve the same performance gain. While this is now officially hindsight, augmenting the os.path.*() calls' documentation to note the stat-based alternative's speed boost may have been shrewder than adding a redundant tool with similar coding requirements, but uneven and platform-dependent benefit.

7. Dropping ".pyo" bytecode files

Also in the non-language department: Python 3.5 may abandon ".pyo" optimized bytecode files altogether, instead naming optimized bytecode files specially with an "opt-" tag, which makes the file similarly distinct from non-optimized bytecode. For instance, after starting Python and importing a "mymod.py", the "__pycache__" subfolder's bytecode file content now varies between Python 3.3 and 3.5 (use a command line like [py -3.5 -OO -c "import mymod"] to run and import in a single step):

mymod.cpython-33.pyc          # from "py -3.3"
mymod.cpython-33.pyo          # from "py -3.3 -O" and "py -3.3 -OO"

mymod.cpython-35.pyc          # from "py -3.5"
mymod.cpython-35.opt-1.pyc    # from "py -3.5 -O"
mymod.cpython-35.opt-2.pyc    # from "py -3.5 -OO"
Though this is a CPython implementation-level change, it will likely have major backward incompatibility consequences—".pyo" files have been around forever, and are almost certainly assumed and used by very many tools. Moreover, the change may seem largely cosmetic on a first-level analysis—trading a name extension for a name tag. See the PEP for full details.
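One mitigation for affected tools: rather than hardcoding either naming scheme, bytecode file paths can be computed with the standard library's importlib.util.cache_from_source(), which grew an "optimization" parameter along with this change in 3.5. A small sketch:

```python
# Compute bytecode cache paths portably instead of assuming ".pyo":
# the optimization argument selects the 3.5+ "opt-" tagged variants.
import importlib.util

print(importlib.util.cache_from_source('mymod.py'))                   # plain .pyc
print(importlib.util.cache_from_source('mymod.py', optimization=2))   # .opt-2.pyc
```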

8. Windows install path changes

On Windows, Python 3.5.0 by default now installs itself in the user's normally hidden AppData folder, with a unique folder name for 32-bit installs, instead of using the former and simpler scheme. The impacts of this path change are compounded by the fact that Python's install folder is not added to the user's PATH setting by default. Especially for novices, these changes make common tasks such as studying the standard library and running "python" command lines more difficult, and invalidate much existing beginners documentation. In practice, many Python users won't even be able to find Python 3.5 on their machine after installing it.

You (and your code) can inspect the install path with "sys.executable". A default install of Python 3.4 lives at the following simple path, the scheme used for all recent Python 3.X and 2.X installs (if you're checking this live, Python displays "\\" as an escape for a "\"):

C:\Python34\python.exe            # original, through 3.4
By contrast, a 3.5 default install—selected by an immediate "Install Now"—winds up in one of the following much longer and normally hidden paths, depending on whether you run the 64- or 32-bit installer:
C:\Users\yourname\AppData\Local\Programs\Python\Python35\python.exe      # 64-bit
C:\Users\yourname\AppData\Local\Programs\Python\Python35-32\python.exe   # 32-bit
Technically, this reflects the 3.5 installer's single-user default. It's possible to use custom install options to choose different schemes. For instance, an all-users install—selected by "Custom installation", "Optional Features", and "Install for all users" in "Advanced Options"—stores Python in a shorter and non-hidden path (minor update: in Python 3.5.1, the "Python 3.5" in the following may become "Python35" in a token nod to consistency):
C:\Program Files\Python 3.5\python.exe           # 64-bit, 32-bit on recent Windows
C:\Program Files (x86)\Python 3.5\python.exe     # 32-bit on some machines
It's also possible to select any install path in "Advanced Options" (including that used in 3.4 and earlier), but this is 3 or 4 levels deep in the installer's screens. Realistically, because single-user installs without PATH settings are the default, they are also likely to be the norm for most people new to Python—the very group that will struggle most with hidden folders and paths.

The default install path was reportedly changed for security reasons on multiple-user machines, though this rationale is likely irrelevant to the vast majority of Python users. The new path isn't a problem for launching Python if you always use either filename associations, the Start button's menu, or "py" executable command lines (e.g., "py", "py -3.5 script.py"); in these contexts, paths and PATH settings are not required. But this implies multiple and platform-dependent launching techniques to learn and use, when "python" and PATH settings are more generic. Moreover, the use of hidden folders seems hardly in the spirit of open source; users should be encouraged to view Python's own code, not hindered.

There seems no ideal remedy for this change. Beginners are probably best advised to either perform a custom all-users install with the PATH setting enabled; or avoid "python" command lines, and change the folder view to show hidden files—and hence Python 3.5. See Python 3.5's Windows install docs for more details on the new installer's policies. As this change seems likely to provoke complaints, also watch for news on this front.
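As a small aid, any Python you do manage to launch—via "py", the Start menu, or file associations—can report its own hidden install location:

```python
# Ask a running Python where it lives; works on any platform.
import sys

print(sys.executable)    # full path to the running interpreter executable
print(sys.prefix)        # install folder: the standard library lives under here
```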


Update: The 32-bit Windows installer for the new Pillow 3.0.0 imaging library appears to be broken on Python 3.5.0 as well, and the new Python installer and its new install paths seem prime suspects, though there are already bug reports regarding the installer's system settings (as well as posts on forums from confused users unable to find Python 3.5 after an install...).

9. tkinter and Tk 8.6: +PNGs and file dialogs, -color names

Python 3.5 supports Tk 8.6—the latest version of the GUI library underlying tkinter—and provides it automatically with the standard Windows installer at python.org. Tk 8.6 was first adopted this way in Python 3.4, so most of this note applies to that release too, though some installs of both Python 3.4 and 3.5 may use older Tks (e.g., Mac OS X); check your Tk version with "tkinter.TkVersion" after importing tkinter to see if this note applies to you. While largely compatible, Tk 8.6 has some noteworthy changes.

In the plus column, Tk 8.6's native PhotoImage object now supports PNG images in addition to GIF and PPM/PGM, making some installs of the Pillow (a.k.a. PIL) image library for Python unnecessary for programs that simply display images. On the other hand, Pillow does much more than display; Tk 8.6 still lacks JPEG and TIFF support; and PNGs without Pillow are naturally available only for users of Tk 8.6+ (e.g., Python 3.4+ on Windows). As an example, the latest release of the frigcal calendar GUI leverages this to display PNG month images in Pythons using Tk 8.6 and later.

Also an improvement, the version of Tk 8.6 used by the standard Windows install in Python 3.5 (but not 3.4) now uses true native file and folder dialogs on Windows. For example, the folder dialog has changed from Python 3.4 to Python 3.5. Basic file open and save dialogs on Windows have also morphed and gone more native in Python 3.5; run programs live for a look, and see the Tk change note for more details.

In the minus column, Tk 8.6 changes some color name meanings to conform to a Web standard, oddly abandoning those used for the last 25 years. This makes some color names render differently, and often too dark to be used as label backgrounds. Specifically, "green", "purple", "maroon", and "grey/gray" are now much darker than before. You must use "medium purple" for the former "purple", "silver" for what was "gray", and "lime" to get the prior "green"—though "silver" and "lime" don't work in prior releases, making these colors now platform-specific settings! The best advice here: to make your code immune to such breakages and portable across Pythons, use hex "#RRGGBB" strings instead of names for colors in tkinter. For background details, see the Tk change proposal, and this Python issue tracker post.
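For instance, a program might centralize its colors in a table of hex strings along these lines (a sketch: the names and values here are illustrative X11/Web color codes, not from any particular program):

```python
# Hex "#RRGGBB" color strings render identically under every Tk
# version, unlike names such as "green" or "gray" whose meanings
# changed in Tk 8.6. Values are standard X11/Web codes.
COLORS = {
    'bright green': '#00FF00',   # old Tk "green"; "lime" in Tk 8.6+
    'light gray':   '#C0C0C0',   # "silver" in Tk 8.6+; near the old "gray"
    'dark green':   '#008000',   # what "green" now means in Tk 8.6+
    'dark gray':    '#808080',   # what "gray" now means in Tk 8.6+
}

# Usage sketch: tkinter.Label(root, bg=COLORS['bright green'], text='spam')
```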


Update: per a report from a frigcal user on Mac OS X, a Python version number does not always imply a Tk version number outside the standard Windows install. Specifically, the C language code of Python 3.5's tkinter module gives just this constraint:

    Only Tcl/Tk 8.4 and later are supported.  Older versions are not supported. 
    Use Python 3.4 or older if you cannot upgrade your Tcl/Tk libraries.
In the frigcal user's case, Python 3.5 on a Mac was using Tk 8.5, not 8.6. Hence, this note applies to Tk version, not necessarily Python version, and has been edited accordingly.

10. Tk 8.6 regression: PyEdit GUI Grep crashes in Python 3.5

[Preface: this note has evolved over 6 months from an initial description, to a first update that proposed a cause that proved irrelevant, and a final update that describes a workaround. In the end, it was necessary to replace threads with processes to sidestep the crash altogether. This is a long but representative tale of realistic programming in action; read it linearly for full dramatic effect...]

(Aug-2016) Now for the worse news on Tk 8.6. It appears that the version of the Tk GUI library shipped with the standard version of Python 3.5 (and perhaps 3.4) for Windows has sprouted a new and serious bug related to threading. This may or may not be fixed in future Tk versions shipped with future Pythons, but it can lead to pseudo-random crashes in formerly-working Python tkinter GUI code when run on standard Python 3.5 on Windows, and Pythons using Tk 8.6 anywhere.

This issue was observed on Windows in the "Grep" external file/folder search tool of the text-editor program PyEdit—a major example in the book PP4E. Specifically, this tool spawns a producer thread to collect matching files and lines and post them on a queue, while the main GUI thread watches for the result to appear on the queue in a timer loop. Per the usual coding rules, the producer thread does nothing GUI-related; all window construction and destruction happens in the main GUI thread only.
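For readers unfamiliar with the pattern, here is a minimal, non-GUI sketch of that structure, with hypothetical names; in PyEdit itself the poll step runs in the main GUI thread via a tkinter after() timer loop, and only that thread touches widgets:

```python
# Producer thread posts its result on a queue; the consumer polls the
# queue without blocking, standing in for a GUI timer loop.
import queue, threading, time

def grep_producer(resultqueue):
    matches = ['file1.py: match', 'file2.py: match']   # stand-in for the search
    resultqueue.put(matches)                           # no GUI calls here

q = queue.Queue()
threading.Thread(target=grep_producer, args=(q,)).start()

while True:                        # stand-in for widget.after(50, poll)
    try:
        result = q.get(block=False)
    except queue.Empty:
        time.sleep(0.05)           # the GUI stays responsive between polls
    else:
        print('matches:', result)  # safe to update widgets here
        break
```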

This tool worked without flaw since Python 3.1, but started experiencing random crashes under Python 3.5. When using 3.5 on Windows, the program's console records the following strange Tcl/Tk error message just before a hard crash that kills the entire PyEdit process, causing any unsaved file changes to be silently lost:

Tcl_AsyncDelete: async handler deleted by the wrong thread

The vast majority of PyEdit still works correctly on Python 3.5 and Tk 8.6, but "Grep" usage is prone to fail this way sporadically but eventually. This issue is still being explored, but here are the details so far:

So far, the crash is known to occur in the Tcl/Tk 8.6 library shipped with Python 3.5 for Windows, and used by Python on some other platforms. Python 3.4 uses this same Tk, but it's not yet known if the crash occurs in 3.4 too; if not, Python 3.5's Tk interface (or threading) code is suspect. Further findings will be posted here as they appear.

Unfortunately, there is no clear fix for this problem. The best advice for PyEdit users is to: avoid using "Grep" in Python 3.5 (which reduces utility); run PyEdit under Python 3.3 or earlier (which a "#!python3.3" line can force on Windows); use an older Tk (which is easier on some platforms than others); or hope that the issue will be repaired in a future Python and Tk. For the present, though, an apparent bug introduced in Tk may have impacted very many down-stream programs, products, and users.

All of which should also serve as a lesson to the prudent reader about the darker side of the "batteries included" paradigm of Python and open source in general. External code can be a great time-saver when it works, and it often does. But the more your program depends on such code, the more likely it is to be crippled by a regression in an underlying layer over which you have no control—a potential for disaster which is only compounded by the rapid change inherent in open source projects. Unless you are able to stick with older working versions for all time, you should be aware that software "batteries" can also be your project's weakest link.


Update 1, Sep-2016 (This update's proposal later proved irrelevant, as described ahead.) On further investigation, it appears that the Tk 8.6 thread crash described above may be triggered by the insertion of a pathologically-long line of text into a listbox. The crash is roughly reproducible for a folder having a "Grep" result line that is 423k characters long—a saved web page that is presumably trying to hide something:

>>> lens = []
>>> for line in open(r'the-crashing-folder\the-offending-file.html', encoding='utf8'):
...     lens.append(len(line))
...
>>> lens.sort()
>>> lens
[1, 1, 1, 16, 34, 41, 44, 54, 59, 79, 81, 82, 98, 100, 357, 1054, 1754, 8950, 423556]

Tk has historically had issues with long text lines. Per the new theory, the main GUI thread adds this absurdly-long line to the results listbox, which in turn somehow causes the bizarre Tk abort on most tries. In other words, this may not be a broad and general threading bug in Tk, because it may require a very specific trigger. The fact that other programs using the same queue-based threading code structure work correctly in Tk 8.6 adds support to this explanation (see PyMailGUI and mergeall for examples).

On the other hand, this crash is still a Tk regression, and is still related to Tk's threading model.

Further clarity on this crash will have to await Tk and/or Python developers. Regardless of that outcome, though, this remains a problem caused by upgrading to a new Python—a dire but common pattern, and an example of the tradeoffs inherent in reliance upon constantly-changing, community-developed software. In this case, the "batteries" gone bad are a fundamental Python GUI toolkit used by very many systems, which cannot be replaced without costly redevelopment. The net result for PyEdit is that a "Grep" in the latest Pythons is a use-at-your-own-risk proposition.


Update 2, Jan-2017 This crash was eventually isolated to be triggered by the fully non-GUI code of the grep's spawned file searcher, strongly suggesting that this is indeed a random thread bug in the underlying Python 3.5/Tk 8.6 combination, and may be best addressed by a non-threaded coding alternative of the sort described here.

The proposed long-line explanation in the first update above was addressed in code but proved irrelevant, because the crash was eventually observed to occur before the GUI's timer-loop polling consumer ever received the queued result. Moreover, recodings to both avoid uncaught exceptions in the thread and explicitly close input files in all cases also appeared to have no effect. The former should not impact the main GUI thread; the latter should have happened automatically in CPython, and should not trigger a hard and chaotic crash in either Python or Tk in any event.

Though evidence is still scant—as it's prone to be in a fiendishly random and brutally dormant crash—Python's threading module now seems a prime suspect, given that the simpler _thread module has been long used without any such issue in the PyMailGUI program. The higher-level threading module adds substantial administrative code for features unused by PyEdit's grep (and others), which could conceivably interact poorly with Tk/tkinter's event loop or threading.

In the end, a workaround was coded in PyEdit to completely sidestep the issue, by using the multiprocessing module's processes, instead of threading module's threads. This proved a workable solution, given the simple list of strings passed from the non-GUI producer to GUI consumer; unlike PyMailGUI, PyEdit's grep does not pass unpickleable objects, such as bound method callbacks, that require the full shared memory state of threads. Most importantly, by removing the threading variable in this bug's equation, multiprocessing removes the guesswork of other theories.

As an added bonus, multiprocessing also may run faster, because it allows grep tasks to better leverage the power of multicore CPUs. On one Windows test machine, each grep process receives its own 13% slice of the CPU, while grep threads receive just a portion of the single process's 13% allocation. The net effect is that N parallel greps can run roughly N times faster when they are processes. The story is similar on Mac OS X: processes can consume substantially more CPU time than threads, and finish noticeably faster.

Readers interested in the workaround can find it in the newly released version 2.2 of PyEdit. As explained at that page, PyEdit is currently shipped as part of the standalone PyMailGUI release. The crucial code of the fix, which allows testing all three spawn options, is in the following snippet. The thread crash was never observed outside Windows, but the multiprocessing module's portability to Windows, Linux, and Mac OS X makes this workaround a cross-platform solution.

The multiprocessing module's chief downside seems to be its implications for frozen executables, but that, sports fans, is a tale for another day.

class TextEditor:
    ...
    def onDoGrep(self, dirname, filenamepatt, grepkey, encoding):
        ...
        # start the non-GUI producer thread or process [2.2]
        spawnMode = configs.get('grepSpawnMode') or 'multiprocessing'
        grepargs = (filenamepatt, dirname, grepkey, encoding)

        if spawnMode == '_thread':
            # basic thread module (used in pymailgui with no crashes)
            myqueue = queue.Queue()
            grepargs += (myqueue,)
            _thread.start_new_thread(grepThreadProducer, grepargs)

        elif spawnMode == 'threading':
            # enhanced thread module (original coding: crashes?)
            myqueue = queue.Queue()
            grepargs += (myqueue,)
            threading.Thread(target=grepThreadProducer, args=grepargs).start()

        elif spawnMode == 'multiprocessing':
            # thread-like processes module (slower startup, faster overall?)
            myqueue = multiprocessing.Queue()
            grepargs += (myqueue,)
            multiprocessing.Process(target=grepThreadProducer, args=grepargs).start()
        else:
            assert False, 'bad grepSpawnMode setting'

        # start the GUI consumer polling loop
        self.grepThreadConsumer(grepkey, filenamepatt, encoding, myqueue, mypopup)

11. File loads appear to be faster (TBD)

When running the frigcal calendar GUI program, the time required to initially load calendar files has been substantially reduced in Python 3.5—from 5 seconds to 3 in one use case, and from 13 seconds to 8 in another. It's not yet clear where the improvement lies, as the load must both read ".ics" calendar files and parse and index their contents. Either way, 3.5 has clearly optimized a common task—another great example of how a programming language can be improved without breaking existing code.

12. The docs are broken on Windows and incomplete

The Python standard manuals included with the Python 3.5.0 Windows installer have issues when some items are selected. For example, on this author's most-used computer, selecting the coroutine documentation in the What's New section yields this politically grievous failure. Its error message complains about browsers (despite the fact that the Windows 7 host machine's default browser is the latest version of the popular and open-source Firefox), and seems to express a bias for Microsoft and Google (though the Microsoft link simply disses IE 6, which isn't even installed on the machine, and the Google link fails completely).

On top of that, the docs have JavaScript errors (which may or may not trigger the browser confusion), and don't fully integrate the new 3.5 coroutine changes (async, for instance, is absent in the function definitions section of compound statements). Glitches happen, of course, and these may be repaired. But between the bugs and the omissions, this really seems half-baked; if you're going to change things radically, you should at least document your work fully.

13. Socket sendall() and smtplib: timeouts

The sendall() call in Python's standard library socket module has subtly changed the interpretation of timeouts as of 3.5, as noted here. In short, the limit given by a timeout's value is now applied to a full sendall() operation, rather than to each individual data transfer executed during the operation. In some contexts this may mean that timeout values must be increased for the new semantics. As an example, because Python's own smtplib module, used for sending email, internally uses socket.sendall() to transfer the full contents of an email message, some email clients may require higher timeout values in 3.5 and later.
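A minimal sketch of the remedy, with a hypothetical server name for illustration: the timeout passed to smtplib now bounds each full message transfer rather than each individual socket send, so clients sending large messages may need to raise it in 3.5 and later.

```python
import smtplib

# Construct without a host so no connection is attempted here;
# 'smtp.example.com' below is a placeholder, not a real server.
server = smtplib.SMTP(timeout=120)    # as of 3.5: 120s per full transfer
print(server.timeout)                 # 120

# server.connect('smtp.example.com', 587)   # the timeout applies here...
# server.sendmail(...)                       # ...and to each whole sendall()
```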

For more details on this 3.5 change, see the PyMailGUI email client's related change log entry. A change to behavior that has stood for many years should come with a very strong rationale, especially when the change impacts existing code; you'll have to judge whether this one passes the test.

14. Windows installer drops support for XP

Beginning with 3.5, the Python Windows installer no longer supports Windows XP. If you want to use Python 3.X on an XP machine, you must use Python 3.4 or earlier. This is despite the very wide usage that XP still enjoys—including, reportedly, half the computers in China as of 2014. Per Python's PEP 11, Python will follow Microsoft's lead on the issue, and support only Windows versions that Microsoft still does. To quote the PEP: "A new [Python] feature release X.Y.0 will support all Windows releases whose [Microsoft] extended support phase is not yet expired". So much for open source being free from the whims of commercially-interested vendors...


To draw your own conclusions on these and other Python 3.5 changes, watch the What's New and emerging documentation. The next two sections give fuller treatment to two of the more controversial 3.5 extensions.

Why proposed type declarations in Python 3.5 are a bad idea

Python 3.5 may adopt a standard syntax for type "hints" (a.k.a. optional declarations), using 3.X function annotations. There is no new syntax in this proposal and no type checker per se—just a use case for existing 3.X annotations that precludes all others, and a new "typing" standard library module which provides a collection of metaclass-based type definitions for use by tools. The module would be introduced in Python 3.5, and annotation role limitations would be mandated over time. You can read about this change at its PEP. In brief, the proposal standardizes annotating function arguments and results with either core or generic types:

def spam(address: str) -> str:                 # core types
    return 'mailto:' + address

from typing import Iterable                    # new module's types
from functools import reduce

def product(vals: Iterable[int]) -> int:
    return reduce(lambda x, y: x * y, vals)

Interestingly, parts of this proposal are similar to this type-testing decorator, derived from a similar example in Learning Python, 5th Edition. As stated in that book, a major drawback of using annotations this way is that they then cannot be used for any other role. That is, although annotations are a general tool, they directly support just a single purpose per appearance, and hence will be usable only for type hints if this proposal's model becomes common in code. In contrast, a decorator-based solution would support multiple roles, by both choice and nesting 1.
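The single-role limitation is easy to see in code: a function has just one __annotations__ dictionary, which type hints fully consume, leaving nowhere to record a second meaning. The snippet below (a quick demonstration, reusing the spam example above) also shows that the hints do nothing at run time:

```python
def spam(address: str) -> str:
    # annotations are simply recorded, never checked or enforced
    return 'mailto:' + address

print(sorted(spam.__annotations__))            # ['address', 'return']
print(spam.__annotations__['address'] is str)  # True
print(spam('you@example.com'))                 # mailto:you@example.com
```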

Python developers seem to be aware of this problem—in fact, they propose to deal with it by simply deprecating all other uses of annotations in the future. In other words, their proposed solution is to oddly constrain a formerly general tool to a single use case. The net effect is to introduce yet another incompatibility in the 3.X line; take away prior 3.X functionality that has been available for some 7 years; and rudely break existing 3.X code for the sake of a very subjective extension. Anyone who is using annotations for other purposes is out of luck; they will have to change their code in the future if it must ever run on newer Pythons—a task with potentially substantial costs created by people with no stake in others' projects.

This seems a regrettably common pattern in the 3.X line; its users must certainly be learning by now that it is a constantly moving target, whose evolution is shaped more by its few developers than its many users. For more on this decision, see this section of the PEP.

Thrashing (and rudeness) aside, the larger problem with this proposal's extensions is that many programmers will code them—either in imitation of other languages they already know, or in a misguided effort to prove that they are more clever than others. It happens. Over time, type declarations will start appearing commonly in examples on the web, and in Python's own standard library. This has the effect of making a supposedly optional feature a mandatory topic for every Python programmer. This is exactly what happened with other advanced tools that were once billed as optional, such as metaclasses, descriptors, super(), and decorators; they are now required reading for every Python learner.

In this case, the proliferation of type declarations threatens to break the flexibility provided by Python's dynamic typing—a core property and key benefit of the language. As stated often in the book, Python programming is about compatible interfaces, not type restrictions; by not constraining types, code applies to more contexts of use. Moreover, Python's run-time error checking already detects interface incompatibility, making manual type tests redundant and useless for most Python code. See the decorator mentioned earlier for concrete examples; manual type tests usually just duplicate work Python does automatically.

In Python, constraining types limits code applicability and adds needless complexity. Worse, it contradicts the very source of most of the flexibility in the language. This proposal will likely escalate these mistakes to best practice. We'd be left with a dynamically-typed language that strongly encourages its users to code type declarations, a combination that will seem a paradox to some, and pointless to others.

If you care about keeping Python what it is, please express an opinion in the developers' forums 2. If we let this one sneak in, anyone who must upgrade to a new Python 3.X release in the future will likely find themselves with a language whose learning curve and de facto practice are growing to be no simpler—and perhaps more complex—than those of more efficient alternatives like C and C++.

Let's stop doing that. Needless change doesn't make a language relevant; it makes it unusable (see Perl 6).


Footnotes:

1 For an example of existing practice that employs multiple-role decorators instead of single-use annotations for optional type declarations, see the Numba system. This system actually does something with the declarations—allowing numerically-oriented functions to be compiled to efficient machine code—without imposing the model on every Python user.

2 This change was adopted in 3.5 (which is hardly surprising, given its source), but a well-reasoned critique of deprecating other roles for annotations was posted on the python-dev list in October, 2015—read the thread here. This post was greeted with a curiously defensive us-versus-them dismissal, and a suggestion to raise the complaint again after people start complaining. Full points to readers who spot the logic problems there.

The 3.X sandbox saga continues: 3.5 coroutines with "async" and "await"

In support of the cooperatively-concurrent paradigm of coroutines, this large batch of changes, adopted fairly late in the 3.5 cycle, introduces new syntax for "def", "for", and "with"; an entirely new "await" expression; and two new reserved words, "async" and "await", that will be in a strange new soft keyword category and phased in over time.

This proposal is the latest installment in the volatile and scantly-read tale of Python generators, whose backstory is told here. Python has long supported interleaved execution of coroutines in both 2.X and 3.X with "yield" generator functions and task switchers, as demonstrated in simple terms by this code. Python 3.5 expands on this, and earlier 3.X additions, to convolute the model further with extensions that will be available only to code run on 3.5 and later.

In short, new syntax in "def" declares a native coroutine (as opposed to former generator-based coroutines), and the new "await" expression suspends execution of the coroutine function until the awaitable object completes and returns results (similar in spirit to the "yield from" available only in 3.3 and later). In code:

async def processrows(db):
    ...
    data = await db.fetch(querystring)            # suspend and wait
    ...

This can be used in conjunction with event loops, of the sort afforded by the asyncio module added recently in Python 3.4. Former generator-based coroutines coded with "yield" can interoperate with the new native coroutines coded with "async" and "await", by using a new "@types.coroutine" decorator.
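A minimal, runnable sketch of the new model under asyncio (names and delays here are purely illustrative): two native coroutines suspend at their await points, letting a single-threaded event loop interleave them.

```python
import asyncio

async def fetch(name, delay):
    # 'await' suspends this coroutine without blocking the event loop
    await asyncio.sleep(delay)
    return name + ' done'

async def main():
    # run both coroutines concurrently on one thread
    return await asyncio.gather(fetch('A', 0.01), fetch('B', 0.02))

loop = asyncio.new_event_loop()
try:
    print(loop.run_until_complete(main()))   # ['A done', 'B done']
finally:
    loop.close()
```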

And yes, you read that right: there are now two different and incompatible flavors of coroutines in Python 3.X—generator and native, with subtly divergent semantics. Moreover, coroutines themselves are a very old idea; come with well-known and inherent constraints on code (it generally cannot monopolize the CPU, as the multitasking model is nonpreemptive); and are just one of a variety of ways to avoid blocking states—along with tried-and-true options like threads, which many would consider more general, and which have no special syntax in Python.

Two additional syntax extensions—for "with" and "for"—are also part of this proposal. Presumably, they are the reason developers opted for new "def" syntax instead of a simpler built-in "async" function used as a decorator (though "C# has it" is also strangely given as an explicit basis in the PEP; more on this ahead). In the abstract:

async with expr as var:           # async context managers
    suite

async for target in iter:         # async iteration loops
    suite
else:
    suite2

In addition, there is a set of new "__a*__" special method names for use by classes that wish to tap into the new syntax's machinery, a draconian list of special cases for programmers to remember—and a whole lot more.
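To illustrate the new protocol, here is a minimal async iterator (a sketch, not taken from the PEP): its __anext__ is itself a coroutine, so each step may await before producing a value, and exhaustion is signaled with the new StopAsyncIteration exception.

```python
import asyncio

class Countdown:
    # taps into 'async for' via the new __aiter__/__anext__ methods
    def __init__(self, start):
        self.n = start
    def __aiter__(self):
        return self
    async def __anext__(self):
        if self.n <= 0:
            raise StopAsyncIteration        # ends the 'async for' loop
        self.n -= 1
        await asyncio.sleep(0)              # cede control to the event loop
        return self.n + 1

async def collect():
    result = []
    async for i in Countdown(3):
        result.append(i)
    return result

loop = asyncio.new_event_loop()
print(loop.run_until_complete(collect()))   # [3, 2, 1]
loop.close()
```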

As this extension seems likely of interest to only a very small fraction of the Python user base, this page won't cover its technical components any further. Please see the PEP on python.org, especially its networking example; its perhaps inadvertent illustration of the proposal's redundancy; and its full changes list. Python 3.5's standard docs are still a bit lacking on this front, but may improve with time.

This extension's non-technical aspects, though, deserve all Python users' attention: this change adds yet another layer of complexity to the Python language without clearly valid cause. It seems another example of language design gone bad, on three grounds:

The last point has become an unfortunate pattern. This specific extension seems to have been lifted almost verbatim from other languages including C#/VB and PHP/Hack, so much so that those languages' documentation applies to Python. In the Python world, core developer focus seems to have shifted quickly from functional programming to type declarations and coroutines; where it moves next depends not on users, but only on who shows up to change it—and what other language they wish to imitate.

There is still a large subset of Python 3.X which is quite usable for applications development. See the programs here for typical examples; if programmers have enough common sense to stick to this subset, Python is what it always has been. However, Python 3.X also resolutely continues to be a bloated sandbox of often-borrowed ideas, where each current set of people with time to kill and an academic concept to promote inserts changes based entirely on personal interest, changes that inevitably become required knowledge for all.

The end result of this constant-change model is that those who have used Python for decades can now be given a piece of Python 3.X code to use or maintain, and have absolutely no clue what it means. That's even true of programmers who have used only former releases in the 3.X line. This may boost the egos of a small number of individuals responsible for a change—and ultimately might just serve to maintain the power of a self-sustaining inner circle possessing the latest obscure "special" knowledge.

But it's not necessarily good engineering.

Major Changes in Python 3.4 (March 2014)

[See the top of this page for an index to items in this section.]

The following sections provide brief looks at noteworthy Python 3.4 changes. In terms of language changes, Python 3.4 proved to be a fairly minor release, though the later 3.5 resumed the 3.X rapid-change paradigm.

Update: see also the new Python Pocket Reference, 5th edition, revised for Python 3.4 and 2.7, and available January 2014. It covers some 3.4 topics discussed here.

1. New package installation model: pip

Shortly after this edition was published, Python's core developers finally settled on pip as the officially sanctioned system for installing 3rd-party Python packages, with setuptools and other 3rd-party systems recommended for package creation. These are envisioned as superseding the longstanding distutils. Specifically (and currently):

  1. Python 3.4 makes the externally maintained pip system available automatically, and comes with updated installation manuals that reference it almost exclusively.

  2. Pythons 2.7 and 3.3 are to have updated installation manuals that recommend pip instead of distutils, but will not ship with pip included. This step seems still pending; as this note is being written in July 2014, 2.7's manuals reference distutils.

  3. Package creation tools such as setuptools are in the 3rd-party domain and not shipped with Python, but can be installed with pip.

  4. The formerly sanctioned—and widely used—distutils will still be available in the standard library of all Pythons.

This is a tools issue, and not related to the language or its standard library directly. As such, it has little impact on the book, except for the reference to distutils—and its impending deprecation in favor of a then-named "distutils2"—in the sidebar on the subject on pages 684-685, and a brief reference on pages 731, 1159, and 1455. The book correctly predicted distutils' demise, but could not foresee pip.

For more details on pip and this change, see the install manuals for Python 3.4 or later (and perhaps 2.7), as well as the change's PEP and recommended tools lists:

Of note: unlike the formerly standard distutils, pip is not in the standard library, but is a separate system bundled with Python as of 3.4 only. It must be retrieved and installed by the user in Python 2.7 or 3.3, which makes its uptake uncertain. Or, to quote from the new Python 3.4 install documentation developed by a new group with the curious (and perhaps even Orwellian!) title "Python Packaging Authority":

* pip is the preferred installer program. Starting with Python 3.4, it is
included by default with the Python binary installers.

* distutils is the original build and distribution system first added to
the Python standard library in 1998. While direct use of distutils is 
being phased out, it still laid the foundation for the current packaging
and distribution infrastructure, and it not only remains part of the 
standard library, but its name lives on in other ways (such as the name
of the mailing list used to coordinate Python packaging standards 
development).

In other words, the net effect is yet another new and redundant way to achieve a goal—and a potential doubling of the knowledge requirements in this domain. Improvements are warranted and welcome, of course. Regrettably, though, change in open source projects is often more focused on the personal preferences of a few developers, than on current user base or complexity growth. As always, the merits of such mutations are best decided by practice in the larger Python world.

2. Unpacking "*" generalizations? (postponed to 3.5)

After being debated since 2008, a subset of the unpacking "*" syntax generalization originally slated for release 3.4 was finally implemented in 3.5—albeit with a limited scope to make it a bit more palatable.

Namely, the unpacking stars ("*") will work in function calls as before, and in data-structure literals with this change, but not in comprehensions due to readability concerns. Moreover, the full proposed relaxation of ordering rules in function calls was also abandoned in the end due to lack of support. Stars also still work to collect items in function headers and assignments as before.

The original description of the change here has been moved to the Python 3.5 section's coverage of its final form: see above for more details, and see this change's PEP for the full proposal.
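As finally adopted in 3.5, the generalization can be sketched briefly: stars unpack inside list, set, tuple, and dictionary literals, and multiple starred arguments may appear in a single function call (the values here are illustrative only):

```python
def total(a, b, c, d):
    return a + b + c + d

seq = [*range(3), *'ab']            # stars in list literals (3.5+)
mapping = {**{'x': 1}, **{'y': 2}}  # double stars in dict literals
print(seq)                          # [0, 1, 2, 'a', 'b']
print(mapping)                      # {'x': 1, 'y': 2}
print(total(*[1, 2], *[3, 4]))      # multiple stars per call: 10
```

Stars in comprehensions, by contrast, remain a syntax error, per the readability concerns noted above.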

3. Enumerated type as a standard library class/module

Since its inception, Python has allowed definition of a set of identifiers using a simple range: for example, "red, green, blue = range(3)". Similar techniques are available with basic class-level attributes, dictionary keys, list and set members, and so on.

As of Python 3.4, a new standard library module makes this more explicit, with an Enum class in a new enum module that offers a plethora of functionality on this front. It employs class-level attributes to serve as identifiers, but adds support for iterations, printing, and much more. An example borrowed from Python Pocket Reference, 5th Ed:

>>> from enum import Enum
>>> class PyBooks(Enum):
        Learning5E = 2013
        Programming4E = 2011
        PocketRef5E = 2014

>>> print(PyBooks.PocketRef5E)
PyBooks.PocketRef5E
>>> PyBooks.PocketRef5E.name, PyBooks.PocketRef5E.value
('PocketRef5E', 2014)

>>> type(PyBooks.PocketRef5E)
<enum 'PyBooks'>
>>> isinstance(PyBooks.PocketRef5E, PyBooks)
True
>>> for book in PyBooks: print(book)
...
PyBooks.Learning5E
PyBooks.Programming4E
PyBooks.PocketRef5E

>>> bks = Enum('Books', 'LP5E PP4E PR5E')
>>> list(bks)
[<Books.LP5E: 1>, <Books.PP4E: 2>, <Books.PR5E: 3>]

As Python programmers seem to have gotten along fine without an explicit enumerated type for over two decades, the need for such an extension seems unclear. And to some, the numerous options afforded by the new class may also seem like over-engineering—a sort of identifier enumeration on steroids. Like all additions, though, time will have to be the judge on this module's applications and merits.

For more details, see Python 3.4's What's New document, or the change's PEP document. Note that this is a new standard library module, not a change to the core language itself, and should not impact working code; some standard library modules may, however, incorporate this new module to replace existing integer constants, with effects that remain to be seen.

4. Import-related standard library module changes

A large number of 3.4 changes have to do with its modules related to the import operation—site of a major overhaul in 3.3 for both import in general, and the new namespace packages described in the book. As this is a relatively obscure functional area which is unlikely to impact typical Python programmers, see 3.4's What's New document for more details, especially its section Porting to Python 3.4.

5. More Windows launcher changes and caveats

See this page for additional issues regarding the Windows launcher shipped and installed with Python 3.3 (and later), beyond those described in Learning Python, 5th Edition's new Appendix B. In short:

Given these and other issues covered both in the book and on this page, it seems the 3.3+ Windows launcher aspires to be a tool that makes it easy to switch between installed Pythons, but might have worked better as an optional extension than a mandatory change.

Update: see also the later Python 3.6 launcher defaults change described above.

6. And so on: statistics module, file descriptor inheritance, asyncio, pypy3, Tk 8.6, email

Other 3.4 items of interest:


©M.Lutz