[LP5E cover]

LP5E: Recent Reader Queries


Last revised: July 2022

This page hosts curated replies to reader emails and posts, for questions that either reflect issues raised by multiple readers, or otherwise seem of potentially broader interest. Content here is primarily related to the book Learning Python 5th Edition (LP5E), but a few broader book posts and observations have managed to sneak in as well.

Items here are posted as time allows (and a lot of great questions never made this page). Some replies have been expanded and/or polished for the broader audience; all personal details have of course been removed; and this page's organization is as informal as its content—it's grouped into more- and less-technical topics, and recent additions generally appear at the end of both the lists and this page at large.

Related Resources

For more book-related resources, be sure to also see these external pages:

Content Here

More Technical

Less Technical



More on classmethod versus staticmethod

[Jul-2014] Learning Python, 5th Edition has an in-depth section on staticmethod and classmethod in Chapter 32, which includes examples of counting instances of classes in a small framework with both tools. In response to reader queries, I recently posted an even smaller, self-contained example which may help some readers further clarify the distinction between these tools:

Read the code here
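
If that link is unavailable, the following minimal sketch captures the same flavor (it is not the posted example itself, and the class names here are made up for illustration):

class Counted:
    numInstances = 0                     # class attribute shared by the whole tree

    def __init__(self):
        Counted.numInstances += 1        # count every new instance made

    @staticmethod
    def static_count():                  # no implied first argument:
        return Counted.numInstances      # must name the class explicitly

    @classmethod
    def class_count(cls):                # receives the class it's called through:
        return cls.numInstances          # cls may be Counted or any subclass

class Sub(Counted): pass

x, y, z = Counted(), Counted(), Sub()
print(Counted.static_count())            # 3: all three increment the shared counter
print(Sub.class_count())                 # 3: cls is Sub here, which inherits the counter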


Formal Inheritance Rules: PyRef5E Excerpt

[Jul-2014] Learning Python, 5th Edition covers new-style inheritance in full by tutorial and example, especially in the Metaclasses Chapter 40 in its Advanced Topics part. As a supplement for this book's readers, see also the following concise summary of Python's formal inheritance rules, a preview excerpted from Python Pocket Reference, 5th Edition:

Read the excerpt here


More Benchmarks: str.format() Is Slower than % Formatting

[Mar-2014] A Python Pocket Reference reader suggested that the str.format() method should be avoided in programs where performance matters. Per benchmarks, str.format() currently runs 30%-40% slower than the % formatting expression, and in some cases as much as 10X slower. This is an important optimization finding in itself, but also touches on the benchmarking topics in Learning Python, 5th Edition (which has a new full chapter devoted to the subject), and serves as an opportunity for a supplemental example here.

For context, the reader's note and my reply appear below; the benchmark script and results mentioned were attached to the original exchange.

> -----Original Message-----
> From: ...
> To: lutz@rmi.net
> Subject: Python Pocket Reference: Performance of string formatting methods
> Date: Tue, 04 Mar 2014 18:48:56 -0500
> 
> [...]
>  
> In the chapter about "String formatting" you write that there is no
> compelling reason to prefer the '%' over the 'format()' method and vice versa. 
> Maybe you could briefly mention in the next edition of the book that '%' is 
> currently a lot faster (in Python 2.x and Python 3.x) than the 'format()' 
> method call. I did a quick comparison not too long ago if you are interested: 
> http://sebastianraschka.com/Articles/2014_python_performance_tweaks.html#string_assembly
> This might be worthwhile considering for extensive (and expensive) string 
> operations.


A follow-up: thought you may be interested in some quick
benchmarks I just ran for str.format() and %, on 3.3.4, 
2.7.3, and PyPy 1.9.  Script and results are attached.

Please draw your own conclusions, but it appears that  
str.format() is consistently on the order of 30% to 40% 
slower than % in my tests, when other factors are removed.  

As usual, though, this can vary by test context.  Your 
comprehension-based test code does indeed show str.format()
10X slower than both + and %, though that higher number may 
be a function of the list comprehension itself.  When
a function call is added to the % equivalent, it's almost
as slow as str.format(), though still slightly quicker.  

In other words, it may be that the reason for the 10X 
slowdown for str.format() is the cost of its inherent 
function call (always slow in Python).  Within a list 
comprehension (only?), Python seems to be able to run 
a % as a quicker in-line expression instead.  In other 
contexts, the function call cost seems to be moot.

That seems odd — so much so that I wonder if my tests
are missing something.  In any event, please take these
as preliminary, and feel free to follow-up if you find 
something askew.  Still, it is true that % is quicker in 
all Pythons, by at least a double-digit percentage: a big
factor for some code, and cause to prefer % in general.  

It's also true that 2.X almost always checks in faster than
3.X in these tests (and PyPy is stunningly faster than both),
but that's fairly normal.

--Mark Lutz (http://learning-python.com)
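

For readers who wish to run a quick comparison of their own, the standard library's timeit module suffices. The following is a minimal sketch (not the attached script, and with 3.X print calls shown); absolute results will vary by Python version and machine:

import timeit

pct = "'%s and %s and %s' % ('spam', 'ham', 'eggs')"        # expression form
fmt = "'{} and {} and {}'.format('spam', 'ham', 'eggs')"    # method form

print('%      :', min(timeit.repeat(pct, number=100000, repeat=5)))
print('format :', min(timeit.repeat(fmt, number=100000, repeat=5)))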


New-Style Inheritance "Breadth-First" Search Order in Diamonds

[Oct-2013] A reader wrote to ask about the superclass search order of new-style classes in diamond-pattern inheritance trees—which seems as good an excuse as any to clarify this subtle matter here. Also note that Python Pocket Reference, 5th Edition now includes a new sketch of this post's topic; see the excerpted section on this query's topic described above.

> -----Original Message-----
> From: ...
> To: lutz@learning-python.com
> Subject: A question in 'Learning Python 4e'
> Date: Sat, 19 Oct 2013 10:57:48 +0800
> 
> I have question about 'Diamond pattern' in chapter 31 at page 
> 783. (Learning Python 4e)
> 
> I write a test code like this:
> class A:
>      attr = 1
> 
> class E:
>      attr = 5
> 
> class F(A):
>      pass
> 
> class B(F):
>      pass
> 
> class C(A):
>      attr = 2
> 
> class D(B, C):
>      pass
> 
> x = D()
> print(x.attr)
> 
> If I run it in python3, it always shows 5.
> But if the search breadth-first, it should show 2.
> 
> What's the real search order of this code?
> 
> -- 
> Best Wishes!
> 


Hmm; this can't be the code you're testing, as it cannot
produce the result "5" under either Python 2.X or 3.X.  
Your class "E" with attr=5 is not referenced anywhere,
and is not part of the class tree searched by inheritance
from class D, as the following ASCII sketch tries to show:

   A:1
   |
   F     A:1
   |     |
   B     C:2
    \   /
      D
      |
      X

In fact, your code runs fine with "E" commented out entirely.
This was probably just a typo in your email, but be sure you
are testing the code in the book exactly.  As shown in the book
(and even as given in your email with the unused "E" class),
this example:

--In Python 2.X, prints "1", because of 2.X's strict depth-first
  then left-to-right DFLR search order (it reaches "A" through 
  the leftmost branch first).

--In Python 3.X, prints "2", because 3.X's MRO search is more
  breadth-first in diamonds only (it reaches "C" before climbing 
  to any higher "A").

That is:

   C:\temp> py -2 di.py
   1

   C:\temp> py -3 di.py 
   2

Really, the first behavior holds true for classic classes in 2.X,
and the second for both 3.X and new-style classes in 2.X; in terms
of default class models, though, it's a 2.X versus 3.X behavior; that
difference was the main point of this example.

There are expanded, more detailed, and formal descriptions of the 
3.X (new-style) MRO search order in this book's new 5th Edition.
Technically, the 3.X order scans trees depth-first and then 
left-to-right collecting all classes along the way, but retains 
only the final (right-most) occurrence of each class in its linear
MRO ordering.  This both orders the search and removes duplicates.

The net effect of this is roughly breadth-first in diamonds (only),
because common superclasses are visited later than in 2.X, and 
just once per inheritance search.  In your example code, the 
DFLR ordering [X, D, B, F, A, C, A] becomes the MRO ordering
[X, D, B, F, C, A], which accounts for C's precedence in 3.X.
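
For reference, 3.X makes this linearization available directly as
the class __mro__ attribute.  A small script (its di2.py name here
is just illustrative) confirms the diamond's ordering and result:

   # di2.py: verify the MRO and attribute choice under Python 3.X
   class A:        attr = 1
   class F(A):     pass
   class B(F):     pass
   class C(A):     attr = 2
   class D(B, C):  pass

   print([cls.__name__ for cls in D.__mro__])  # ['D', 'B', 'F', 'C', 'A', 'object']
   print(D().attr)                             # 2: C is reached before the common A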

That said, I suspect what you're seeing is the effect of changing
B's superclass to E:

   class E:
        attr = 5

   class B(E):
        pass

generating the following different inheritance tree:

   E:5   A:1
   |     |
   B     C:2
    \   /
      D
      |
      X

In this case, both 2.X and 3.X print "5" for the attribute,
taken from class E in the left-most branch.  It's true that 
this is not breadth-first, but this is also not a diamond.  
As stated in the book, the breadth-first effect applies only 
in _diamond_ cases in new-style classes, because all but the
last appearances of common superclasses are filtered out; 
non-diamonds still are effectively DFLR, because each class 
appears just once in the tree.  

More formally, the DFLR and MRO orderings in non-diamonds 
are the same, because there are no duplicates to remove; 
here, both are [X, D, B, E, C, A], which is why the higher
"E" wins in both Python lines.

The 4th Ed of this book was deliberately informal and terse
on this topic due to its complexity, but it gets much fuller 
coverage in the 5th Ed (including 10 pages on the MRO and 23
pages on its super() use case).  Also note that the MRO is just
one nested component of the complete new-style inheritance model,
a procedure which also encompasses descriptors and metaclasses,
as summarized here:

  http://programming.oreilly.com/2013/07/pythons-new-style-inheritance-algorithm.html

The 5th edition covers this and other topics more completely
too, which accounts for most of its higher page-count.

--Mark Lutz (http://learning-python.com)


Generator State Suspension versus Pickling

[Oct-2013] A reader wrote to ask about the relationship of generators to pickling. Per later interaction, it appears that this issue stems from a web-based Python interpreter emulator's auto-pickling of objects used at the interactive prompt. That is, point #2 below was the culprit, in the context of saving session state when interacting with a web server. If generator or other examples fail for you on pickling errors, try a different—and non-web-based—interpreter interface.

> -----Original Message-----
> From: ...
> To: "lutz@learning-python.com" <lutz@learning-python.com>
> Subject: Question from book
> Date: Tue, 8 Oct 2013 12:09:41 -0400
> 
> Hello Mark!
> 
> First off, I bought your learning python book and want you to know that I 
> really appreciate it and am enjoying it greatly!!
> 
> I am about halfway through it though when I've gotten up to the section on 
> function generators, and when I try to follow your code (page 593-594) I get 
> a 'cannot pickle generators' error.. Am I doing something wrong? The reading 
> I have done so far on the Internet confirms that one cannot pickle a 
> generator, but u seem to do it in the book and I'd like to understand what 
> I'm missing. 
> 
> Thanks!
> ...
> 
> Sent from my iPhone
>


Thanks for your note.  As for your pickle issue: I'm not
sure I can tell how this arose from your note alone, but
generators are a notoriously confusing topic.

The generators coverage and examples around page 593-594 
don't mention or use pickling in any way.  Pickling is 
presented in the Chapter 9 files section, but then not 
mentioned again until the OOP part, where it is deployed
anew.  So, my best theories are that either:

1) Perhaps you're assuming that the state suspend/resume 
behavior of generators implies pickling to a file?  It 
doesn't — generators are purely in-memory tools.  They 
suspend their state in memory when they yield a result, 
and resume that in-memory state on later next() calls to 
pick up where they left off.  Really, their "state suspension"
is simply remembering variable values and code location.  
External files don't enter into it, and there is no need
to use the pickle module in any fashion to make generators 
work.  They're just resumable functions in memory.

2) Perhaps you're working in an IDE that attempts to pickle
objects in your interactive session automatically?  This 
sounds error-prone to me, and doesn't happen in IDLE or at 
a command-line shell prompt interface; but it's conceivable
that some interfaces may try to save your interactive state
as you work, to allow you to pick up where you left off in 
a prior session (or web page).  This seems a bit far-fetched,
though; beyond generators, nothing with system state would 
work at the interactive prompt in such a tool, including files.

For reference, below my signature line is the code I think 
you're referring to, working in Python 3.3 (in IDLE; a shell
prompt is the same, though it may act a bit differently).  
If I've misread your question entirely — always a possibility
with email — please feel free to clarify in a follow-up.

Thanks,
--Mark Lutz (http://learning-python.com)


----
# The page 593-594 code in question?

   >>> def gensquares(N):
           for i in range(N):
               yield i ** 2        # Resume here later
            
   >>> for i in gensquares(5):     # Resume the function
           print(i, end=' : ')     # Print last yielded value
        
   0 : 1 : 4 : 9 : 16 : 
   >>> x = gensquares(4)
   >>> x
   <generator object gensquares at 0x00000000032EEEE8>
   >>> next(x)
   0
   >>> next(x)
   1
   >>> next(x)
   4
   >>> next(x)
   9
   >>> next(x) 
   Traceback (most recent call last):
     File "<pyshell#10>", line 1, in <module>
       next(x)
   StopIteration
   >>>
   >>> y = gensquares(5)
   >>> iter(y) is y
   True
   >>> next(y)
   0
   >>> list(y)
   [1, 4, 9, 16]

# The following fails, but it's not shown or mentioned in the book:

   >>> import pickle
   >>> pickle.dumps(y)
   Traceback (most recent call last):
     File "<pyshell#17>", line 1, in <module>
       pickle.dumps(y)
   _pickle.PicklingError: Can't pickle <class 'generator'>: attribute lookup 
   builtins.generator failed


Code Listings without Outputs?

[Oct-2013] A reader posted on O'Reilly's "Get Satisfaction" support forum with confusion stemming from the formatting of code. To head off further confusion, I also posted this as a clarification to the book's errata list, requesting the following text insert just before this code: "(the following snippets both print Bob's 2-item job list if run live and provided with another record structure)".

> From: O'Reilly Media [noreply.oreilly@getsatisfaction.com]
> Sent: 10/18/2013 10:31 AM
> To: getsatisfaction@oreilly.com
> Subject: New question: don't understand code at bottom of pg 261 of learning python
> 
> ... just asked this question in O'Reilly Media: 
> 
> "don't understand code at bottom of pg 261 of learning python"
> 
> At the bottom of pg 261. why does >>>db[0]['jobs'] gives no response rather 
> than ['developer,'manager]? Same question for top of page 262 for 
> >>>db['bob']['jobs']. 
>


This code is not being run at the interactive prompt which 
prints results; it's just being listed as an example here,
and an abstract one at that.  Notice the lack of a ">>>" 
interactive prompt or bold font in the book, the same as in 
the other output-less code on page 262.  Also notice the 
italicized "other"; it's supposed to stand for another 
record structure.

This code will print what you expect if run live, as in the 
following (I'm using a string for "other"):

   >>> rec = {'name': 'Bob',
   ...        'jobs': ['developer', 'manager'],
   ...        'web':  'www.bobs.org/Bob',
   ...        'home': {'state': 'Overworked', 'zip': 12345}}
   >>>   
   >>> db = []
   >>> db.append(rec)
   >>> db.append('other')
   >>> db[0]['jobs']
   ['developer', 'manager']
   >>>
   >>> db = {}
   >>> db['bob'] = rec
   >>> db['sue'] = 'other'
   >>> db['bob']['jobs']
   ['developer', 'manager']

When in doubt, try running the examples yourself (the book's 
examples package and ebook cut-and-paste can help), and don't
expect to see outputs in the book for code not typed at the 
">>>" prompt.  This is especially true of later, larger code.

--Mark Lutz (http://learning-python.com)


Running Scripts 1: Commands and Prompts

[Sep-2013] Some basic pointers for beginners getting used to where to type commands, courtesy of a fellow learner's query. Today I'd also mention the PyEdit program available on this site (and noted ahead) as another program edit+run option.

> -----Original Message-----
> From: ...
> To: lutz@rmi.net
> Subject: Learning Python, 4th Edition
> Date: Sat, 7 Sep 2013 16:48:54 -0700
> 
> Mr. Lutz,
> 
> I am currently trying to go through the fourth edition of Learning Python.
> I'm on a Mac and I'm using Python 3.1.2.
> 
> I've tried numerous times to get the very first script on page 44 to run,
> yet I get the following error:
> 
> Traceback (most recent call last):
>     File "", line 1, in 
>        print (script1.py)
> NameError: name 'script1' is not defined
> 
> Now, I've tried just entering script1.py, which has given me the same error.
> 
> As I said, I'm running 3.1.2 using the IDLE launcher. I entered the
> statements just like the example on page 44 into the text editor that's in
> IDLE. I then saved it as script1.py to my desktop (so it's in
> /Users/myname/Desktop/script1.py).
> 
> Any suggestions as to what I might be doing wrong? Maybe I'm not defining
> something correctly or not saving the file to the right location. This is
> my first time programming, so I'm somewhat lost.
> 
> Thank you for any help you can provide. Have a good weekend.
> 


It looks like you may be confusing prompts, a very common
source of frustration for newcomers.

Be careful in this chapter not to confuse the system prompt 
(where you launch Python and script files) with the Python 
prompt (where you run Python program code only).  This is 
stated and alluded to at various points in the chapter.

This means you can't type just "script1.py" at the ">>>"
Python prompt to run a script file.  That's a system command,
to be typed at the system prompt (a terminal window on Macs).
To run a script file from the ">>>" prompt, the chapter later
shows that an "import" statement or "exec" call will suffice, 
but both are Python code typed at ">>>", not system commands. 
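
For instance, the two contexts look roughly like this (a sketch
only: "$" stands for your Mac's Terminal prompt, and the Python
command name may differ on your install):

   $ python3 script1.py                   # system prompt: run the file as a program

   $ python3                              # system prompt: start an interactive session
   >>> import script1                     # Python prompt: runs the file's code (once per process)
   >>> exec(open('script1.py').read())    # Python prompt: reruns the file's code each time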

This also means that you can't run a script by coding its
name in a Python print() statement; your print is invalid 
syntax (you'd need to quote strings, but that still wouldn't
run an external file of that name), but it also confuses 
system and Python commands. 

In the end, though, you're probably best advised to create
and launch your files with the IDLE GUI at least when 
starting out, as it avoids system prompts altogether.
IDLE details are given in their own section later in the
chapter too.  This chapter is a catalog of launching 
techniques for a broad set of readers and backgrounds,
and you should feel free to pick and choose as you like.

--Mark Lutz (http://learning-python.com)


Books and Other Training Options

[Sep-2013] Someone wrote asking about book and training choices. For a related post, see also the newer notes on book selection on the purchase pointers page.

> -----Original Message-----
> From: ...
> To: lutz@learning-python.com
> Subject: Python Training
> Date: Sat, 7 Sep 2013 12:48:12 +0530
> 
> Dear Mark,
> 
> I am resident of India, and residing @present in Jaipur. I have my computer
> background & worked in the IT industry late back in 1992-93, with
> programming expertise in MySQL. Left the industry in 1995.
> 
> Once again I wish to be part of IT industry & would like to start up with
> Python (advance level training). Should I refer your book or distance
> learning training program if any.
> 
> Kindly advice me on training aspect & if I have to refer your books then
> please let me know the complete name of set of books which will help me in
> programming from basic to advance level of Python.
> 
> Looking forward to your kind reply.
>


My 3 books are designed to work together as a set, and 
function like a self-paced comprehensive Python course:

--"Learning Python, 5th Edition" is a tutorial that covers
foundational concepts of the Python language itself.

--"Programming Python, 4th Edition" is a tutorial that moves
on to explore how to apply Python in common applications.

--"Python Pocket Reference, 4th Edition" is a concise 
reference-only supplement, that complements the other two.

Together, these 3 were written to provide a self-paced 
experience similar to that had by students in the Python
classes I teach, but with substantially more depth.  They 
have been used by many to get started in Python.  They've 
also evolved over the last 2 decades, and so reflect 
changes that have occurred in the software field in general.

That being said, there are many ways to pick up Python,
and many types of readers with differing backgrounds and
goals.  With hundreds of Python books to choose from today, 
I encourage people to weigh the options for themselves.

Some online tutorials and training options are also no 
doubt useful, though I'd recommend caution on that front.
Some offerings may fall short, especially in the for-pay 
category; as is common, Python has grown large enough to 
attract lower quality products more interested in profit
than teaching.

However you proceed, best wishes with your Python
explorations.  You'll probably find it to be a very
productive choice, especially compared to the tools
prominent in the early 1990s (some of which were my 
own inspiration for exploring Python).

--Mark Lutz (http://learning-python.com)


About Future Editions of My Books

Update: see also the newer FAQ page about this topic.

[2013-2014] I've received multiple queries about possible upcoming editions of my books after the new 5th Edition of Learning Python appeared in mid-2013. In short: Learning Python itself is new and doesn't require an update, of course, but here is the best detailed information I have on the other two books as of this reply's latest revision.

Python Pocket Reference

I've begun working on a 5th Edition of this book, but it will be months before it is released. When it appears (most likely early in 2014), it will be only a very minor refresh for Pythons 3.4 and 2.7. I do not recommend waiting for this still tentative and minor update if you're looking for a reference resource now. The existing 4th Edition is still largely current today, as long as you think "3.X" when you see "3.0" or "3.1" per the book's introduction, and browse Python's What's New? documents for recent changes in 2.7, 3.2, 3.3, and 3.4.

Update, Nov-2013 O'Reilly already has a web catalog page for this book, though some of its details still need to be tweaked (e.g., it's an estimated 230 pages, March seems a worst case, and it's been updated for both 3.4 and 2.7). I've also now posted a preliminary page for this book on this site, with a draft of its introduction for content details.

Update, Jan-2014 This update is now available; see the aforementioned page for details. In the end, it wound up with 50 pages of new material, including fresh coverage of the MRO and inheritance, super(), namespace packages, enumerations, JSON usage, and more. See the Introduction preview on its page.

Programming Python

There are today no plans for a new edition of this book. Its latest edition is less than 3 years old, is already current with 3.X, and is fully relevant as is. Some libraries it uses have changed in minor ways that may imply changes in some example code—for instance, see the latest examples package release for Python 3.3, 3.4, and 3.5—but that's in itself a fair lesson about development; change is a constant in the software world. As warranted, example updates for future Pythons will be posted both here and at oreilly.com.

More generally, while it may be premature to label any of my current publications as final editions, I hope they will continue to serve Python 2.X and 3.X users for many years to come. For future Python changes, watch the books' update pages on this site, as well as Python's own What's New? documents. For more on Python's status, see both the introduction and conclusion to Learning Python, 5th Edition (Chapters 1 and 41).

[Back to Index]


Suggestions for Mac and Windows Users

[2014] A reader wrote with the following suggestion on using Python 3.X on Macs:

> Dear Sir,
> 
> Thank you for your book "Learning Python".
> 
> May I suggest that on your FAQ web page you point Mac users to this page: 
> [edit: this page is now defunct; try this or this]
> 
> I don't think I would have been able to make Python 3 the default 
> version on my Mac without the help provided on this page, and there's 
> nothing worse, when you are trying to learn alone, than to be stuck 
> right at the beginning.

You can also find Mac-specific resources in the notes on this site's Programming Python page, though most of it is GUI related; search for "Mac OS" or "Mac" there, and try the same with the Search button on this page's toolbar below (Mac OS shows up on a variety of pages at this site).

And speaking of getting stuck, some Windows users may benefit from the extra Windows launcher notes on this page; in short, file associations may fail to be set correctly in some installs.

[Back to Index]


About "The Shallows" Mentioned in the Preface

[2014] Some clarifications on recommended reading.

> -----Original Message-----
> From: ...
> To: lutz@rmi.net
> Subject: The shallows
> Date: Tue, 14 Jan 2014 01:07:03 -0800 (PST)
> 
> hi Mark,
> 
> You had paid tribute to a book - The shallows in LP5e.  I'm curious about it.
> 
> Which exact book did you refer to?  There are quite a number of them with 
> similar title.
> 
> Thank you.


In full detail, the book I mentioned is:
    The Shallows: What the Internet Is Doing to Our Brains 
    by Nicholas Carr

There are Amazon and Wikipedia pages for it here:
    http://www.amazon.com/The-Shallows-Internet-Doing-Brains/dp/0393339750
    http://en.wikipedia.org/wiki/The_Shallows

Per the latter of these, it had a different subtitle when 
published in the UK.

It's a bit controversial, but was a Pulitzer Prize finalist,
and raises questions that deserve to be asked about some of
the Web's impacts, while there may still be time to address 
them.  Its look at cognitive research is illuminating;
contrary to current dogma, the focus-breaking nature of the 
web probably isn't a win for anyone but advertising companies.  
 

Cheers,
--Mark Lutz (http://learning-python.com)

Update, Dec-2014 Carr has a new title released later in 2014: The Glass Cage: Automation and Us, on the deskilling effects of automation in knowledge-based domains, and the dangers of entrusting our society and lives to opaque technologies laden with hidden agendas. It's highly recommended reading for anyone interested in the ethical aspects of the software field (and that should be just about all of us).

Update, Apr-2015 Another reader asked about the implications of The Shallows—see the later related reply on this page.


Matching a User-IDs File to an Email-Addresses File

[Mar-2014] A reader wrote asking for help with a typical file-processing problem: collecting the entries in an email address file that are not also present in a file containing email names (not full email addresses, as I understand the goal) of people who have left the company. This touches on both the basic string processing and file tools in the foundational Learning Python, as well as pattern matching and email parsing topics covered in the application-focused Programming Python:

> [...reader email omitted...]

Have you resolved this yet?  If not, here are some ideas.
The 'in' operator scans for a substring:

   >>> 'bob' in '123bob456'
   True
   >>> ('bob' in 'bob'), ('bob' in 'bob@spam.com'), ('bob@spam.com' in 'bob@spam.com')
   (True, True, True)
   >>> ('bob' in 'other')
   False

It's not very accurate, though, as it will find the substring in any context:

   >>> ('bob' in 'bobsled@spam.com'), ('bob' in 'sue@bob.com')
   (True, True)

The str.find() method does the same, but is also inaccurate:

   >>> '123bob456'.find('bob'), 'other'.find('bob')
   (3, -1)

You could resort to a pattern match with the re module:

   >>> import re
   >>> re.match('.*?(%s).*?@.*' % 'bob', '123bob123@spam.com')
   <_sre.SRE_Match object at 0x0000000002B1B8A0>
   >>> re.match('.*?(%s).*?@.*' % 'bob', '123bob123@spam.com').groups()
   ('bob',)

But it's probably overkill: splitting on "@" and taking the 
result's [0] item will get just the name before the domain:

   >>> 'bob@spam.com'.split('@')
   ['bob', 'spam.com']
   >>> 'bob@spam.com'.split('@')[0]
   'bob'

In the worst case, full-blown email name/address pairs can be
parsed with the email package:

   >>> from email.utils import getaddresses
   >>> pairs = getaddresses(['bob@spam.com', '"Bob" <bob@spam.com>'])
   >>> pairs
   [('', 'bob@spam.com'), ('Bob', 'bob@spam.com')]

Assuming the split('@') suffices, though, the code using it
would look something like this:

   leavers = [line.rstrip() for line in open('leavers.list')]
   for line in open('user.db'):
       user = line.rstrip()
       if user.split('@')[0] not in leavers:
           print(user) # write addr to a file here

And again, set comprehensions might suffice (though this version
prints just the users' names, not their full addresses, and sets
both remove duplicates and reorder the original file's lines):

   leavers = {line.rstrip() for line in open('leavers.list')}
   users   = {line.rstrip().split('@')[0] for line in open('user.db')}
   for notleft in users - leavers:
       print(notleft) # write name to a file here

Beyond this, I recommend working through the book and 
experimenting interactively to see how various tools work.


> -----Original Message-----
> > [...earlier reader email omitted...]
>
> It sounds like you're trying to compute a file difference,
> right? — select all lines in a file that are not in another
> file?  If so, I might code this as follows:
> 
>    leavers = [line.rstrip() for line in open('user.list')]
>    for line in open('user.db'):
>        user = line.rstrip()
>        if user not in leavers:
>            print(user) # eventually I'll write to a file here
> 
> Or, since this seems a set difference, I'd do it with sets
> if the files are small enough to fit in memory, and the db
> contains no duplicate you care to retain in the result (use
> set(L) in 2.6 and earlier, as it has no set comprehensions):
> 
>    leavers = {line.rstrip() for line in open('user.list')}
>    users   = {line.rstrip() for line in open('user.db')}
>    notleft = users - leavers
>    for user in notleft:
>        print(user)
> 
> Perhaps the any() is the confusing bit: it returns true if
> any of its iterations are true — not if its argument is
> simply empty (which you can test without any(), as empty
> means false in Python).
>
> --Mark Lutz (http://learning-python.com)


More on Benchmarking: Understanding timer.bestof() Results

[Jun-2014] The following is a reader thread dealing with some of the subtleties of timing results reported in the book's new benchmarking chapter. This may not make very much sense without the book's context, but underscores common timing factors, and points out a potential Mac-specific issue.

> -----Original Message-----
> From: ...
> To: Mark Lutz 
> Subject: Fwd: Not really errata, but very curious
> Date: Sat, 7 Jun 2014 20:57:25 -0700
> 
> Mark,
> 
> When I replaced time.time with time.perf_counter, the 
> "timer.bestof(1000, str.upper, 'spam')" 0.0 anomalies disappear
> 
>  
> Begin forwarded message:
> 
> > From: ...
> > Subject: Not really errata, but very curious
> > Date: June 7, 2014 at 1:45:03 PM PDT
> > To: Mark Lutz 
> > 
> > Mark,
> > 
> > In chapter 21, following along, I ran the examples you gave in the
> > book, specifically:
> > 
> > timer.bestof(1000, str.upper, 'spam')   
> > (see text below from your examples)
> > 
> > Each time I ran it, I would get (0.0, 'SPAM').
> > This was true even if I reduced the reps to 10.
> > 
> > Could the cause of it be the value of the preset epsilon?
> > I am running python3.4 on a Mac OSX 10.9
> > >>> sys.float_info
> > ...
> > Using print statements, I see multiple occurrences of elapsed 
> > being set to 0.0 on repeated runs with even 10 reps.
> > 
> > I do not expect to see so many computed elapsed times of 0.0
> > 
> > On the same page (see text below),
> > why is the return best value, tuple[0] always greater than the last 
> > computed elapsed time, tuple[1][0]?  That seems counter intuitive to me.
> > 
> > ---------------------------------
> > My PDF page 632 (in part):
> > 
> > [...book text omitted...] 
> >  
> 


Have you worked this one out on your own yet?  If not,
here's a quick review.

1) About the time module calls: no idea why time.time() is
failing on Macs; this is actually fairly disturbing, because 
the newer time.*() calls such as perf_counter() (per page 633's 
sidebar) are not available until Python 3.3 on any platform, 
so all earlier Pythons will have this issue on Macs.  To be 
explored, but I'd also try time.clock() on the Mac, as it's 
available in 2.X too.

2) About the bestoftotal() results: I agree it seems
counterintuitive, but there are two factors contributing to this:

a) There is some small amount of overhead time that elapses 
between the timer() start and stop calls in bestof(), which 
is added to the time reported by the nested total() call's 
time result.  The timer() calls in total() will thus always 
show a shorter elapsed time — which is why it may be better
to use a min(total(....)) approach in general, as explained 
near the top of page 633.

b) Look carefully at what bestoftotal() is actually 
returning: it's the same as the bestof() call, and contains
the best (i.e., min) time among all timed total() runs in 
tuple[0], along with the _final_ (not best) return value from
total() in tuple[1] — a total time and function result as a
nested tuple.  Hence, the bestof() result's tuple[1][0] time
from total() doesn't necessarily correspond to the best time 
reported in bestof()'s tuple[0] — which is itself skewed a 
bit up by the overhead inside bestof()'s code per the prior 
point.  Complex, but true; these two will likely never agree.

Thanks,
--Mark Lutz (http://learning-python.com)
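
As a supplement here: the min(total(...)) approach mentioned in point (a) can be sketched without the book's timer module, using the 3.3+ time.perf_counter() call the reader switched to. This is a simplified stand-in whose function names only echo the book's, not its actual code:

   import time

   def total(reps, func, *args):                    # total time for reps calls
       start = time.perf_counter()
       for _ in range(reps):
           func(*args)
       return time.perf_counter() - start

   def bestoftotal(trials, reps, func, *args):      # best (minimum) of several totals
       return min(total(reps, func, *args) for _ in range(trials))

   print(bestoftotal(50, 1000, str.upper, 'spam'))  # a small but nonzero float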


Programming Concepts for Younger Readers

[Mar-2014] Another reader wrote to ask for advice on using Learning Python and other books for teaching programming concepts to his 12-year-old child (not surprisingly, on-line book forums were not as useful as they might have been). There is no generic answer that applies to every learner, but here are a few pointers:

> [...reader email omitted...]


Hello,

I understand your position.  It's difficult to get good advice
on the Internet these days.  Asking an author about his own 
book probably isn't much better most of the time.  But as 
someone who was once the parent of curious young children 
too, I'll try to be fair.

Learning Python was designed to teach both programming and
the Python language at the same time.  It deals with common
software design issues like global variables, complexity, 
and even some ethics, to try to help readers become more
rounded practitioners.

That said, this is not a project-based book (it's a didactic
tutorial), so you might also be served by working on a tangible
project in parallel with the book, or surveying some of the 
Python books available for younger learners.  I don't know much
about these, but many are project based, some use gaming as
their vehicle, and some kids respond better to results-oriented
approaches.  If you go that route, I'd recommend combining it
with a more in-depth language resource like Learning Python; 
it's difficult for books that teach gaming to also teach 
Python or programming at large.

If you'd like to browse a sampler of Learning Python, 
O'Reilly has the first chapter posted on-line at the following 
site; this is a nontechnical overview, though, so it doesn't 
do much to show the meat of the book:

http://cdn.oreillystatic.com/oreilly/booksamplers/9781449355739_sampler.pdf

I'm also the author of Programming Python, which is entirely
application-focused, and does have a project focus (it constructs
GUIs, websites, and the like); but it's also too advanced for a
beginner to start out with, and assumes Learning Python's 
material as its prerequisite.  Then again, every learner is
different, so please judge for yourself.

--Mark Lutz (http://learning-python.com)


Software Engineering Isn't Trivial

[Mar-2014] A reader wrote having difficulty learning enough programming on a tight schedule to use a Python scientific library, and asking which parts of Learning Python would suffice in the 2-week window left on an internship. The reader also mentioned that the entire team was getting bogged down in programming tasks for the library, instead of focusing on their core science work. This is a common concern; my reply:

> [...reader email omitted...]


Hello,

I appreciate your dilemma.  Many systems expose a Python 
scripting layer these days, but not all of them properly 
insulate their users from the complexities underlying the 
API.  I don't know if this is the case with your system,
but full-scale software engineering is as complex and
substantial as other engineering domains.  Browse your
university's computer science degree requirements to 
see what I mean.  There's a reason that many offer BS, 
MS, and PhD programs in the field.

Asking non-practitioners to write basic code can work if 
the system's internals are encapsulated well.  Moreover, 
not all people need full computer science knowledge — 
much as people balancing their checkbooks do not need to 
know calculus or statistics.  Unfortunately, people have 
been sold the idea that programming is somehow trivial for 
everyone (and many seem to have accepted the myth in full).
That makes it all the worse when systems expose too much 
of their implementation.  Again, this may or may not be 
your case, but it's a general concern.

In terms of the book, I suggest that the first few parts up
to the classes/OOP part should suffice for simple, procedural
scripting.  This can probably be covered in weeks with focus.
But that assumes the library you're using does not expose 
classes and OOP, or advanced functional or metaprogramming 
techniques; if it does, you've got more material to cover.  
OOP alone is a software engineer's tool, which in my 
experience is often too much for most people outside the 
field (much like calculus is to those outside science). 

As I mention in the book's Preface, learning a modern 
software tool like Python is not a trivial or quick task.
The new-style inheritance algorithm alone is easily enough
to occupy a week for someone already having a CS degree. 
Add in generators and Unicode, and it's a sizeable effort.

However you proceed, best wishes with your goals.  In a 
perfect world, the advice I'd like to pass on is that if
you and/or your team are getting bogged down in writing 
code, you should either expect to put in the effort required
to learn software engineering well, or hire a professional 
in the field who has.  But I also realize that won't fly for 
people in your position, with tight schedules and a need
to write just enough code to customize a packaged system.

Thanks,
--Mark Lutz (http://learning-python.com)


Tracing Recursive Function Calls

[Aug-2014] A reader wrote O'Reilly's book support forum with questions about recursive call tracing functions included in the book's examples package. The dialog, with a few minor elaborations in the reply:

> -----Original Message-----
> From: BookTech 
> To: "lutz@rmi.net" 
> Subject: FW: Question regarding Learning Python 5E Mark Lutz
> Date: Tue, 5 Aug 2014 23:07:29 +0000 (GMT)
> 
> Hi Mark,
> Can you clarify this for the reader?
> 
> --------------- Original Message ---------------
> To: bookquestions@oreilly.com
> Subject: Question regarding Learning Python 5E Mark Lutz
> 
> Hi,
> 
> On page 559 the author refers to a script called sumtree2.py.
> 
> The scripts has a lambda function called trace which the book quotes "*It
> adds items list tracing so you can watch it grow on both schemes, and can
> show numbers as they are visited so you see the search order.*"
> 
> Unfortunately my script compiles so fast that all of the date shows up
> within a fraction of a second and unfortunately you can't see this
> phenomenon.
> 
> I tried to modify the code to allow the trace function to be a delay
> mechanism:
> 
> for i in [trace(items)]*int(1E6):
>     print('')
> 
> But that didn't work either.
> 
> Am I misunderstanding the text or am I missing something completely?
> 
> Thanks!
>


Hello,

This is about code not shown in the book, but included in the
examples package for readers to experiment with on their own;
it uses book examples, but adds tracing code and another variant.

In short, the "watch it grow" remark wasn't meant to imply that it 
would trace its progress slowly enough to observe in real time; it 
simply means that the included code can print status that can be 
inspected after the fact, to better understand the recursive calls.

Really, the included code simply has hooks for extra tracing, which
isolate display logic so that readers can flesh them out as desired. 
As provided, its two display functions print nodes as visited, but 
not the stack/queue lists:
 
   # as coded: show visits only
   trace = lambda x: None                 # or print
   visit = lambda x: print(x, end=', ')

Here are some pointers for alternative ways to code these:


1) If you want to also see the lists as they grow and shrink, make
trace a synonym for the 3.X print function (and add the required
__future__ import in 2.X at file top if you're using that version):

   # show lists too
   trace = print
   visit = lambda x: print(x, end=', ')


2) If you really want to slow the progress down, simply add a 
call to the time.sleep() function to pause between prints; see
library manuals for more on this call:

   # show lists only, and slowly
   import time                                     # insert a pause
   trace = lambda x: (print(x), time.sleep(0.5))   # secs, need parens
   visit = lambda x: None

[Sidebar: the parens are required in this lambda, because the
tuple "," here outranks the lambda expression.  Or, equivalently,
"lambda" binds tighter than ",", which is a tupling operator of
of lowest precedence, but only in contexts like these that don't 
treat "," specially.  Without parens, this lambda expression would
end after print(x) — yet another reason to code tuples in parens!]

Here is this version's output for the first breadth-first search:

   [1, [2, [3, 4], 5], 6, [7, 8]]
   [[2, [3, 4], 5], 6, [7, 8]]
   [6, [7, 8], 2, [3, 4], 5]
   [[7, 8], 2, [3, 4], 5] 
   [2, [3, 4], 5, 7, 8]
   [[3, 4], 5, 7, 8]
   [5, 7, 8, 3, 4]
   [7, 8, 3, 4]
   [8, 3, 4]
   [3, 4]
   [4]
   36


3) For full and gradual status, try something like this:

   # show lists slowly, visits slower
   import time
   trace = lambda x: (print(x), time.sleep(0.25))
   visit = lambda x: (print('=>', x, sep=''), time.sleep(0.5))

The last of these shows both lists and nodes visited, with a
quarter-second delay between list displays; here's what this
shows gradually for the first breadth-first traversal:

   [1, [2, [3, 4], 5], 6, [7, 8]]
   =>1
   [[2, [3, 4], 5], 6, [7, 8]]
   [6, [7, 8], 2, [3, 4], 5]
   =>6
   [[7, 8], 2, [3, 4], 5]
   [2, [3, 4], 5, 7, 8]
   =>2
   [[3, 4], 5, 7, 8]
   [5, 7, 8, 3, 4]
   =>5
   [7, 8, 3, 4]
   =>7
   [8, 3, 4]
   =>8
   [3, 4]
   =>3
   [4]
   =>4
   36


4) And finally, you can pause for a user Enter key press at
each step, if you really want an interactive trace (though
you'll probably tire quickly of keypresses for large trees):

   # show lists slowly, pause after visits
   import time
   trace = lambda x: (print(x), time.sleep(0.5))
   visit = lambda x: input('=>%s' %  x)           # 2.X: raw_input()


--Mark Lutz (http://learning-python.com)


More on Class Factories

[Nov-2014] A reader asked for clarification on an abstract code snippet in Learning Python, 5th Edition:

> -----Original Message-----
> From: ...
> To: "lutz@rmi.net" 
> Subject: Question - Learning Python
> Date: Thu, 6 Nov 2014 20:11:01 +0000
> 
> Hi Mark,
>                 Thank you very much for make available a comprehensive, 
> in-depth Python book for readers like me. The book is clearly elaborated and 
> carefully arranged. I enjoyed reading the book and practicing with the 
> examples with little difficulty. If I do have any questions so far, here is 
> one: in page 956, classname parsed from the configuration file was not used 
> as the first argument of the factory function. Instead, aclass returned from 
> getattr() call was used. Could you explain what is the difference between 
> those two names?
> 
> 
> Regards,
> ...
>


Thanks for your note and feedback.  About your question:
the example is correct as shown, but very sketchy and
understandably confusing.  The idea it means to capture 
is that:

  1) The string name of a class, "classname", is parsed
  from a text file.

  2) This string name is then used by getattr() to fetch 
  a class object, "aclass".

  3) This class object is then called to generate an 
  instance, "reader", which is passed on to other code.

So, "classname" is used for getattr() to fetch the "aclass" 
object, which is then passed on to factory().  factory() 
doesn't need the string name "classname" anymore, because 
it uses the already-fetched "aclass" object directly.

The suggested application of this is that the "classname"
string might be entered in a GUI, and used to fetch and
then call the class object to make an instance dynamically.
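
A minimal standalone sketch of that flow follows (this is not the
book's listing; the classes and the 'FileReader' string here are
invented for illustration only):

   import sys

   def factory(aclass, *args, **kwargs):       # call any class object passed in
       return aclass(*args, **kwargs)          # to generate and return an instance

   class FileReader:                           # candidate classes to choose among
       def read(self): return 'file data'

   class SocketReader:
       def read(self): return 'socket data'

   classname = 'FileReader'                            # 1) string name, e.g. from a file or GUI
   aclass = getattr(sys.modules[__name__], classname)  # 2) fetch the class object by name
   reader = factory(aclass)                            # 3) call the object to make an instance
   print(reader.read())                                # 'file data'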

Thanks,
--Mark Lutz (http://learning-python.com)


More Fun with Comprehensions

[Aug-2014] Learning Python 5E has two full chapters on iteration, comprehensions, and generators. I've coded a few additional examples recently that might shed more light on these tools—though, like many comprehensions, they seem to require more moments of great clarity than some other coding options might. See the recent example programs table for more on the two use cases mentioned below.

From the mergeall System:

# find differing bytes in two files, named by path1 and path2;
# assumes the files are small enough to fit in memory all at once:
# enumerate() and zip() are both iterables on 3.X that defer their
# results, but reading a file's bytes pulls them all into memory;

bytes1 = open(path1, 'rb').read()
bytes2 = open(path2, 'rb').read() 
[(ix, (b1, b2)) for (ix, (b1, b2)) in enumerate(zip(bytes1, bytes2)) if b1 != b2] 

Run live:

>>> path1 = r'c:\marks-stuff\sheets\somefile.XLS'
>>> path2 = r'c:\users\mark\desktop\somefile.XLS'
>>>
>>> bytes1 = open(path1, 'rb').read()
>>> bytes2 = open(path2, 'rb').read()                               # read raw bytes
>>> bytes1 == bytes2
False
>>>
>>> bytes1[:8], bytes2[:8]
(b'\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1', b'\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1')
>>>
>>> zipped = list(zip(bytes1, bytes2))
>>> zipped[:5]
[(208, 208), (207, 207), (17, 17), (224, 224), (161, 161)]          # combined bytes
>>>
>>> hex(208), hex(207), hex(17)
('0xd0', '0xcf', '0x11')
>>>
>>> [(b1, b2) for (b1, b2) in zipped if b1 != b2]                   # what differs?
[(250, 91), (11, 128), (165, 245), (157, 242), (173, 175)]
>>>
>>> len([(b1, b2) for (b1, b2) in zipped if b1 != b2])
5
>>> [(ix, (b1, b2)) for (ix, (b1, b2)) in enumerate(zipped) if b1 != b2]      # where?
[(145517, (250, 91)), (145518, (11, 128)), (145519, (165, 245)), (145520, (157, 242)), (145521, (173, 175))]

From the pystockmood System:

# list construction and filtering, for terms used in text matching;
# combine nouns with verbs, and remove any items having duplicate
# prefixes, else they will match to the subject text redundantly;

nouns = ['wall street',  'stocks', 'markets']                       # plus others
verbs = [('rises', 'falls'), ('rose', 'fell'), ('rise', 'fall')]    # [(good, bad)]

# combine noun/verb
goodterms = [(noun + ' ' + good) for noun in nouns for (good, bad) in verbs]
badterms  = [(noun + ' ' + bad)  for noun in nouns for (good, bad) in verbs]

# fixup: 'x rise' is a prefix of 'x rises' => count for first term only!
goodterms = [term for term in goodterms if not
                  [other for other in goodterms
                             if other != term and term.startswith(other)]]

badterms  = [term for term in badterms if not
                  [other for other in badterms
                             if other != term and term.startswith(other)]]

Run live:

>>> nouns = ['wall street',  'stocks', 'markets']                   
>>> verbs = [('rises', 'falls'), ('rose', 'fell'), ('rise', 'fall')]
>>>
>>> goodterms = [(noun + ' ' + good) for noun in nouns for (good, bad) in verbs]
>>> goodterms
['wall street rises', 'wall street rose', 'wall street rise', 
 'stocks rises', 'stocks rose', 'stocks rise', 
 'markets rises', 'markets rose', 'markets rise']
>>>
>>> goodterms = [term for term in goodterms if not
...                   [other for other in goodterms
...                              if other != term and term.startswith(other)]]
>>> goodterms
['wall street rose', 'wall street rise', 
 'stocks rose', 'stocks rise', 
 'markets rose', 'markets rise']


That Weird Iterator Example Sidebar in Chapter 20

Update, Nov-2019 Python 3.7 unfortunately made a major change to generator semantics which broke this book example altogether. The code below is still illustrative and can still work as shown, but you must recode it with an explicit return for use in 3.7 and later, per the full coverage ahead (and the personal opinions of people with time to break others' code).

[Dec-2014] A reader wrote seeking clarification about the iterators example in the sidebar on pages 621-622 (644-645 in later printings), titled Why You Will Care: One-Shot Iterations. The example:

def myzip(*args):
    iters = map(iter, args)
    while iters:
        res = [next(i) for i in iters]
        yield tuple(res)

taken verbatim from Python's standard manuals, works in 2.X but fails in 3.X, and was included as a prime example of the possibly unexpected consequences of 3.X iterator changes. In short, the change of map() results from lists to iterables is not just an interactive display issue; it can also lead to very subtle errors in 3.X, especially for code that expects the former list-like iteration behavior.

The incorrect code was included only to illustrate this point. Its intent was not to pick on Python manual writers; but if they didn't catch this, chances are good that it may trip up an unwarned LP5E reader too. I've trimmed much of the reader's mail below, as it chronicled a long and winding road in search of answers from other resources, including Stack Overflow, manuals, and mailing lists; indeed, this issue seems only weakly understood in the Python world at large. In any event, it has proven confusing for enough readers to merit a reply paste here.

> -----Original Message-----
> From: ...
> To: lutz@rmi.net
> Subject: Iterators and iterables (string vs list)
> Date: Sun, 7 Dec 2014 12:01:40 +0200
> 
> I'm currently EE student (last year) and I'm studying Python for my B.Sc.
> project.
> 
> For the above purpose, your book was chosen.
> Yep, till now it's the best way for me to learn this language from scratch.
> (just tried to make some compliment here ^_^).
> 
> Note: before actually writing to you, I tried to solve this issue by myself:
> googled it, and wrote the letter to python mailing lists (but till this
> moment they were pretty silent with answers). 
> 
> Anyway, in order to save your time I'll jump to the issue itself.
> At this moment I got to chapter 20 which talks about Comprehensions and
> Generations.  By the end of this chapter you provided some example  
> regard to "myzip" function and inherent iteration issues in it.
> 
> So here is a quote from your book(pages: 621-622, "Learning Python", 5-th Ed.):
> [...]
> 


First off, this example, taken from the Python manuals, has been 
notoriously confusing to many readers, and you've clearly done 
much valuable research on this already.  Replies to your three
somewhat related questions:



> 1. What actually map() trying to do in Python 3.X? 
> 
> I mean, why is this works fine:
> >>> L = [1, 2, 3, 4]
> >>> k = iter(L) - what actually happens here?
> >>> next(k)
> 1
> and so on.


This creates an iterator on the list L itself — an object 
that produces the items in the list upon next() requests: 
the integers 1, 2, and so on.



> But not this: 
> >>> list(map(iter, L)) ---- and what happens here?
> TypeError: 'int' object is not iterable


The map(F, I) call applies function F to each item in iterable I.
It creates the results series:

   F(I.next()), F(I.next()),  F(I.next()), ...

In 2.X, map() produces and returns the results of this process 
all at once in a new list.  In 3.X, map() returns an iterable 
result object that delays the work until it's asked for a next 
result; the list() forces this object to produce all its results 
at once, and stores them in a new list for display or other 
purposes.

In your specific usage — list(map(iter, L)) — the map() call
is trying to apply iter() to each item _within_ iterable L, 
not to L itself (as in your prior code).  In equivalent 
indexing notation, this produces the results series:

   iter(L[0]), iter(L[1]), iter(L[2]), ....

This won't work because the items within L are integers which 
do not support iteration.



> 2. Why strings are allowed(privileged)  "to become" an
> iterators(self-iterators)? 


Because strings are always sequences of 1-item strings.  That 
is, strings are iterable themselves, but so are their individual
components by definition.

This property is unique to strings, and stems from the fact that 
Python has no distinct type for individual characters.  In C, 
for example, strings are arrays of characters, and characters 
are atomic data items, that (usually) correspond to byte values. 
In Python, there are only strings, whose components are also 
strings of length 1; hence, the characters in a string are
themselves strings, and may be indexed, sliced, iterated, etc.:

   >>> x = 'spam'
   >>> x[0]               # first item in a string
   's'
   >>> x[0][0]            # but it's also a string of len 1...
   's'
   >>> list(iter(x))      # iterate over string itself
   ['s', 'p', 'a', 'm']
   >>> list(iter(x[0]))   # iterate over string's component
   ['s']

This doesn't generally work for lists or tuples, which are 
heterogeneous collections of arbitrary object types — except,
of course, for items in a list that happen to be 1-character 
strings:

   >>> y = [1, 'p', 2]
   >>> y[0]               # first item is an integer 
   1
   >>> y[0][0]            # not a sequence or iterable
   TypeError: 'int' object is not subscriptable
   >>> y[1][0]            # but second item is... 
   'p'
   >>> y[1][0][0]         # and so on: str[i] is always a str
   'p'

Confusing, perhaps, but it's a fundamental Python design 
choice, and hopefully clarifies the rest of your question:
items in a string are nested strings, and hence iterable
themselves, but that's not usually true for other more 
general collection types that are not homogeneous:


> I mean why, is this possible:
> >>> print(list(map(iter, S)))
> [<str_iterator object at 0x02E24FF0>, 
> <str_iterator object at 0x02E24CF0>, 
> <str_iterator object at 0x02E24E10>,
> <str_iterator object at 0x02E24DF0>]     
> 
> I'm just trying to say, is that if I wouldn't tried to run the book's
> example with integer arguments (or tuples or lists as arguments)  it 
> wouldn't alarm this issue.
> 
> And I would have lived happily assuming that I understand iterables. ))))
> Those examples works fine with strings but not with list/tuples etc.



> 3.	The last question
> You say:
> " But it falls into an infinite loop and fails in Python 3.X, because 
> the 3.X map returns a one-shot iterable object instead of a list as 
> in 2.X. In 3.X, as soon as we've run the list comprehension inside 
> the loop once, iters will be exhausted but still True. [...]
> To make this work in 3.X, we need to use the list built-in function 
> to create an object that can support multiple iterations". 
> (Like:"Wat?!" ^_^)


Well, a list.  Lists support multiple iterations (scans), but 
map() result objects do not.  You cannot rescan a map() result
by itself more than once, because it's empty after the first 
scan (in the book's phrasing, it's a "one-shot iterator").  
But wrapping a map() result object in a list() call collects 
its items in a list object which does allow multiple scans:

   C:\...> py -3
   >>> L = [1, 2, 3, 4]
   >>> [x * 2 for x in L]          # iterate across a list
   [2, 4, 6, 8]
   >>> [x * 2 for x in L]          # we can go again here...
   [2, 4, 6, 8]

   >>> M = map(abs, [1, 2, 3, 4])  # abs(X) simply returns X here
   >>> [x * 2 for x in M]          # iterate over a map() result
   [2, 4, 6, 8]
   >>> [x * 2 for x in M]          # <== but it's empty after 1 pass...
   []

   >>> LM = list(map(abs, [1, 2, 3, 4]))
   >>> [x * 2 for x in LM]
   [2, 4, 6, 8]
   >>> [x * 2 for x in LM]         # copying to a list works...
   [2, 4, 6, 8]

   >>> M = map(abs, [1, 2, 3, 4])
   >>> [x * 2 for x in M]
   [2, 4, 6, 8]
   >>> M = map(abs, [1, 2, 3, 4])  # or make a new map() object...
   >>> [x * 2 for x in M]
   [2, 4, 6, 8]



> Why the infinite loop would be there and why should list() to make it
> finite?  o_0 


This is probably the most confusing part of the book's admittedly
tricky example, and of its failure in 3.X.  It occurs because once 
the map() result object has been emptied by a single scan, it still 
always tests True as a Boolean, despite being empty:

   >>> M = map(abs, [1])
   >>> next(M) 
   1
   >>> next(M)                     # now always empty in 3.X
   StopIteration
   >>> bool(M)                     # <= but True nonetheless...
   True

   >>> I = iter(M)                 # still empty/True if new scan tried
   >>> next(I)
   StopIteration
   >>> bool(M)
   True

This throws off the example's logic, triggering the infinite loop
in 3.X.  Specifically, the map() result object produced by the
example's: 

   iters = map(iter, args)

is empty after its first scan of its iter() results within the 
loop, but also True thereafter:

   >>> iters = map(iter, ([1], [2, 3], [4]))
   >>> [next(i) for i in iters]
   [1, 2, 4]
   >>> bool(iters)
   True
   >>> [next(i) for i in iters]
   []
   >>> bool(iters)
   True
   >>> [next(i) for i in iters]      # infinite loop time...
   []
   >>> bool(iters)
   True

A list() call avoids this by allowing for multiple scans in 
3.X, and it's a non-issue in 2.X because map() returns a new 
list anyhow; in either case, the StopIteration is thrown in 
the loop correctly when any argument's iterator is exhausted.
Trace through the code again to see why.
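

For reference, here's a minimal sketch of the repaired pattern
(assuming a myzip-style generator like the book's; this is not its
exact code).  Note that in Python 3.7 and later, a StopIteration
escaping a generator becomes RuntimeError per PEP 479, so the
exhaustion test is made explicit here instead of letting next()
raise out of the generator:

   def myzip(*args):
       iters = list(map(iter, args))     # list() allows multiple scans in 3.X
       while iters:
           res = []
           for i in iters:
               try:
                   res.append(next(i))   # advance this argument's iterator
               except StopIteration:
                   return                # any exhausted: stop, like zip()
           yield tuple(res)

   print(list(myzip('abc', 'lmnop')))    # [('a', 'l'), ('b', 'm'), ('c', 'n')]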


[Some of this reply may show up (anonymously, of course) on my 
recent FAQs page, because it's officially attained common status.]

--Mark Lutz (http://learning-python.com)


More on map() with Differing-Length Iterables

[Jan-2015] A reader wrote seeking clarification on a Python 2.X failure for a map() example in the book. The edits this dialog spawned are recorded on the book's errata page at O'Reilly's site (look for page 617 there). The following gives the original email, followed by the discussion text from the errata post. This is mostly for 2.X readers (who may see the error), but it is also a general summary of the map() function's behavior in both Python lines.

> -----Original Message-----
> From: ...
> To: lutz@rmi.net
> Subject: Error? Learning Python 5e, p. 617
> Date: Wed, 24 Dec 2014 13:04:53 -0600
> 
> Hello ... enjoying the book so far. I am getting an error message for a mapping 
> function on p. 617 (“Example: Emulating zip and map with Iteration Tools), 
> using Python 2.7. I tried checking online resources, including errata on your 
> book’s website, to no avail.
> 
> The issue seems to be that map(pow, list1, list2) can’t tolerate lists of 
> different lengths (in contrast to the zip functions earlier on the page). 
> 
> Whereas:
> 
>     >>>map(pow, [1, 2, 3], [2, 3, 4])
> 
> returns:
> 
>     [1, 8, 81]
> 
> adding 5 to the second list to match the book’s example:
> 
>     >>>map(pow, [1, 2, 3], [2, 3, 4, 5])
> 
> results in:
> 
> Traceback (most recent call last):
>   File "", line 1, in 
>     map(pow, [1, 2, 3], [2, 3, 4, 5])
> TypeError: unsupported operand type(s) for ** or pow(): 'NoneType' and 'int'  
>  
> Since I’m using Python 2.7, I have omitted the list terms in these examples.
> 
> If I make the second list either shorter than, or longer than, the first I 
> get the error, which suggests that the function does not automatically stop 
> when it reaches the end of the shorter iteration. However, the book example 
> indicates that it should automatically stop at the end of the shorter list 
> iteration.
> 
> Could you explain why the book example seems to tolerate unequal list lengths,
> but my code does not?
>


[from the errata page's post:]

A reader wrote to ask why this example on page 617 of Chapter 20:

   >>> list(map(pow, [1, 2, 3], [2, 3, 4, 5]))   # N sequences: N-ary function
   [1, 8, 81] 

works on Python 3.X, but fails in 2.X.  This reader later withdrew the query, 
after finding the earlier 2.X map() coverage which notes its None padding when 
argument lengths differ (by contrast, 3.X's zip() and map() both truncate).  
This earlier coverage is on pages 408-409, in Chapter 19's section "map equivalence 
in Python 2.X."

In hindsight, though, the Chapter 19 section (and related material later in 
Chapter 20) is perhaps not as clear about the 2.X/3.X differences in the map() 
call as it could have been.  Really, the padding with None always occurs in 2.X, 
_irrespective_ of the function argument passed in.  In full detail:


In Python 3.X, map() always truncates at the shortest argument's length, 
and a real function is expected in its first argument:

   C:\...> py -3
   >>> list(map(pow, (2, 3), (1, 2, 3)))     # 2**1, 3**2
   [2, 9]

   >>> list(map(pow, (2, 3, 4), (1, 2)))     # 2**1, 3**2
   [2, 9]

   >>> list(map(None, (2, 3), (1, 2, 3)))
   TypeError: 'NoneType' object is not callable


In Python 2.X, map() always pads shorter arguments with None, regardless of 
whether a real function or None is passed — which can lead to errors for 
functions that don't expect the None:

   C:\...> py -2
   >>> list(map(pow, (2, 3), (1, 2, 3)))     # 2**1, 3**2
   TypeError: unsupported operand type(s) for ** or pow(): 'NoneType' and 'int'

   >>> list(map(pow, (2, 3, 4), (1, 2)))     # 2**1, 3**2
   TypeError: unsupported operand type(s) for ** or pow(): 'int' and 'NoneType'

   >>> list(map(None, (2, 3), (1, 2, 3)))
   [(2, 1), (3, 2), (None, 3)]


This is why page 617's "list(map(pow, [1, 2, 3], [2, 3, 4, 5]))" works 
in 3.X (as shown) but fails in 2.X (as not shown): on 2.X, the last 
function call runs "None ** 5" and fails.

In addition to the 2.X map() coverage on pages 408-409, this 2.X behavior is
strongly implied by the manual zip() and map() implementations that 
immediately follow in Chapter 20.  Still, the extension to a real function
argument isn't stated explicitly in either location.

Also note that the 2.X-flavor map() implementation in Chapter 20's section 
"Coding your own zip(...) and map(None, ...)" is really just that of 2.X's 
map(None,...), as it does not apply a function to paired items (though you 
could easily extend it to do so).  This example primarily implements a 
zip() with padding.
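

To illustrate the extension mentioned above, here's a hedged sketch
(not the book's code): it is written for 3.X but mimics 2.X map()
semantics, padding to the longest argument and applying a function
when one is passed; pass None to get the zip-with-padding flavor:

   def map2x(func, *seqs, pad=None):
       iters = [iter(s) for s in seqs]
       results = []
       while True:
           row, live = [], False
           for it in iters:
               try:
                   row.append(next(it))
                   live = True
               except StopIteration:
                   row.append(pad)               # pad exhausted arguments
           if not live:
               return results                    # all arguments exhausted
           results.append(tuple(row) if func is None else func(*row))

   print(map2x(None, (2, 3), (1, 2, 3)))         # [(2, 1), (3, 2), (None, 3)]
   print(map2x(pow, (1, 2, 3), (2, 3, 4)))       # [1, 8, 81]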

For more on map() in Python 3.X and 2.X, see also Python Pocket Reference 
(a supplement to Learning Python), as well as Python's own standard 
library manual.  As a tutorial, Learning Python occasionally sidesteps 
some obscure or dated fine points on purpose; this qualifies on both counts
(mapping functions on differing-length iterables in 2.X seems rare in the 
extreme), but the book example's potential to confuse merits the patches.


What Does "in" Do When Applied to a File?

[Jan-2015] A reader wrote O'Reilly's book support people asking about an example that applies the in membership operator to an open file object. I've seen this cause confusion before, so I'm posting the query and reply here; it's also an excuse to show an arguably powerful use case for the any() built-in, and preview the re pattern-matching module.

> -----Original Message-----
> From: BookTech 
> To: "lutz@rmi.net" 
> Subject: FW: Learning Python 5th Edition query
> Date: Mon, 5 Jan 2015 23:57:57 +0000 (GMT)
> 
> Hi Mark,
> Would you be able to assist me in answering this customer's question about 
> your book please?
> 
> --------------- Original Message ---------------
> From: ...
> Sent: 1/2/2015 2:38 AM
> To: bookquestions@oreilly.com
> Subject: Learning Python 5th Edition query
> 
> Hi,
> 
> I'm currently working my way through the above mentioned book and I'm a
> little confused by the results produced in one of the examples.  On page 431
> the author gives us a list of examples in which we can see the iteration
> protocol at work on a file and all of them work just fine except the
> membership test and although this makes sense to some degree, the author
> clearly expected it to work.
> 
> I'm using my own file but I know one of the tests should produce a True
> result but it doesn't.  I've now tried this in both Python v2.7.6 and v3.3
> but I always get a False result.  
> 
> I've attached the file I'm reading from and copied and pasted my own two
> forms of the example below;
>
> print('blue' in open('test.txt'))
> print('brown' in open('test.txt'))
> 
> As I've said I was not overly surprised that the test fails because this
> example isn't really asking for the file to be read, it is a
> straight-forward test and the object it is testing again isn't a string, so
> I would have expected an exception to be raised instead.
> 
> I imagine there are other ways to perform the test but I'm curious to know
> why this just produces a False.
>


Sure; this is actually simpler than the reader might expect.
Look carefully at what the "in" tests in the book are doing:

   >>> 'y = 2\n' in open('script2.py')      # Membership test
   False
   >>> 'x = 2\n' in open('script2.py')
   True

These are testing for the presence of an entire line, not a 
word within a line.  That's why there is an explicit '\n' at the 
end of the test strings.  This must be so, because the file object 
iterator returned by open() and activated by "in" iterates 
through full line strings, not individual words or characters.

Hence, in the reader's case and test file, a test for an 
individual word will never be True, but testing for a full 
line (as in the book) does work in both Python 2.X and 3.X:

   C:\...> type test.txt
   The quick brown fox
   jumped over the lazy hen
   Mary had a little lamb
   its fleece as white as snow

   C:\...> py -3
   >>> 'brown' in open('test.txt')
   False
   >>> 'The quick brown fox\n' in open('test.txt')
   True
   >>> 'The quick blue fox\n' in open('test.txt')
   False


If you really want to test for individual words within
lines, you might try an outer for loop that scans lines,
prints an affirmative on a match, and a negative in the loop's
else clause (which runs only if the loop didn't hit a break):

   >>> for line in open('test.txt'):
   ...     if 'brown' in line:
   ...         print('yes')
   ...         break
   ... else:
   ...     print('no')
   ...
   yes
   >>> for line in open('test.txt'):
   ...     if 'blue' in line:
   ...         print('yes')
   ...         break
   ... else:
   ...     print('no')
   ...
   no


Alternatively, you might apply the "in" to each line with 
a list comprehension, and run the result through the any()
built-in to see if any came out True:

   >>> [('brown' in line) for line in open('test.txt')]
   [True, False, False, False]
   >>> [('blue' in line) for line in open('test.txt')]
   [False, False, False, False]

   >>> any([('brown' in line) for line in open('test.txt')])
   True
   >>> any([('blue' in line) for line in open('test.txt')])
   False


But if you've gone to that much bother, a generator
expression saves having to type the square brackets
(and building a list of Boolean results in memory):

   >>> any(('brown' in line) for line in open('test.txt'))
   True
   >>> any(('blue' in line) for line in open('test.txt'))
   False


Caveat: this is a bit inaccurate, as the search for "brown"
will also report True for a "browning" in the file, which 
may or may not be what you want.  To look for whole words 
only, you might first split line strings on delimiters such 
as whitespace:

   >>> 'The quick brown fox\n'.split()
   ['The', 'quick', 'brown', 'fox']

   >>> any(('bro' in line) for line in open('test.txt'))
   True
   >>> any(('bro' in line.split()) for line in open('test.txt'))
   False
   >>> any(('brown' in line.split()) for line in open('test.txt'))
   True

On the other hand, splitting precludes searching for arbitrary
line substrings:

   >>> any(('brown fox' in line) for line in open('test.txt'))
   True
   >>> any(('brown fox' in line.split()) for line in open('test.txt'))
   False


Finally, if the file being processed is small enough to fit
into memory, it's both quick and easy to load it into a 
single string with the file.read() method, and apply "in" 
to do substring search — which is close to the original 
intent, but applied to the full file, not its line strings
(and will fail for pathologically large files):

   >>> 'brown' in open('test.txt').read()
   True
   >>> 'blue' in open('test.txt').read()
   False

   >>> 'bro' in open('test.txt').read()
   True
   >>> 'bro' in open('test.txt').read().split()
   False
   >>> 'brown' in open('test.txt').read().split()
   True


As a postscript (and preview of a topic in the domain of the 
book Programming Python): splitting on whitespace may not
suffice if the file contains punctuation characters; a comma
immediately following a word is enough to throw this scheme off. 
If your text is like this, you can still match whole words more 
generally by splitting on a set of alternative delimiters with 
Python's re pattern-matching module.
 
Here's the case for one line; apply this within a line iteration
or after a full text read as appropriate; precompile the pattern 
string for speed as needed; and see the mentioned book, the book 
Python Pocket Reference, or Python's library manual for more 
details on this module:

# whitespace splitting: some words missed

   >>> line = 'Brown,green; light-blue,  red.  And orange!'

   >>> line.split()
   ['Brown,green;', 'light-blue,', 'red.', 'And', 'orange!']

   >>> 'green' in line.split()
   False

# smarter splitting: 1 or more of any in the [] set, \s=whitespace

   >>> import re
   >>> re.split(r'[,;.!\-\s]+', line)
   ['Brown', 'green', 'light', 'blue', 'red', 'And', 'orange', '']

   >>> 'green' in re.split(r'[,;.!\-\s]+', line)
   True

# ignoring case

   >>> 'brown' in [s.lower() for s in re.split(r'[,;.!\-\s]+', line)]
   True

# using dashes or not, generator expr

   >>> 'blue' in (s.lower() for s in re.split(r'[,;.!\-\s]+', line))
   True
   >>> 'blue' in (s.lower() for s in re.split(r'[,;.!\s]+', line))
   False

# applying to all lines in a file

   >>> import re
   >>> print(open('test2.txt').read())
   The quick, brown, fox
   jumped over the lazy hen
   Mary had a little lamb
   its fleece as Blue-White as snow

   >>> any( ('blue'  in re.split(r'[,;.!\-\s]+', line)) for line in open('test2.txt') )
   False
   >>> any( ('brown' in re.split(r'[,;.!\-\s]+', line)) for line in open('test2.txt') )
   True

   >>> any( ('blue' in (s.lower() for s in re.split(r'[,;.!\-\s]+', line))) for line in open('test2.txt') )
   True
   >>> any( ('blue' in (s.lower() for s in re.split(r'[,;.!\s]+', line))) for line in open('test2.txt') )
   False


Cheers,
--Mark Lutz (http://learning-python.com)


"Programming Python" Content and Status

[Feb-2015] A LP5E reader wrote with questions and suggestions for both Learning Python and Programming Python. As the latter is an applications-level follow-up to the former, this is relevant to both Learning Python readers and this page. Some of the replies pasted below may help explain the purpose and goals of Programming Python, as well as constraints inherent in writing such books.

> -----Original Message-----
> From: ...
> To: lutz@learning-python.com
> Subject: Information regarding Programming Python 5th ed.
> Date: Mon, 09 Feb 2015 15:18:41 +0000
> 
> Hi Mark,
> I am a student from India and have been following your book Learning Python
> very closely and its helping me a lot. Since I'm about to finish Learning
> Python and very excited to jump into "Programming Python", I have a few
> questions regarding the book Programming Python :
> 
>    [...specific questions quoted below...]
> 
> Thanks for your time.
> Regards,
> ...
> 


Hello,

Thanks for your note, and best wishes with the books and Python.
Responding to your queries:


>    1. Is the 5th edition going to be released in the near future? Because
>    4th edition was in 2010 and there have been significant changes in python
>    since then.


No, there is no 5th edition planned for this book today.  I don't
believe there will be one in the next few years, if at all.

In general, this book is a tutorial on getting started in common 
application domains - the web, GUIs, systems, text, databases, and 
so on.  It teaches these domains' fundamentals that span, and are 
prerequisite to using, more specific tools.  Its CGI coverage, 
for example, lays the Web scripting groundwork needed to understand
and properly use more advanced frameworks such as Django.  As such, 
this book's material is not out of date with, and is even largely 
immune to, the latest twists and turns of the software field.

One more specific note: this book's examples are known to work under 
Python 3.3 and 3.4 [edit: and later, 3.5] with only minor patches, 
and so are as current as they can be.  For more on this, see the 
following page, especially if you purchase the book for use under 
the latest Python 3.X:

   README-PP4E-PY33-PY35.html


>    2. If 5th edition is underway, does it include topics that are new to
>    python3 like asyncio etc.


Per the prior point, there is no 5th Edition underway.
The asyncio module would be a prime new topic, of course, 
but it's a bit unproven given how new it is, and there is 
ample coverage of related parallel-processing topics in the 
book.  Again, the book stresses the fundamentals underlying 
modules like asyncio, not just the API details of specific 
libraries.  This is especially true for newly emerged tools.


>    3. Current edition (4th) of Programming Python explains about CGI
>    scripts for web-based programming, are there any plans to include stuff
>    like WSGI, and werkzeug  / other
>    REST API based frameworks.


No, again per the first point.  This book covers general 
fundamentals that span tools, rather than trying to cover the 
latest popular tools — which, time has shown, often have a 
heyday that is not as long as the shelf life of a book.


>    4. Learning Python mentions about coroutines in Generators section while
>    talking about the "yield" statement, however there is no further discussion
>    of coroutines in later chapters or Programming Python. Is this topic
>    included in next version given the popularity of coroutines in libraries
>    like "gevent" and "tornado".


I appreciate the suggestion.  There is no Learning Python update on
the drawing board today, and there won't be for years.  This book is 
just 1.5 years old today, after all; if there is an update, it most 
likely won't be until 2017 and Python 3.6, given the book's normal 
update cycle.  That said, more on coroutines might work (especially 
use of the latest "yield" extensions), if its audience is large 
enough to justify the added size.

[Update: see also here and here.]


>    5. Lastly as a feedback, I would humbly request you to add an (optional)
>    section about a guide to contributing to python opensource project and
>    briefly explaining how the important files in the source result into a
>    minimalist python (basically which important files do what) and how to use
>    debugger to find the file related to a particular bug. This could be
>    important for the python community as Guido Van Rossum has constantly been
>    discussing about the need to get more people in the python core development
>    and also given that many college students are now interested in
>    contributing to python who have relatively lower experience/understanding
>    about design of programming languages and compilers.


All good ideas as well (and some of which I've addressed in earlier 
books).  Unfortunately, this level of detail tends to change too 
frequently to codify in books that may be around for a decade or 
more.  My general policy is that "ecosystem" topics like source code 
structure, PyPI, and development procedures are best addressed on 
the web, where they're much more easily updated than in books. 
PyPI, for example, did not exist when earlier editions were 
published, and could be subsumed by other tools in the future.

Still, these are all useful topics; thanks for the suggestions,
and again, best wishes with Python.

--Mark Lutz (http://learning-python.com)


Hiding Standard Modules Can Make IDLE Fail Too

[Apr-2015] A reader wrote with an IDLE usage note that underscores some of the subtlety of module search paths:

> From: ...
> To: lutz@rmi.net
> Subject: Question...
> Date: Mon, 13 Apr 2015 17:36:20 -0700
> 
> I'm working my way through "Learning Python," 5th edition. On page 724
> there's some code that involves creating a "string.py" in the main
> ("c:/code") directory.
> 
> With the basic Python 3.4 distribution, this doesn't seem to work - when I
> create a file by that name, I can't get the Idle shell to open. It runs as
> expected in Python from command line, but this is significant enough that I
> thought it should be mentioned in future editions...


Thanks for your note; I'll consider adding a footnote on this in the 
future.  This is a bit grey, because:

- It's somewhat implied by this section's coverage (it demonstrates 
  a module in CWD hiding one in the std lib, which is just what happens 
  to IDLE if it's run in this CWD).

- The example works as run in the book (it has explicit command-line 
  prompts to give its usage mode). 

- Most readers probably aren't having the issue, because they are 
  launching IDLE by clicks instead of command lines: other launch modes 
  won't run IDLE in the CWD where string is redefined.  In fact, this 
  is why the examples must be run from a command line instead - IDLE won't 
  see your CWD if clicked in a file explorer.  Getting IDLE to see your 
  CWD via a command line is what triggers the issue the section demonstrates.

That is, IDLE is a Python program and follows the normal module
lookup rules for the tools it requires.  As the current directory 
is searched first, a string.py there can break IDLE, but only if
IDLE is launched in that directory from a command line.  Clicking 
to launch runs IDLE in a different directory, and without a problem.
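

A quick way to see the effect for yourself (a hedged sketch, not from 
the book): check which directory is searched first, and which file 
actually satisfied the import:

   import sys
   print(sys.path[0])        # '' or the launch directory: searched first

   import string             # a ./string.py here would hide the stdlib module
   print(string.__file__)    # shows which file satisfied the import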

Still, this could be useful, and even informative, to note.

--Mark Lutz (http://learning-python.com)


Running Scripts 2: General Pointers, Using "cd"

[May-2015] A reader wrote with confusion on launching Python scripts; this is beginner-level material, but may be common enough to warrant a paste here.

Update, Feb-2018: If asked today, I'd add to the reply below that the PyEdit program available freely on this site is a lightweight alternative to IDLE for editing and running Python code, especially for users just getting started with Python. Grab the app or executable for your desktop platform and experiment with it to see how. As of 2019, PyEdit's source-code package can also be used to edit and run your Python code on Android devices (if you're brave enough to try). There's more about running the book's examples on mobile devices in general at this note.

> From: ...
> To: lutz@learning-python.com
> Subject: sorry but I am stuck
> Date: Sat, 2 May 2015 15:14:37 +0000 (UTC)
>
> Hello, 
> I'm new to programming and stuck in the first concept.  I bought the book through B&N. 
> I created a file in python, then saved it in a new folder on the c drive, just like 
> you explained in the book.  I tried it both ways with python interphase (GUI) and with 
> notepad (saving it with the py extension) and none of them work.  When I go to the 
> command line (it is already in python >>>) and type 
> c:\code\script1.py 
> it says that there is a syntax error 
> it seems like the error is in the : 
> I spent about two hours looking for a solution online.  Other people seemed to 
> have the same issue but no response.  I downloaded 2.7. 
> Thanks 


Have you worked this out yet on your own?  There are ample
resources in the book on running programs, but the first 
steps can be challenging for those new to programming.

Unfortunately, it's difficult to assist without seeing 
exactly what is failing for you.  One specific note: 

> When I go to the command line (it is already in python >>>) 
> and type c:\code\script1.py 

In this case, it appears that you are attempting to run a 
system shell command (to launch a program) at the Python 
prompt.  Per the book, that never works — you can type only 
Python code at the Python prompt.  You need to type and run
the "c:\code\script1.py" command from a basic system shell,
not from a Python session.  On Windows, that means it must be 
typed in a Windows Command Prompt window without starting the 
">>>" Python session.  That is:
  1. Open a Command Prompt window via Windows Start or Run menus
  2. Run system command "c:\code\script1.py"
Or, if the prior doesn't work due to bad filename associations:
  1. Open a Command Prompt window via Windows Start or Run menus
  2. Run system command "python c:\code\script1.py"
Or, if the prior doesn't work because Python isn't on your PATH (as described in an appendix in the book):
  1. Open a Command Prompt window via Windows Start or Run menus
  2. Run system command "c:\python27\python c:\code\script1.py"
To avoid typing the script's path, cd (change directory) to its folder first; in system commands, script names without their full paths are taken as relative to the current directory:
  1. Open a Command Prompt window via Windows Start or Run menus
  2. Run system command "cd c:\code" to go to the script's directory
  3. Run system command "python script1.py"
I suspect you were missing the "cd" command in this procedure.  For example:

   ...open Command Prompt...
   C:\Users\you> cd c:\code
   c:\code> python script1.py
   win32
   4294967296
   Spam!Spam!Spam!Spam!Spam!Spam!Spam!Spam!
   c:\code>

For more on the "cd" system command, try a "help cd" in a Windows Command 
Prompt, or http://en.wikipedia.org/wiki/Cd_%28command%29.

Other Options.  The preceding deals with running scripts from a command line, 
but, as covered in the book, there are other launch options.  Alternatively, 
you can type an "import script1" Python command at its ">>>" prompt, but this 
works only if the window where the ">>>" appears is running in the "c:\code" 
directory (else Python cannot find your file in the current working directory).  
To use this technique, you must:
  1. Open a Command Prompt window
  2. Run system command "cd c:\code" to go to the script's directory
  3. Run system command "python" to start the ">>>" Python session
  4. Run Python command "import script1" (with no ".py") to run your script
This runs the file as it was when Python started up; to see changes you've made 
to the file, you may need to reload() or restart the Python session.

Finally, you can run the script from IDLE and skip command lines altogether, 
using IDLE's pulldown menu options (IDLE also changes to the script's directory 
automatically):
  1. Start IDLE via Python menu in the Start menu (or other technique)
  2. Open the script's code in IDLE via "File->Open" in IDLE's main window
  3. Run the script's code via "Run->Run Module" in the newly opened window
You may also run a script by clicking its icon, but this fails if there are 
errors in the code; IDLE or command lines work better.

All of this is covered in the book, so I encourage you to reread the early 
chapters if you're still having problems.  If your script generates Python 
error messages when run, you have passed the first hurdle (it's running), but 
will need to make sure you copied its code exactly as shown in the book to 
avoid syntax errors from Python.

--Mark Lutz (http://learning-python.com)


More on List-Based Matrix Summation

[Nov-2015] A student recently asked about summing the elements of a list-of-lists matrix structure. The book covers most of this directly on page 113 (and uses sum() elsewhere), but the final step is added here; its generator-expression variant leverages the automatic iteration performed by the sum() built-in:

>>> M = [[1, 2, 3],                     # 3x3 2D matrix
...      [4, 5, 6],
...      [7, 8, 9]]

>>> M                                   # really a list of row lists
[[1, 2, 3], [4, 5, 6], [7, 8, 9]]      

>>> M[1]                                # second row (in book)
[4, 5, 6]

>>> [row[1] for row in M]               # second column (in book)
[2, 5, 8]

>>> [sum(row) for row in M]             # sum of rows (in book)
[6, 15, 24]

>>> sum([sum(row) for row in M])        # sum of all items: sum of row sums
45

>>> sum(sum(row) for row in M)          # same, but via generator, not list
45
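

As a small extra (not part of the original exchange): column sums work 
the same way, using zip(*M) to transpose the matrix's rows:

>>> [sum(col) for col in zip(*M)]       # column sums: zip(*M) transposes rows
[12, 15, 18]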


Loading Matrix Data from a Text File

[Nov-2015] A reader wrote asking how to load numeric data from a text file. Because this sort of basic file-processing task is common to a wide range of applications, it merits a post here. Like all replies on this page, this one hopes to address common queries (and not homework deadlines...).

Query:

> From: ...
> To: lutz@rmi.net
> Subject: Python Help
> Date: Mon, 26 Oct 2015 08:30:03 +0200
> 
> Hi Mark,
> 
> I am lost in Python, can you guide me please.
> 
> I got a protein distance matrix (you can check the attachment)
> How can I read this file and scale this value on a scale of 0 to 1 usign
> sigmoid function.
> 
> I know we can do this in Python like this:
> 
> import math
> def sigmoid(x)
>      return 1/(1+math.exp(-x))
> 
> but I don't know how can I read and change this file values?
> 
> Any help would be welcome, thanks!



Attached data file, distance_matrix:
    8
gi15801179  0.0000 42.4581 10.6714 39.6484 15.0681  9.0639 10.8328 16.3808
gi9967069  42.4581  0.0000 42.9834 10.7504 14.9194 14.6448 41.0313 10.1185
gi15925280 10.6714 42.9834  0.0000  5.9973 12.5600 40.5210  5.8560 27.7503
gi12313641 39.6484 10.7504  5.9973  0.0000 15.4373 40.4623  2.3851 34.7955
gi14719485 15.0681 14.9194 12.5600 15.4373  0.0000 12.4809  8.8614 27.0177
gi4758426   9.0639 14.6448 40.5210 40.4623 12.4809  0.0000 42.7092 20.1177
gi6633958  10.8328 41.0313  5.8560  2.3851  8.8614 42.7092  0.0000 27.7887
gi21730171 16.3808 10.1185 27.7503 34.7955 27.0177 20.1177 27.7887  0.0000


Reply:

I don't understand your application's goals, of course, but you can 
parse and load the data file with code of the following sort.  Given
that the data is whitespace-delimited text, the trick is to split and 
convert, while reading line by line.

To update a text file like this, you'll probably write or print lines 
to a new version of the file as shown, with a space or tab ('\t') 
between each number.  Binary files imply different loading techniques,
and can be updated in-place instead via file seeks.


# file scan.py

data = open('distance_matrix')                   # open data in input mode
newdata = open('new_distance_matrix', 'w')       # open results output file

numcols = int(data.readline().strip())           # line 1, less blanks, str->int
print(numcols)

for line in data:                                # for each line left
    cols  = line.split()                         # split on whitespace
    label = cols[0]
    vals  = [float(text) for text in cols[1:]]   # strings -> numbers
    print(label, '=>', vals)

    for (ix, val) in enumerate(vals):
        # here, you may want: vals[ix] = sigmoid(val)?
        # not sure of the purpose of your numcols line
        # enumerate() makes range(numcols) superfluous 
        pass

    # write results: list -> text
    newdata.write('\t'.join(str(v) for v in vals) + '\n')
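

For completeness, here's a hedged variant of the same scan (assuming the 
sigmoid() scaling is indeed the goal): it applies the function to every 
value, keeps the row labels, and closes both files via "with":

# file scan2.py (a sketch, not part of the original reply)

import math

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

with open('distance_matrix') as data, open('new_distance_matrix', 'w') as newdata:
    numcols = int(data.readline().strip())            # header line: column count
    newdata.write('%d\n' % numcols)

    for line in data:
        cols = line.split()                            # label, then numbers as text
        vals = [sigmoid(float(text)) for text in cols[1:]]
        newdata.write(cols[0] + '\t' + '\t'.join(str(v) for v in vals) + '\n')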


--Mark Lutz (http://learning-python.com)


More Fun with map+lambda: Nested Loops

[Jun-2016] A reader wrote asking for comments on some code written to perform a nested loop with a map()/lambda combination. This is an extension of the map() coverage in the book—which stops short when the nesting becomes complex enough to qualify as cruel and unusual punishment. Nevertheless, because other readers might be interested in the subject, I've posted the code of my reply giving a handful of alternative solutions here:

     map-nesting-example.py

There certainly are additional alternatives, and some might even be fun to play with, but such code already pushes the envelope on readability enough to merit a stop here. Excerpted text from the reader's email:

Hello sir, I am currently reading your book learning python(5th ed.) I
was stuck at the part where using only map one can implement nested
loops. I came up with a solution that works for 2 nested levels, i.e.
for the same test case given in the book.

...[see code in linked file above]

I know the above might be bad code. I have attached an image that may
be used for more than 2 nested levels(N levels). Could you kindly tell
me if I'm wrong and how should I correct it. I haven't checked it yet
for more than 3 levels.

Thank you for this wonderful book. Your book has made me realize and
appreciate the beauty of the python language and it's design.
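

As a tiny, hedged illustration of the general idea (neither the reader's 
code nor the book's example): a nested comprehension, and one map/lambda 
emulation that flattens the inner results:

   pairs = [x + y for x in 'sm' for y in 'SP']             # the readable way
   print(pairs)                                            # ['sS', 'sP', 'mS', 'mP']

   rows = map(lambda x: list(map(lambda y: x + y, 'SP')), 'sm')
   print(sum(rows, []))                                    # same result, flattened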


On "The Shallows" and Wake-Up Calls

[Apr-2015] A reader wrote with observations and questions about The Shallows, a book mentioned in Learning Python's Preface (see the earlier post). As this seems of general interest, the trimmed query and my slightly edited reply are below.

> From: ...
> To: lutz@learning-python.com
> Subject: The Shallows
> Date: Fri, 20 Mar 2015 10:44:12 -0500
> 
> Mr. Lutz,
> 
> I'm working my way through your *Learning Python* book, 5th Edition. In
> your preface you gave thanks for the book *The Shallows* because of its
> wake-up call in your life.
> 
> Because of that comment, I bought and read the book *The Shallows,* and I'm
> curious about how this book affected you. It seems that Python forms part
> of "the system" called the Internet that is slowly eroding away at the
> intellectual aspects of life in the modern world.
>
> [...]
> 
> How did that book give you a wake-up call? Can we study technology (like
> Python or networking principles) and still maintain that "depth of thought"
> Carr says is disappearing in our culture because of technology?
> 
> Maybe I answered my own question when I bought the paper copy of *Learning
> Python* and not the electronic copy.
>
> [...]
> 
> Thank you for your book. It's the best I've found on Python.


Thanks for your note, and I'm glad you found both books useful.

For me, the wakeup call of "The Shallows" was a reminder that I've
worked for 3 decades in a field that's great at asking the "how"
questions, but often lousy at the "why".  That's left us with devices 
and an Internet that have changed society in titanic ways in a 
very short amount of time — and nobody in the field seems to 
have asked whether this is a good thing.  

It may be too late for such questions, given the leveling of entire
industries that the web has wrought; and it may be unrealistic given 
the money that's flooded the field.  But it's my hope that we proceed
with more forethought than in the past.  Not only may there be adverse
cognitive consequences per "The Shallows", but the social and political
implications of online lives are enormously perilous.

As for Python's role: yes, it was there at the start and remains a 
major player.  It was used for Google's first 1996 web spider, is behind
YouTube and Dropbox, and is at the core of much data analysis.  The 
latter of these today often takes the form of web tracking, which has 
unfortunately spawned an entire economy based upon violating personal 
privacy for profit.  That's about as ugly as it can be.

I suppose those of us who contributed to the rise of the web are akin 
to the physicists whose theoretical work led to atomic weapons; a 
toolmaker isn't responsible for what is created with the tool, but does 
have some ethical obligation to sound the alarm when it is misused.

As examples, see my recent notes on cloud storage and privacy issues:

  Cloud tradeoffs

  Calendar snooping

  Email snooping

And a later sampling:

  Smartphone agendas

  More on smartphones

  Web privacy

  Advertising foo

  Online learning

  Crowd-sourced knowledge

  Etcetera

Which is not to say that either the Internet or Python is all bad. 
To be sure, Python programming can instill the very depth of thought
whose demise "The Shallows" laments.  But something with very large 
impacts should come with very large cautions.

Best wishes with Python,
--Mark Lutz (http://learning-python.com)


A Basics Potpourri: Returns, Constructors, and "self"

[Nov-2018] A reader wrote with some basic-level queries about function return values, class constructor methods, and the need for a "self" in class method functions. These topics are all covered in quite a bit of depth in the book, but in case it might help anyone else out there with similar puzzlings, the reply is pasted below.


...original request trimmed for space...


Below are replies to your queries (courtesy of a rare bit of 
free time).  [...content trimmed...] I hope the following helps
clear up some normal beginner's confusion, and best wishes with 
your Python learning curve.

--Mark Lutz


-----------------------------------------------------------------
About function return values:

First off, it's important to note that every function returns 
something - which, in Python-speak, means an object.  If a 
function has no "return," it just means that the function 
returns the default "None" object when control falls off the 
end of the function's body and goes back to the caller.

Secondly, functions that compute an output value send it back
to the caller explicitly with a "return x" statement, which means
exit this function now, and send object "x" to the caller as 
the result of its function-call expression.  The caller may 
then print the object thus sent back, or use it for arbitrary 
purposes of its own.  For example:

def compute(x, y):     # accept two input objects
    return x * y       # create and return an output object

t = compute(2, 3)      # caller: t is assigned the 6 returned
s = t ** 2             # use the result object here: s is 36
print(t)               # another way to use the 6 result here

If a function does not have a result to send back, it is
sometimes informally called a procedure, because it's just a
package of instructions to execute as a set.  Such functions
might not need a "return" and are generally called from a 
statement that doesn't save a result; but they technically
do send back a "None" result by default anyhow:

def dothings():
   ...run statements for some purpose, no "return"...

dothings()             # functions with no "return" are just called

x = dothings()         # but they do have a return value too:
print(x)               # it's the default return value, None

Now, your example of a function like this:

def compute(x, y):
    print(x * y)

is really a procedure, because it doesn't send back an explicit
result.  Really, it's just displaying a result on standard output
(usually, the console window) instead of returning it, and that's
not as useful as it might be: it's generally better to return the
result and let the caller decide what to do with it:

def compute(x, y):
    return x * y      # sent here

t = compute(x, y)     # received here: print, use, or not

Importantly, there is no local or global variable associated with 
a function's return value (unlike some other languages, where the
function name serves to hold the result).  In Python, the return 
value is an _object_, not a name, and comes back to the caller 
automatically as the result of the function-call expression.  In 
the following, the variable "z" is irrelevant to the result; it's 
a reference to an object temporarily, but the object it references
is what is sent back to the place where the call occurred:

def compute(x, y):
    z = x * y
    return z          # return the object z references

t = compute(x, y)     # receive that object here

In principle, functions can also communicate output results by 
storing them in global variables, but this is extremely bad 
practice: because there's only one instance of a global name,
such functions would require callers to know where to look for
results, and could not easily support more general usage 
patterns.  For instance:

def compute(x, y):
    global result
    result = x * y

compute(x, y)
...display or otherwise use "result"...

This works in limited ways, but falls apart for larger sets of
general-purpose tools that may be called from multiple locations
in a program.  For instance, if a function A calls B, which calls
C, which calls B again, the latest assignment in B's code to a 
global "result" overwrites all other results - even for other 
calls that have not yet finished. 

Globals are also a poor choice for functions that use recursion, 
calling themselves to loop.  Recursion is an advanced technique,
but works because function return values are stacked per the order
of calls made:

def compute(x, y):
    if y == 0:
        return 0
    else:
        return x + compute(x, y-1) 

print(compute(2, 3))   # it's 6 too, but may make your head explode

Recursion is most useful for arbitrarily shaped problems (e.g., 
crawling a website's pages), and is crucial in some use cases.


-----------------------------------------------------------------
About constructors:

No, constructors are not required.  Whenever a class is called,
Python follows two steps: it first makes an empty instance of the
class, and then passes the new instance on to the inherited __init__,
along with any arguments passed in the class call. 
 
If no __init__ is found, the second step is simply skipped, 
and the instance will initially have no attributes.  The first
step, instance creation, is technically performed by a method
named __new__, but this is a deep-magic hook that most Python
programmers will never deal with.  Usually, an __init__ is coded 
to fill out the data associated with a newly created instance:

class C:
    def __init__(self, args...):
        self.name = 'someone'
        self.job  = 'software'

I = C()
...here, code can assume I has both .name and .job
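

A concrete, runnable sketch of the two cases (the class names here 
are hypothetical):

class Empty:                      # no __init__: step 2 is simply skipped
    pass

class Person:
    def __init__(self, name):     # __init__ fills out the new instance
        self.name = name

e = Empty()                       # e starts with no attributes of its own
p = Person('Pat')                 # p.name is set during construction
print(p.name)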


-----------------------------------------------------------------
About the "self" argument:

By nature, a class's methods are designed to process an instance 
of the class - fetching and changing the instance's attributes.  
Because classes serve to generate multiple instances, a class's
method functions need a way to know _which_ instance they are 
supposed to be processing.  For example:

class C:
    def method(args...):
        ...process an instance: but which one?...

I1 = C() 
I2 = C()             # make two instances of C

I1.method(args...)   # is this call for I1 or I2?

The "self" argument is the method's handle (reference) to 
the instance which is the subject of the method call, and 
the one to be processed by the method's code.  Python 
automatically passes the subject instance to "self" so 
the method knows what to act on:

class C:
    def method(self, args...):
        ...process self: I1 or I2...
        self.name = "..."
        print(self.job)

I1 = C()
I2 = C()

I1.method(args...)   # I1 is auto passed to "self" 
I2.method(args...)   # I2 is auto passed to "self"

Without self, methods would have no object to process, and 
a class would be just a package of simple functions - and 
not really 'object oriented' at all.
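

Again in concrete, runnable form (hypothetical names):

class Person:
    def __init__(self, name, job):
        self.name = name                     # per-instance state
        self.job = job

    def describe(self):
        print(self.name, 'does', self.job)   # "self" is the subject instance

p1 = Person('Pat', 'software')
p2 = Person('Sam', 'music')

p1.describe()                                # p1 is passed to "self"
p2.describe()                                # p2 is passed to "self"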

Other than this, you seem to have most of the classes/methods
story correct.  I'd encourage you to write some class-based
code to solidify the concepts; the book's exercises are a
good place to start, if you haven't already done so.


Defining Python: You Say Interpreted, I Say Scripting

[Feb-2019] A reader wrote to suggest it may be useful to define Python as an interpreted language, instead of a scripting language as done in the book's opening bits. There are reasons "scripting" may be more accurate today, but the "interpreted" contrast may also help readers coming from traditional compiled-language backgrounds.

> From: ...
> To: "lutz@learning-python.com" 
> Subject: Is Python a "Scripting Language?"
> Date: Thu, 28 Feb 2019 19:56:54 +0000
> 
> I think that you missed a key point when answering that question.  To 
> me, scripting has always been associated with something that is not 
> compiled.  A "script" is interpreted.  A "program" is compiled.
> 
> Maybe this is just a "historical" note.  I've been programming for 30 years.  

> Maybe this answer to your question isn't pertinent if you focus on the 
> current, modern programming environment.
> 
> Regards,


Good point.  To me, having worked in compilers in the distant past, compiled 
and interpreted are implementation techniques, and seem a bit too gray to use 
for language classification.
  
For example, the C language is a systems language, and not generally regarded 
as a scripting language; but it's been implemented with compilers, interpreters,
and other schemes somewhere between the two that compile to interpreted 
bytecode just like Python's mainstream version today.  Other languages like 
Basic and Pascal could also be had in both compiled and interpreted forms (see 
the latter's p-code).

In fact, Python itself is both compiled and interpreted today, as you'll see in 
Chapter 2 (if you haven't already).  The standard python.org installs compile 
source to bytecode which is interpreted by a VM; but other systems compile to 
JIT-compiler bytecode that's translated to machine code on the fly, or translate
Python code to C code that is then compiled to machine code (see the Cython hybrid).

In other words, the line between compiled and interpreted is perhaps more vague 
than it used to be, and more difficult to use as a classifier than a language's 
roles are.  Hence, "scripting" to me seems more about ease of use and directing 
components than about implementation technique.

The static-versus-dynamic typing distinction sometimes works as a dichotomy too,
but only because dynamic typing naturally lends itself to scripting roles; it
doesn't necessarily imply an implementation technique.

That said, interpreters do tend to foster flexibility, and I also hate it when 
authors seem to miss something that's just plain obvious to me, so please take 
all this with the usual your-mileage-may-vary disclaimer.
 
Thanks for your input,
--Mark Lutz, https://learning-python.com


Ebook and PDF Options: Status Update

[Jul-2019] The following reader email asks for clarifications on ebooks, PDFs, and new editions—frequently asked questions all.

> From: ...
> To: lutz@learning-python.com
> Subject: Inquiry about your ebooks
> Date: Sat, 13 Jul 2019 09:12:57 -0400
> 
> Hello Mark,
> Hope you're doing great!
> 
> [...] I am really curious to know more and interested in your
> books specifically 'Learning Python' and ' Programming Python'.
> 
> Actually I prefer to read *e-books* on my pc, so I wanna know a few points:
> 1- if I buy your book from Amazon, does it come with a pdf download link as
> well or not?
> 2- How can I find it would be the earliest revision (23rd revision)?
> 3- Do you have any plan to publish the 6th edition of your book? if yes,
> when?
> 
> I appreciate you so very much


Thanks for your note.  My replies:


> 1- if I buy your book from Amazon, does it come with a pdf download link as
> well or not?

I'm not positive, but suspect not.  Amazon sells a format suitable 
for use on their sanctioned devices only (the last I checked, it was 
a Kindle-specific format that requires an app for viewing elsewhere).

If you're looking for a PDF too, I suggest this retail site:

  https://www.ebooks.com/en-us/searchapp/searchresults.net?term=lutz+python

As of my latest information, that site gives you online access
and both PDF and ePub format downloads, but you should naturally 
verify this today.  Google Books has (or had?) a PDF and ePub too,
but its formatting looked a bit custom.


> 2- How can I find it would be the earliest revision (23rd revision)?

Unfortunately, the publisher no longer sells ebooks directly,
so I have no way to know which reprint retailers are distributing,
short of buying a copy myself.  I suspect that ebooks are updated
automatically after each reprint, but you'll have to contact the 
retailer directly to be sure.  The upside is that changes in recent
reprints have been relatively minor tweaks, and are listed on 
the book's errata page (sort it by date to see the latest):

  https://www.oreilly.com/catalog/errata.csp?isbn=0636920028154


> 3- Do you have any plan to publish the 6th edition of your book? if yes,
> when?

No, there are no plans for a 6th Edition (though this is naturally
subject to change in the future).  For a canned reply, see this:

  https://learning-python.com/about-future-eds.html

Best wishes,
--Mark Lutz, http://learning-python.com


Using Bytecode Files without Source Requires Renames in 3.X

[Mar-2019] A reader wrote asking for help on shipping programs as bytecode files without source. This still works as described in the book, but appears to have changed slightly in 3.X. The required 3.X bytecode-renaming step is outlined in the reply below (expanded for completeness on this page).

> From: ...
> To: Mark Lutz 
> Subject: SOS from [...]
> Date: Mon, 25 Mar 2019 18:57:07 +0900 (KST)
> 
> Dear Mark Lutz,
>  
> It is [...] whom you have helped several times for last a couple of years.
> Today, I am knocking on your email account as my final hope for a problem 
> that I uploaded to Stack Overflow and failed to get an answer for.
> 
> The problem is about the way to distribute a byte code without source code.
> Regarding the question, I found a paragraph from one of your books, Learning 
> Python 4th Ed.
> On page 534, you wrote that "In addition, if Python finds only a byte code 
> file on the search path and no source, it simply
> loads the byte code directly (this means you can ship a program as just byte 
> code files
> and avoid sending source). In other words, the compile step is bypassed if 
> possible to
> speed program startup."
>  
> The paragraph sounded the way that I have been searching for. But when I 
> tested it as:
> 1) compiled a A.py as py_compile.compile('A.py')
> 2) copied the generated A.pyc file to a different directory (path2)
> 3) changed the current directory to the directory as os.chdir(path2)
> 4) imported the module as import A
>  
> I have repeated several times of the procedure because for every time I only 
> got an error message that such a module does not exist.
> What did I do wrong? What did I miss? 
> Could you please help me out? Could you please describe the proper way to 
> distribute the bytecode without source code?
>  
> I wish you find a little time to answer my question.
> Thank you so much.
> Sincerely yours,


Running with just bytecode files still works, but seems
to require a rename in Python 3.X alone to strip the 
bytecode file's creator tag.  Here are the details.

--THE PYTHON 2.X STORY--

Like much else, using bytecode files is simpler in 2.X.  Importing a 
source file makes a same-named bytecode file in the same folder:

 ~/Desktop$ cd dir1
 ~/Desktop/dir1$ ls
 printer.py
 ~/Desktop/dir1$ cat printer.py 
 def func(spam):
     print(spam)

 ~/Desktop/dir1$ py2
 Python 2.7.10 (default, Oct 23 2015, 19:19:21) 
 >>> import printer
 >>> printer.func(99)
 99
 >>> printer
 <module 'printer' from 'printer.py'>

Later imports load the same-folder bytecode file as expected:

 ~/Desktop/dir1$ py2
 Python 2.7.10 (default, Oct 23 2015, 19:19:21) 
 >>> import printer
 >>> printer
 <module 'printer' from 'printer.pyc'>

And importing with just the bytecode file elsewhere works:

 ~/Desktop/dir1$ ls
 printer.py	printer.pyc
 ~/Desktop/dir1$ cp printer.pyc ../dir2
 ~/Desktop/dir1$ cd ../dir2
 ~/Desktop/dir2$ ls
 printer.pyc

 ~/Desktop/dir2$ py2
 Python 2.7.10 (default, Oct 23 2015, 19:19:21) 
 >>> import printer
 >>> printer.func(88)
 88
 >>> printer
 <module 'printer' from 'printer.pyc'>

--THE PYTHON 3.X STORY--

Since version 3.2, Python 3.X stores bytecode files in a __pycache__ 
subdirectory, and names them with a tag to identify their creator
(as described in the book).  To use these files without source, you 
must rename them without their creator-tag suffix.  For instance, 
after deleting the 2.X bytecode files in both test folders, the 
first import makes the bytecode in 3.X as expected:

 ~/Desktop/dir2$ cd ../dir1
 ~/Desktop/dir1$ ls
 printer.py
 ~/Desktop/dir1$ ls ../dir2

 ~/Desktop/dir1$ py3
 Python 3.5.2 (v3.5.2:4def2a2901a5, Jun 26 2016, 10:47:25) 
 >>> import printer
 >>> printer.func(66)
 66
 >>> printer
 <module 'printer' from '/Users/blue/Desktop/dir1/printer.py'>

Later imports grab the bytecode from the same-folder __pycache__,
though the module's display text doesn't make this obvious; the 
".cpython-35" is the creator tag in this example, and may vary:

 ~/Desktop/dir1$ ls
 __pycache__	printer.py
 ~/Desktop/dir1$ ls __pycache__
 printer.cpython-35.pyc

 ~/Desktop/dir1$ py3
 Python 3.5.2 (v3.5.2:4def2a2901a5, Jun 26 2016, 10:47:25) 
 >>> import printer
 >>> printer
 <module 'printer' from '/Users/blue/Desktop/dir1/printer.py'>

But unlike 2.X, using the bytecode file elsewhere verbatim sans
the original source file fails:

 ~/Desktop/dir1$ cp __pycache__/printer.cpython-35.pyc ../dir2
 ~/Desktop/dir1$ cd ../dir2
 ~/Desktop/dir2$ ls
 printer.cpython-35.pyc

 ~/Desktop/dir2$ py3
 Python 3.5.2 (v3.5.2:4def2a2901a5, Jun 26 2016, 10:47:25) 
 >>> import printer
 ImportError: No module named 'printer'

And neither moving it to another folder's __pycache__ nor copying 
the entire original __pycache__ appears to help (in fact, imports 
also fail if source is simply removed, leaving just __pycache__):

 ~/Desktop/dir2$ mkdir __pycache__
 ~/Desktop/dir2$ mv printer.cpython-35.pyc __pycache__/
 ~/Desktop/dir2$ ls __pycache__/
 printer.cpython-35.pyc

 ~/Desktop/dir2$ py3
 Python 3.5.2 (v3.5.2:4def2a2901a5, Jun 26 2016, 10:47:25) 
 >>> import printer
 ImportError: No module named 'printer'

--THE 3.X RENAME FIX--

Instead, rename the bytecode _without_ its creator-tag part to use 
elsewhere, as in the following; notice that __pycache__ is optional,
and no source file need be present to use bytecode files so renamed:

 ~/Desktop/dir2$ rm -rf *
 ~/Desktop/dir2$ cd ../dir1
 ~/Desktop/dir1$ ls
 __pycache__	printer.py
 ~/Desktop/dir1$ ls __pycache__/
 printer.cpython-35.pyc

 ~/Desktop/dir1$ cp __pycache__/printer.cpython-35.pyc ../dir2/printer.pyc
 ~/Desktop/dir1$ cd ../dir2
 ~/Desktop/dir2$ ls
 printer.pyc

 ~/Desktop/dir2$ py3
 Python 3.5.2 (v3.5.2:4def2a2901a5, Jun 26 2016, 10:47:25) 
 >>> import printer
 >>> printer.func(55)
 55
 >>> printer
 <module 'printer' from '/Users/blue/Desktop/dir2/printer.pyc'>

This renaming seems required in 3.X, and very close to 
qualifying as a 3.X bug (and I don't have time to explore 
other ideas at the moment; the module import story has 
changed much and often in recent 3.X Pythons).

Tools that bundle bytecode for distribution in 3.X must 
either move and rename as shown, generate with compile() 
to skip files altogether, or package in a form that makes 
folders irrelevant (e.g., by extracting modules from a 
zipfile when they are referenced).
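

If you'd rather automate the rename, the standard library can write 
an untagged bytecode file directly; a hedged sketch (the folder names 
follow the transcript above):

  import py_compile

  # Compile dir1's source straight to an untagged .pyc in dir2; the cfile
  # argument picks the output path, bypassing __pycache__ and the tag.
  py_compile.compile('printer.py', cfile='../dir2/printer.pyc', doraise=True)

  # Roughly equivalent for whole trees, from a shell (Python 3.5 and later):
  #   python -m compileall -b .     # -b writes legacy-style, untagged .pyc files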

--Mark Lutz, http://learning-python.com


More on State Retention in Closure Functions

[Jun-2019] A reader emailed O'Reilly's book support asking for clarification on a closure-function example in the text. The arguably mind-breaking (but potentially useful) reply is copied below.

> From: ...
> Sent: 6/13/2019 12:28 PM
> To: bookquestions@oreilly.com
> Subject: RE: Learning Python - 5th Edition
> 
> Hello,
> 
> I have begun to read (and enjoy) the Learning Python book authored by Mark 
> Lutz (fifth edition). However, being a novice to Python, I am struggling to 
> understand one of the examples used by Mr. Lutz and was hoping to get a 
> deeper explanation as to why and how that example works in the manner 
> described. The example in question begins in the call-out box “Why You Will 
> Care: Customizing open” on page 539. I was hoping someone could explain how 
> the customized function, makeopen, retains the state of past arguments when 
> it is invoked two or more times.
> 
> When the function is first invoked with makeopen(‘spam’) and then the 
> customized enclosing def with “F = open(‘script2.py’)” the interpreter 
> responds with:
> 
> Custom open call ‘spam’: (‘script2.py’,) {}
> 
> The subsequent series of invocations yield the following:
> 
> >>> makeopen(‘eggs’)
> >>> F = open(‘script2.py’)
> 
> Custom open call ‘eggs’: (‘script2.py’, ) {}
> Custom open call ‘spam’: (‘script2.py’, ) {}
> 
> It is here where I simply don’t understand how state from the prior 
> invocation is retained. Why isn’t the ‘id’ parameter overwritten?
> 
> Any help that you can provide would be greatly appreciated.
> 
> Thanks in advance!


This is one of the subtler examples in the book, and the
confusion is understandable.  It does work as shown/described
(see the attached code and results), but takes a bit of tracing 
to fully grasp, and a look at the decorator closures in later 
chapters doesn't hurt.
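
For reference while tracing the rest of this reply, here is a minimal
sketch in the spirit of the sidebar's makeopen (a from-memory
approximation, not a verbatim copy of the book's code):

  import builtins

  def makeopen(id):
      original = builtins.open                 # save the current open
      def custom(*pargs, **kargs):             # closure: remembers id, original
          print('Custom open call %r:' % id, pargs, kargs)
          return original(*pargs, **kargs)     # delegate to the saved open
      builtins.open = custom                   # reset the built-in scope's open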

In brief, an initial call to open() runs the built-in version,
which is really builtins.open, as usual (a raw "open" is looked 
up in the built-in scope, which is the "builtins" module): 
 
  F = open('script2.py')        # builtins.open is the original
  print(F.read())

After you first call makeopen(), though, builtins.open has been
reset to the result of the nested custom() def, which is a new 
custom function that retains the original builtins.open in its 
local variable "original".  Because of the reset, the reference 
to raw "open" now means the custom() function, which prints a 
trace message and calls the original open():

  makeopen('spam')              # builtins.open = a custom(),
  F = open('script2.py')        # which calls original builtins.open
  print(F.read())

  F = open('script2.py')        # ditto; open is still the custom(),
  print(F.read())               # because makeopen() not yet re-called

So far so good, but here's where the strange part comes up:
when makeopen() is called again, it also resets builtins.open
to a new custom().  But this new custom() retains and later 
calls the _current_ value of builtins.open in its "original"
-- which is the _first_ custom() created earlier, not the 
original open(), and not the new custom() just created.

That is, there are two custom() function objects in memory now: 
the new one created by this second makeopen() call, which 
retains the custom() object created by the first makeopen() 
call, which in turn retains the original open().  When the 
raw "open" is called now, it resolves to the new custom() in 
builtins.open, which calls the first custom(), which calls 
the original open(). 

Along the way, each custom() function object has its own "id" 
variable set to the string passed to the makeopen() call that 
created it, and each custom() passes along actual call arguments
to the next level.  Hence, both custom() layers print different
"id" tags ("eggs" in the first and then "spam" in the second), 
and the bottom layer gets the filename and opens it per the 
built-in behavior (the *pargs and **kargs bits just pass all
arguments along generically):

  makeopen('eggs')              # builtins.open = another custom(),
  F = open('script2.py')        # which calls current builtins.open,
  print(F.read())               # which is the first custom(), 
                                # which in turn calls the original;
  F = open('script2.py')        # and ditto if called again
  print(F.read())

All of which relies on the real magic of closures: each new 
function object has its own set of local variables that "remember"
state from the time of function creation, for use on later function
calls.  In this case, each custom() has its own "id" tag to print,
and "original" link to the prior value of builtins.open to call.
Per the sidebar, it's all quite implicit in functions, but more 
explicit (and code-y) with classes.

Complex but true, and sometimes useful for closure functions
that need to augment another layer of customization.

Hope this helps,
--Mark Lutz, http://learning-python.com


About the References to "Binary" Integers

[Oct-2019] The book uses the term "binary" for the in-memory storage of integers in a handful of places, and this has proved to be a perennial source of confusion. To skirt the issue altogether, we've patched these uses in book reprints as they were reported by errata post or email, but there seems to be a deeper misunderstanding of the difference between in-memory storage and human-readable display, which merits copying a reply here.

> From: ...
> To: "lutz@learning-python.com" 
> Subject: Possible correction in the book Learning Python 5th Edition
> Date: Sat, 19 Oct 2019 16:23:25 +0530
> 
> Hello Sir,
> I am an avid reader of your book Learning Python 5th Edition from O’Rielly 
> and I found a small but important error in the book. Bellow is a snippet from 
> your book Learning Python 5th Edition:
>
> [PNG snipping attached]
> 
> Here, in the third line it is mentioned that the ord() function returns the 
> binary value used to represent the character passed into the function but 
> actually, it is returning the decimal value.
> 
> In addition to this I want to admit that your book is a really good one and 
> has so far helped me gain many insights that I have never had and it has also 
> helped me to learn python in a better way.
> 
> Thank You!


Thanks for your note.  About the reference to "binary value" 
on page 213: you have a good point.  In fact, another reader
reported similar uses of "binary" on pages 108, 246, and 1208,
which we patched in a Jan-2017 reprint to avoid confusion (see
https://www.oreilly.com/catalog/errata.csp?isbn=0636920028154).

The issue seems to be the subtle difference between internal 
storage and display formatting.  The book used the term "binary" 
at all these locations to refer to the numeric integer value 
stored in digital memory, which is always in binary form.  The 
ord() function indeed returns this binary value, but Python's 
interactive interpreter displays it in decimal form.  Hence, 
"binary" is meant to distinguish text from its numeric codes.

For instance, the numeric value 35 used to represent '#' in 
ASCII is an integer stored as binary bits 0b100011 in memory, 
and would be returned as such by ord() [the Python integer 
object for 35 is actually a bit more complex, but it's all 
binary data, and the bits are a close-enough approximation].
The fact that this value is displayed as decimal '35' at the
interactive prompt is really just an artifact of Python's 
human-friendly display logic:

  >>> ord('#')         # the decimal display of '#'s integer code 
  35
  >>> bin(35)          # what's actually stored in memory for '#':
  '0b100011'           # a numeric value encoded in binary form

  >>> chr(35)          # numbers can be coded as decimal or binary
  '#'
  >>> chr(0b100011)    # '35' is converted to these binary bits,
  '#'                  # which represent the textual glyph '#'

That said, "binary" has been a source of confusion, so we'll 
also adjust the occurrence you reported in the next printing, 
scheduled to occur next month.  Probably, the "binary" will 
simply be replaced with "numeric," because that differentiates
the code from its textual glyph just as well; it doesn't 
encroach on the numeric base used for display; and there's 
no space to go into the subtle shade of meaning intended.

Unless you prefer not, I will also post this note and reply
anonymously to the book's errata page, to help other readers
of prior printings.

Thanks for the report and best wishes,
--Mark Lutz, http://learning-python.com


Updated Table 5-2: Operator Precedence

[Nov-2019] A reader posted on the book's errata list in July 2019 to express confusion over the operator precedence implied by Table 5-2 on page 141 (137 in earlier reprints). In particular, the combination of the % and / operators seemed askew. The book was not in error, but the confusion understandably stems from the fact that precedence is represented by row membership in that table, but the table's rows were not very well distinguished as originally formatted. The slight extra whitespace was perhaps too subtle, and some rows should have been more inclusive.
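
To restate the point that caused the confusion: % and / belong on the same precedence row, so unparenthesized mixes of the two simply group left to right. A quick interactive check (this is standard Python behavior, added here as a supplement):

  >>> 17 % 5 / 2        # same row: groups as (17 % 5) / 2
  1.0
  >>> (17 % 5) / 2
  1.0
  >>> 17 % (5 / 2)      # parentheses needed for the other grouping
  2.0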

In the table's defense, it was meant to be somewhat informal (a BNF spec would be overkill for this book); there have been no other complaints about its content for 6.5 years (and hundreds of thousands of reads); and unparenthesized operator combinations finessed by its first coding are unlikely to ever appear in the wild (you can code a < b == c in d is ~-+e, but probably shouldn't). Still, this table's original format was arguably subpar for readers seeking a thorough operator reference.

To address this, the November 2019 reprint improves Table 5-2's formatting by:

  1. Adding borders around this table's cells, to better set off its rows
  2. Coalescing a few formerly separate and adjacent rows, to boost accuracy

The combined effect more formally captures Python's operator precedence, and more closely parallels Python's docs.

All of which is easier to convey in pictures than words. For more context on this change, see the following screenshots of the table's before and after appearance, as well as the PDF rendition of the table's new formatting in the book:

And for more details on this change, and to view or save the updated table in simple text form, see this file:

As noted in that file, the reformatted table still lacks some formality, as well as recently added operators like @, :=, and await, not covered in this edition of the book. For the latest story, be sure to always check Python's standard docs, and its official (if technically heavy) grammar.
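
As a small taste of one of those newer operators (a supplement here, not book content), the 3.8+ ":=" assignment expression binds a name inside a larger expression; it sits at the very bottom of the precedence table, so it usually needs its own parentheses:

  data = [1, 2, 3, 4]
  if (n := len(data)) > 3:           # bind n and test it in one expression
      print('%d items' % n)          # prints: 4 items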


Python 3.7 Broke That Weird Iterator Example

[Nov-2019] This page earlier included a note of clarification for a subtle generator example shown in the book's sidebar Why You Will Care: One-Shot Iterations on pages 644-645 (621-622 in earlier printings). Regrettably, the book's correct 3.X version of the code described in that note no longer works as of Python 3.7, due to a major change in Python's handling of exceptions within generator functions.

In short, in 3.7 and later you must explicitly and manually catch StopIteration and run a return, instead of either letting the exception percolate and end the iteration, or raising StopIteration explicitly. Due to a quirk in the change, some programs may also need to catch RuntimeError.
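
In generic terms (this is not the book's sidebar code, just a sketch of the pattern), the required change looks like this:

  # Pre-3.7 coding: a StopIteration raised by next() inside the
  # generator used to end the iteration silently
  def pairs_old(iterable):
      it = iter(iterable)
      while True:
          yield next(it), next(it)       # in 3.7+ this raises RuntimeError

  # 3.7+ coding: catch StopIteration and return to end the generator
  def pairs_new(iterable):
      it = iter(iterable)
      while True:
          try:
              yield next(it), next(it)
          except StopIteration:
              return

  >>> list(pairs_new('abcd'))
  [('a', 'b'), ('c', 'd')]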

You can read all about the change and its coding fix in this post on the Python Changes page. It's just one of a constantly growing set of ego-based Python mutations that slowly but surely break code in programs near you (e.g., see also the Related note at the end of the next item on this page).


Recodings from the Errata Page: reloadall3.py

[May-2020] A reader posted on the book's errata page to point out an issue in the reloadall3.py example in Chapter 25 (on page 792 in recent reprints). I responded to this on the book's errata page, but that page's format is unfortunately constrained and difficult to read for structured code. Because this page is arguably more legible, and because the issue makes for an interesting case study, I'm replicating the content here; you can also fetch all the code in the reply as a runnable script here.

Errata-page post:
>
> reloadall3.py seems to have a coding error.  I believe the idea is to reload 
> modules without duplicating any reloads.  As written it duplicate reloads.
> For example, when I run "python reloadall3.py os" the output I get is: 
> "reloading os, reloading ntpath, reloading genericpath, reloading stat,
> reloading stat, reloading sys, reloading stat, reloading sys, reloading abc" 
> Here is my suggested fix: change line from:
>   modules.extend(x for x in next.__dict__.values() if type(x) == types.ModuleType and x not in visited)
> to:
>   modules.extend(x for x in next.__dict__.values() if type(x) == types.ModuleType and x not in visited and x not in modules)


You're right (and great catch! - I wish this had been reported earlier).  
As coded in the book originally, reloadall3 can indeed reload a module 
more than once, because it checks only modules already visited, not those 
currently scheduled to be visited on the stack.  Because a module is never
again checked for prior visits once it's scheduled for reloading, modules
that are rescheduled before they are reloaded will be reloaded redundantly.
Though uncaught in testing and somewhat dependent on ordering, this can 
happen anytime a module is imported by multiple others:

  def transitive_reload0(modules, visited):           # 0: Original version
      while modules:
          next = modules.pop()                        # Delete next item at end
          status(next)                                # Reload this, push attrs
          tryreload(next)
          visited.add(next)
          modules.extend(x for x in next.__dict__.values()
              if type(x) == types.ModuleType and x not in visited)

  def reload_all0(*modules):
      transitive_reload0(list(modules), set())

Your proposed solution works, because it checks the already-scheduled 
stack too, before extending it.  This prevents the reschedules, and 
hence the duplicate reloads:

  def transitive_reload1(modules, visited):           # 1: Reader's working fix
      while modules:
          next = modules.pop()                        # Delete next item at end
          status(next)                                # Reload this, push attrs
          tryreload(next)
          visited.add(next)
          modules.extend(x for x in next.__dict__.values()
              if type(x) == types.ModuleType 
              and x not in visited and x not in modules)

  def reload_all1(*modules):
      transitive_reload1(list(modules), set())

If pressed, though, I'd say that it seems a bit gray to check for 
membership in a list while it is in the process of being extended.  
The generator passed to extend() yields one item to tack onto the 
list at a time - while also checking the contents of the list.  That
works, but seems very implicit (if not implementation dependent). 
I'd rather avoid the ambiguity and drama, and move the visited test 
up as follows, to trap repeats before their reload is attempted. 
This also _might_ be marginally faster because it trades list scans 
for set hashing, but benchmarking results are extra credit here:

  def transitive_reload2(modules, visited):           # 2: Avoid 'in' during 'extend'
      while modules:
          next = modules.pop()                        # Delete next item at end
          if next in visited: continue                # Already reloaded anywhere?
          status(next)                                # Reload this, push attrs
          tryreload(next)
          visited.add(next)
          modules.extend(x for x in next.__dict__.values() 
              if type(x) == types.ModuleType)

  def reload_all2(*modules): 
      transitive_reload2(list(modules), set())

Better still: the following coding is much closer in structure to the 
recursive reloadall2 version that precedes it in the book, and hence 
better illustrates the real recursive-versus-stack point of this section.
It also traps non-module arguments at the top level; neither the original 
nor the two other alternatives above do.  Hence, this is how this example
will be patched in future reprints of the book (see the errata page):

  def transitive_reload3(modules, visited):           # 3: Symmetry, catch non-mods
      while modules:
          next = modules.pop()                        # Delete next item at end
          if (type(next) == types.ModuleType          # Valid module object?
              and next not in visited):               # Not already reloaded?
              status(next)                            # Reload this, push attrs
              tryreload(next)
              visited.add(next)
              modules.extend(next.__dict__.values())

  def reload_all3(*modules):
      transitive_reload3(list(modules), set())

When tested, all three recodings fully avoid the duplicate reloads 
and produce the same results, though visitation order varies slightly 
as expected (you can run all this code live by grabbing a copy from 
https://learning-python.com/reloadall3.py):

  # prior API compatibility
  reload_all = reload_all3

  if __name__ == '__main__':
      # self-test
      import os, tkinter, reloadall3

      for test in (os, tkinter, reloadall3):
          print('\n%s\n[%s]' % ('-' * 40, test.__name__))

          for ra in (reload_all0, reload_all1, reload_all2, reload_all3):
              print('\n<%s>' % ra.__name__)
              ra(test)

Related note: when this example is run today, Python 3.8 and 3.7 
generate a deprecation warning that:
 
  "the imp module is deprecated in favour of importlib"  

Alas, Python's module API has been a moving target for some time now 
(per the book, reload() used to be a built-in function in 2.X), and 
3.X has a now-long history of arbitrary and opinion-based changes like 
this that rudely break existing code.  

In this case, reload() has been pointlessly relocated  _twice_ in 3.X, 
and this example will probably also require changing "imp" to "importlib" 
in the near future to avoid failing altogether with an exception.  This 
is too much change to patch in reprints, so this note will have to suffice.
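
For readers who want to patch their own copies today, one tolerant coding (a sketch offered here, not an official book fix) tries the newest location first and falls back on the older ones:

  # reload() has moved: importlib.reload in 3.4+, imp.reload in 3.0-3.3
  # (later deprecated), and a built-in function in 2.X
  try:
      from importlib import reload
  except ImportError:
      try:
          from imp import reload
      except ImportError:
          pass                       # 2.X: reload is already a built-in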

On the other hand, recursion coding concepts illustrated by this example
are still as relevant to the art of programming as ever.  Small details 
like module names do morph, but the larger fundamentals presented by 
this book don't.


The Thing about Slices with Negative Strides

[May-2020] A reader posted to this book's errata page to note that the brief mention of negative-stride defaults for slices on page 210 isn't quite right. This refers to just one isolated clause (and the rules may have morphed along the way during Python 2.X's reign), but the real story behind the defaults is both complex and interesting, and worth copying here.

Errata-page post:
> 
> As from the book it is:
> You can also use a negative stride to collect items in the opposite order. 
> For example, the slicing expression "hello"[::−1] returns the new string 
> "olleh"—the first two bounds default to 0 and the length of the sequence.
>
> But it should be
> In the case of negative stride the first two bounds will be( length of the
> string minus one) and (-1). Though length of the string(instead of length 
> of string minus one) will produce the same result but the 0 as default value
> of second bound will prevent the slicing of first character of string and it
> will not appear in reversed string.


REPRINTS: please change the referenced clause from:
"""
the first two bounds default to 0 and the length of the sequence, as before, and
"""
to this:
"""
the first two bounds effectively default to sequence length-1 and -1 
(they really default to None and None, but that's unimportant here), and
"""
This is in paragraph 5 of page 210 (203 in earlier printings).

DISCUSSION: this crops up in just one clause of one sentence of a very large
text (and the rules may have changed over this title's lifetime), but the book's
description here is too loose, even with the clarifying test and related examples
that follow.  In truth, though, neither the book nor this post's suggestion is 
quite right about defaults for slices with negative strides, and the full story 
is far too much to cover at this point in the text.

But in short: omitted bounds in this context are not treated the same as default
values in others.  When strides are negative and bounds are omitted, they invoke
special semantics that are available only for omitted bounds, and are asymmetric
with other slice usage.  No explicit or default values work the same--except Nones.

This is an odd, special-case corner of Python which is of very little practical
consequence, and doesn't merit full coverage either here or in the text.  The 
briefest synopsis requires a look under the hood at the slice objects employed 
by slice expressions.  As described in the book, slice expressions are the same
as indexing with a slice object:

  >>> S = 'hello'
  >>> len(S)
  5
  >>> S[0: len(S): 1], S[slice(0, len(S), 1)]
  ('hello', 'hello')

Internally, a slice with just a -1 stride maps to the following; its (A, B, C) 
output is indices used to extract the slice as usual--from A, up to but not 
including B, by C:

  >>> slice(None, None, -1).indices(len(S))      # this is [::-1]
  (4, -1, -1)

But neither the post's proposal nor the book's handwaving work the same:

  >>> slice(len(S)-1, -1, -1).indices(len(S))    # not same as [::-1]
  (4, 4, -1)
  >>> slice(0, len(S), -1).indices(len(S))       # not same as [::-1]
  (0, 4, -1)

And no other integer defaults are treated the same as omissions:

  >>> slice(len(S), -1, -1).indices(len(S))      # stop made positive=end
  (4, 4, -1)
  >>> slice(len(S)+1, 0, -1).indices(len(S))     # start scaled to len()-1 max
  (4, 0, -1)

Among the asymmetries here: slice objects map an omitted stop of None to -1, 
but a literal -1 stop used in a slice expression is scaled to len()-1, the
rightmost item.  Hence, there is no way to code a numeric value that works
the same as an omitted value, and the notion of integer defaults is invalid
in this context:

  >>> S[::-1]              # right
  'olleh'
  >>> S[0:len(S):-1]       # not right (book)
  ''
  >>> S[len(S)-1:-1:-1]    # also not right (post)
  ''
  >>> S[len(S)+1:0:-1]     # also also not right (hmm)
  'olle'
  >>> S[None:None:-1]      # the only real equivalent
  'olleh'

All of which confusingly differs from the case for a positive stride, where
defaults 0 and sequence length are the same as both Nones and omission:

  >>> slice(0, len(S), 1).indices(len(S))
  (0, 5, 1)
  >>> slice(None, None, 1).indices(len(S))
  (0, 5, 1)

  >>> S[::1]
  'hello'
  >>> S[0:len(S):1]
  'hello'
  >>> S[None:None:1]
  'hello'

The upshot is that there is _no_ integer-default equivalent to omitted
limits when negative strides are used (only!).  The patch to the book's 
sentence hints as much, without devoting lots of space to an issue that's 
ultimately too obscure for prime time.  In the book's defense, it does already
show the following in the same section (and cover the slice() equivalence 
more in the classes part), which will match the patched clause well:

  >>> S[slice(None, None, -1)]   # how [::-1] works (despite the former clause)
  'olleh'

If you wish to dig further, try both Python's manual for a (painfully cryptic) 
definition of the slice operation, and the PySlice_GetIndices() function in 
the C implementation of slice objects for the true mapping of bounds:

  https://docs.python.org/3/library/stdtypes.html#common-sequence-operations

  https://github.com/python/cpython/blob/master/Objects/sliceobject.c


On Pathnames, Partitions, and Python

[May-2020] A reader wrote to ask for clarification on addressing files (or partitions) in Python. Since this seems a common, general-interest topic, my reply is pasted below.

> From: ...
> To: lutz@learning-python.com
> Subject: Partition in Windows or Linux
> Date: Sat, 23 May 2020 10:38:27 +0200
> 
> Hello, 
>  
> their Python books are very interesting. 
>  
> How to define a partition in Windows or Linux in Python 3? So e.g. "F:/" 
> or "sda3"? I know a definition of a file, is there a special term for 
> partitions? 
>  
> Thank you for your efforts 
 

Thanks for your note.  I'm not certain what you mean by partition.
If you're asking about a volume (a.k.a. disc, or drive), then these
are accessed with a pathname whose syntax varies per platform.  On 
Windows, for example, you might access a file on your main C: drive
with this in Python (Windows assigns letters like "C" to drives 
when they are attached; this is a Windows-only convention):

  open(r'C:\Users\yourid\Documents\file.txt')

Note the "r"; backslashes are special syntax in Python strings, so
the "r" serves to make them literal (else, they must be doubled as 
"\\").  Pathnames on Windows can also normally be coded with "/" 
instead of "\" to avoid this issue (and the "r") altogether.

On a Unix-based platform like Linux or Mac OS, the pathname syntax 
differs, and the location of drives is platform specific; here
are two file pathname examples from Mac OS and Linux for 
addressing content on the main drive (you can also generally
use a pathname based on volume name to get to your home folder):

  open('/Users/yourid/Documents/file.txt')
  open('/home/yourid/Documents/file.txt')

And here are examples of external-drive pathnames on Windows, Mac
OS, and Linux, respectively (though drive mount locations (and 
hence their pathnames) can vary on Linux, and network paths are 
more involved on Windows):

  open(r'D:\folder\file.txt')
  open('/Volumes/DRIVENAME/folder/file.txt')
  open('/media/yourid/DRIVENAME/folder/file.txt')

You didn't ask about Android specifically, but it's based on 
Linux with similar pathnames - though internal-storage paths can
vary, and external drives are assigned numeric IDs which are used 
to access their files.  For instance, here are two equivalent ways
you might address a file in internal storage on some recent 
Android systems (this has varied over both time and device):

  open('/sdcard/folder/file.txt')
  open('/storage/emulated/0/folder/file.txt')

And the pathname of a USB flash drive or SD card may look like 
this on Android (25C9-1405 is the assigned drive ID):

  open('/storage/25C9-1405/Android/data/com.termux/file.txt')

There is a bit more coverage of pathnames on various platforms 
at these two links on my website:

  learning-python.com/mergeall-products/unzipped/UserGuide.html#finding

  learning-python.com/mergeall-android-scripts/_README.html#toc7

For an overview of Windows' proprietary pathnames, try this:

  docs.microsoft.com/en-us/windows/win32/fileio/naming-a-file

Python itself comes with two modules, the traditional os.path and
the newer but redundant pathlib, that allow scripts to process 
pathnames in a somewhat portable fashion (e.g., by abstracting the
use of "/" and "\" across platforms).  These help, but they can't 
negate all pathname content differences between platforms.
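
For instance (a generic illustration, not an excerpt from the book), both modules can join and split pathname components without hardcoding a separator character:

  import os.path
  import pathlib

  p1 = os.path.join('folder', 'sub', 'file.txt')      # uses the platform's os.sep
  p2 = pathlib.Path('folder') / 'sub' / 'file.txt'    # object-based equivalent

  print(os.path.basename(p1))     # 'file.txt' on any platform
  print(p2.name)                  # ditto, via a Path object attribute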

All that said, if you're instead asking about lower-level physical
partitions within a drive (e.g., of the sort you may split off for 
Linux in a dual-boot Windows/Linux scenario), then it looks like 
the "psutil" third-party module returns disk partition information,
but I'm not familiar with it or other similar tools; try a web 
search on "python disk partition" for more details.

Best wishes,
--Mark Lutz, https://learning-python.com


Examples Rezipped for Mac OS Catalina Bug

No reader has asked about this yet (which may not bode well for Mac OS Catalina), but I'm posting this preemptively to help anyone already burned by the issue.

[May-2020] This book's examples package was rezipped without content changes in May 2020, to work around a bug in Finder-click unzips on Mac OS Catalina. In short, Catalina clicks now fail with an "Error 22" on zipfiles that unzip correctly in both other tools and older Mac OS versions. The new examples-package zipfile is available on this website, here, and will hopefully show up at the publisher's book page soon.

For more details on the bug and work-around, see the top-level README.txt file in the examples package after unzipping. In brief, Mac OS Catalina's Finder clicks have a flaw that impacts many zipfiles, as chronicled in this thread. The prior examples zipfile was created with unknown tools on Windows years ago, but it still unzips correctly in other tools today; this is clearly a Catalina-only defect.

To fix, the prior zipfile was unzipped with the portable and Python-coded ziptools program available on this website, here. This program is recommended if you encounter unzip errors in the future. The extract used a simple command line like this, and created the examples folder in the directory where it was run ($Z is where you've unzipped ziptools itself):

$ py3 $Z/zip-extract.py lp5e-code-1.0-jun1813.zip . -permissions

After a rename, the package was rezipped with ziptools too, as follows:

$ py3 $Z/zip-create.py lp5e-code-1.0-may2020.zip lp5e-code-1.0-may2020 -skipcruft

The result unzips correctly in Finder clicks on Mac OS Catalina—and as it did on earlier Macs. Apple will probably fix the flaw eventually, but ziptools makes for a decent system-level case study if you're looking for larger Python examples.

More broadly, this episode underscores the impermanence of tools, code, and even content, in a field that churns and morphs constantly. In computing, if you wait long enough, stuff will break.


Python 3.8 Drops time.clock—and Breaks Code

[May-2020] A reader posted to this book's errata page to point out that the time.clock function, used in multiple book examples, has been removed as of Python 3.8. This was expected, and even foreshadowed, in the book, but it's yet another arguably unnecessary 3.X change that breaks code. Be sure to also see the notes about imp.reload and generator changes above here and here, if you haven't already; both those and the change described here require minor modifications to the book's program examples.

Errata-page post:
> [On page 652, about 1/3 from the top] 
> The code uses time.clock() and refers to Python 3.3.  However that function was
> deprecated in 3.3 (https://docs.python.org/3.3/library/time.html#time.clock) and 
> removed in 3.8 (https://github.com/python/cpython/pull/13270).


I appreciate your input, but this book was written when Python 3.3 was current, 
and lots of people were using both older versions of 3.X as well as 2.X that didn't
have the new tools.  Some still do today, despite what you may have heard.

The deprecation status of these time-module functions is clearly described in the 
sidebar on pages 655-656, which is referenced earlier at the top of page 654 (did 
you miss it?).  That sidebar also gives coding work-arounds to address future 
deprecation, which, as you note, is now officially complete.  Computer books cannot
predict the future, but in this case, it was fairly well foreshadowed.
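
For anyone running the book's timing examples on 3.8 and later today, a version-tolerant timer selection along the lines of the sidebar's advice (a sketch here, not the book's exact code) suffices:

  import sys, time

  # Prefer perf_counter where present (3.3 and later); else fall back
  # on the older platform-specific timers of earlier Pythons
  if hasattr(time, 'perf_counter'):
      timer = time.perf_counter
  elif sys.platform[:3] == 'win':
      timer = time.clock
  else:
      timer = time.time

  start = timer()
  sum(range(10 ** 6))
  print('elapsed:', timer() - start)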

But there's a larger point here.  If, as some might wish, this book were updated to
include only today's latest and greatest, it would both exclude those using older 
versions, and be lamented as dated just a few years hence.  This requires compromises
in both camps.  Unfortunately, those bent on changing Python incompatibly have never
seemed to be interested in compromise; more on this here:

  learning-python.com/python-changes-2014-plus.html#s381

That said, time-module differences are ultimately trivial compared to the task of 
mastering programming at large.  Most beginners are best served by learning the 
fundamentals covered by this book first, and exploring the transient cutting edge 
later, if and when needed.  The details of running aren't all that important until 
you've learned to walk.


Classes and __bases__: The Other Inheritance Tree

[Feb-2021] A reader wrote to ask how classes acquire the __bases__ superclass-links attribute from the type object. This is covered in a later part of the book which the reader hadn't yet reached (when in doubt, read on); but it's an interesting opening for exploring the parallel metaclass inheritance tree that's searched just for classes.

> From: ...
> To: lutz@learning-python.com
> Subject: Question about __bases__ attr in class
> Date: Thu, 4 Feb 2021 17:23:31 +0200
> 
> Good day!
> 
> First of all, thank you for your book! It's really great.
> 
> ISBN: 978-1-449-35573-9
> 
> Page 837. "...Classes also have __bases__ attribute. ..."
> 
> At this moment I checked:
> 
> >>class rec: pass
> >>dir(rec)
> 
> Out:
> ['__class__', '__delattr__', '__dict__', '__dir__', '__doc__',
> '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__',
> '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__',
> '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__',
> '__repr__', '__setattr__', '__sizeof__', '__str__',
> '__subclasshook__', '__weakref__', 'age', 'name']
> 
> I found __bases__ in class generator "type"
> >>dir(type)
> 
> Out:
> [...'__bases__'...]
> 
> Now I'm a little bit confused.
> Seems like:
> class rec: pass
> rec.name = 'Sue'
> x = rec()
> 
> equal to
> 
> x = type('rec', (), dict(name='Sue'))
> 
> So every simple class is produced by generator type and inherits its 
> attributes?
> 
> Thank you!


Radically confusing, I agree; but the crux of the matter here is 
that inheritance at instances differs slightly from inheritance 
at classes.  The latter also searches the separate metaclass tree
that's available in a class's __class__ and linearized in the 
class's class's MRO, as sketched on page 1430 of the book.  The 
type object lives in the separate metaclass tree at classes, which 
is why __bases__ is acquired by classes but not instances.

In code, in Python 3.X:

  >>> class C: name = 'spam'
  ... 
  >>> I = C()
  >>> 
  >>> dir(I)
  [...no __bases__ here, but a __class__ links to C... ]
  >>> 
  >>> dir(C)
  [...no __bases__ here, but a __class__ links to type... ]
  >>>
  >>> I.__class__
  <class '__main__.C'>
  >>>
  >>> C.__class__
  <class 'type'>

Instance inheritance uses the following precomputed lists and
links, but __bases__ isn't among them:

  >>> I.__dict__
  {}
  >>> I.__class__.__mro__
  (<class '__main__.C'>, <class 'object'>)
  >>> 
  >>> 'name' in C.__dict__
  True
  >>> '__bases__' in C.__dict__
  False

Class inheritance instead gets __bases__ from type by virtue 
of the extra metaclass-tree search:

  >>> C.__class__.__mro__
  (<class 'type'>, <class 'object'>)
  >>> 
  >>> C.__class__.__mro__[0].__dict__['__bases__']
  <attribute '__bases__' of 'type' objects>
  >>>
  >>> C.__bases__
  (<class 'object'>,)

This isn't applied when looking from the instance - the class's
own __class__ link to the metaclass tree (and its MRO) is used
only when fetching from the class directly:

  >>> I.__bases__
  AttributeError: 'C' object has no attribute '__bases__'

This all works the same in dynamically created classes - the 
normal class tree used for instance inheritance doesn't have 
__bases__, but the separate metaclass tree does:

  >>> x = type('rec', (), dict(name='spam'))
  >>>
  >>> x.__class__
  <class 'type'>
  >>>
  >>> x.__mro__
  (<class '__main__.rec'>, <class 'object'>)
  >>>
  >>> x.__class__.__mro__
  (<class 'type'>, <class 'object'>)
  >>>
  >>> x.__bases__
  (<class 'object'>,)
  >>>
  >>> i = x()
  >>> i
  <__main__.rec object at 0x7f9336125610>

Hope this helps; it's arguably arcane (and prone to make heads 
explode), but it's the way new-style classes work in Python.

Best wishes,
--Mark Lutz, http://learning-python.com


Is LP5E a Good First Python Book for C Programmers?

[Nov-2019] A professor wrote to ask if this text would be a good first book for a C programmer getting started with Python. The reply pasted below hints at the difficulties of matching books to readers universally, and touches on the gap between deep and shallow learning.

> From: ...
> To: Mark Lutz 
> Subject: Re: Learning Python 6th Edition
> Date: Mon, 28 Oct 2019 22:46:53 +0000
> 
> Hi Mark,
> 
> Thank you for your reply.
> 
> Should this book be the first book for C programmers to learn about Python?


I suppose it depends on the C programmer.  Here are a few thoughts
to consider, both con and pro.

In the minus column: heavily experienced programmers who have used
lots of languages in the past may find the book's pace too slow.
Personally, I could probably learn Python from a simple reference
(like my Python Pocket Reference), along with a coding project to 
kick the tires.  In truth, I did learn Python when there were no 
books to be found, but I had two degrees in computer science; had 
used dozens of languages extensively; and studied programming 
languages broadly.  Lots of people have found the book useful, 
but I am not in Learning Python's target audience, and some 
experienced C programmers may not be either.

In the plus column: the book has a great deal of context for both 
less-experienced programmers, and those without a background in 
more dynamic languages like Python.  This includes C programmers
especially; Python is closer to tools like Lisp, and this is a 
very different approach to programming than C's close-to-the-metal 
paradigm.  In fact, C programmers may have to let go of a few habits 
to use Python well, and these are called out in the book.  Moreover,
the book draws numerous comparisons to C and C++, in part because 
many of the students in my classes were C/C++ programmers, but 
also because that was my background before coming to Python.  
Readers who've never used C/C++ may find such callouts and 
analogies pointless, but those who have may find them an asset 
which makes the book good for C users making the transition.

Also in the plus column: though more abstract, it's worth noting 
that there are different degrees of learning.  I also didn't need 
a book to learn enough Visual Basic to use it for coding simple 
Excel functions, but I really don't feel like I know it well 
enough to use more broadly.  Ditto for my use of JavaScript in 
web pages; I learned it largely by comparison to Python, but using
its own idioms properly would require more than I've done so far.
By contrast, a 1,600-page book can provide a much deeper learning 
experience than reference manuals and online resources ever can, 
and experienced C programmers looking for the full story may find
the book more useful than they initially expect.  In classes, it
was not uncommon to find C programmers who claimed to already know
Python, but who also regularly made fundamental Python mistakes 
that a more solid introduction might have prevented.

All of which you should judge for yourself, of course; I'm hardly
an impartial observer, and readers' needs vary widely.

Cheers,
--Mark Lutz, http://learning-python.com


Reading Recommendations for Python Web Development

[Jul-2020] A reader wrote to ask about reading guidance for learning Python web development. The reply below just scratches the surface of this domain, but provides a few pointers.

> From: ...
> To: lutz@learning-python.com
> Subject: Help me figure out how to become a PYTHON Web developer.
> Date: Sat, 18 Jul 2020 17:24:18 +0400
> 
> Hi Mr. Lutz, my name is ....., and I am a fan of the Python . I have
> completely read your book Learning Python. My goal is to become a Python
> web developer. I wanted to ask you for advice, what should I read now?
> I have your book "Python Programming". Is it worth reading this entire book
> or just a part of Web programming?
> 
> Thank you for your attention.


Thanks for your note.  Web development is a big domain, so I 
suppose it depends on which half of it you wish to emphasize.

1) Server-side domain

If you're interested in server-side work, Python fits it well,
and I recommend studying the entire stack: low-level systems 
programming (including multitasking and sockets); first-tier 
server-side techniques like CGI (which is a prerequisite for 
much else); as well as larger frameworks such as Flask and 
Django (which automate some of the lower-level work, but are 
much easier to use and debug after mastering the other levels).
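
As a tiny illustration of that first tier (a generic sketch only, 
not an example from either book), a CGI script is just a program 
that reads request inputs and prints an HTTP response:

  #!/usr/bin/env python3
  # Minimal CGI sketch: parse any query/form fields, then print a
  # response header line, a blank line, and the reply's HTML content
  # (real code should also escape user input, e.g., with html.escape)
  import cgi

  form = cgi.FieldStorage()
  user = form.getfirst('user', 'world')

  print('Content-type: text/html\n')
  print('<html><body><h1>Hello, %s!</h1></body></html>' % user)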

Programming Python touches on many of these topics, though 
its GUI and C integration parts may be somewhat out of scope.
Frameworks like Flask are not covered by the book (in part 
because they change too often), but the book's fundamentals 
will help you better understand how such systems work.

2) Client-side domain

By contrast, client-side work is of course dominated by 
JavaScript, CSS, and HTML today, and Python tends to be used
more as a tactical tool in this domain.  I've posted a few 
representative examples of Python being applied in this role,
which I use almost daily to maintain my own website; see:

  https://learning-python.com/programs.html#internet

for Python scripts that generate HTML image galleries; 
implement site-search and code-display web pages; expand 
page macros at site build time; and more.  Python's a great 
tool for lots of tasks related to website development.  It 
also can handle web-page interactions that run on servers, 
though JavaScript is required for client-side page magic.

Programming Python's coverage of system tools, files, and 
text processing may be most relevant here, though its GUI 
topics may apply indirectly to JavaScript too; GUIs are GUIs,
and event-driven concepts are the same, whether you're using
tkinter+Python, or HTML+CSS+JavaScript with a DOM.  

It's also worth noting that JavaScript is a lot like Python
(albeit without system access in web pages); if you've mastered
Python programming fundamentals with books like mine, JavaScript 
should be a straightforward translation.  Moreover, it's not 
an either-or choice; my thumbspage program, for example, runs
Python to generate gallery pages, which in turn run JavaScript 
to size images.  Good engineers pick the tool to match the task. 

Best wishes,
--Mark Lutz, http://learning-python.com


Python Isn't Easy; It's Just Easier

[Jul-2022] A reader wrote to ask for help learning Python. The reply below gives my take on the task, along with some pointers on its size. I'm including it here because it's important to set expectations realistically in this field—especially when so many hucksters are bent on misleading newcomers out there.

> From: ...
> To: lutz@learning-python.com
> Subject: 
> Date: Thu, 14 Jul 2022 08:36:32 -0400
> 
> Hi Mark.
> 
> Can you help me learn Python? I have absolutely no coding experience at all
> and I am looking to get into it to get a better job to support my family.
> Will you please help me? I'm 40 and I feel this is way over my head.
> 
> Please help.


Hello,

I empathize with your situation, and once had a family to support
too, but there's not much I can do personally to assist.  I've
been blessed with a 7-digit readership, and naturally don't have
time to tutor every reader directly.  I'm happy to respond to  
focused questions about the book's material, but not much more.  
One thing I can offer here, though, is some friendly up-front 
advice that may help you set realistic goals and expectations.

First off, please know that programming is a skill which requires
a substantial time investment.  Even the basics are a large topic.
Online resources can't give you much more than surface-level intros,
so you should expect to go further if you wish to work in the field. 
In-depth books, in-person courses, and even multi-year degrees can 
improve both your knowledge base and job prospects enormously.

The marketing world has regrettably misled people into believing 
"coding" (an arguably derogatory term) is an easy skill to pick up.
It's not; computer science is a field as rich as law, medicine, or
the sciences.  To truly do well in it, you should learn it from the 
ground up - from computer architecture, to low-level languages like
assembler or C, to higher-level languages like Python.  The latter
makes most tasks easier, but to some extent assumes you know the
rest too, so you can recover when the sugar coating fails.

Even after the initial learning phase, you should expect to spend
years in practice to become proficient.  You can satisfy this 
time requirement with either work or a multi-year degree, but there
is no way to get around it.  Per the adage, practice does make 
perfect; you can and do learn as you go and on the job, but no 
skill can be mastered without applying it.

Having said all that, it's also true that some programming domains
may not require degrees, and can be picked up independently (e.g.,
web development, especially on the client side).  Even so, you 
shouldn't expect to learn these quickly, and should expect to spend
years honing the skills.  If you wouldn't expect to become a 
doctor in a hurry, you shouldn't expect the same of programming.

I should also note that everyone's path is different.  Mine took
BS and MS degrees plus decades of practice, but yours may differ
arbitrarily, so please take this advice with a healthy grain of 
salt.  I'm just trying to impart what I've seen of the entry 
requirements of this field, some of which you may already know,
and any of which may not fully apply to your plans.  In the end,
learning is as much about motivation as anything else.

However you proceed, I wish you all the best with whatever makes
sense for your life goals.  Programming is both rewarding and fun,
and I still do it today for pleasure instead of pay.  It's just not 
quite as trivial as some of the marketing implies.

Cheers,
--Mark Lutz, http://learning-python.com

 


