Finally, here are some of the problems you may come across when you start working with the larger features of the Python language — datatypes, functions, modules, classes, and the like. Because of space constraints, this section is abbreviated, especially with respect to advanced programming concepts; for the rest of the story, see the tips and "gotchas" sections of Learning Python, 2nd Edition.
When you use the open() call in Python to access an external file, Python does not use the module search path to locate the target file. It uses an absolute path you give, or assumes the filename is relative to the current working directory. The module search path is consulted only for module imports.
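As a small illustration of the point above, the following sketch shows that a relative filename passed to open() is resolved against the current working directory, even when the file's directory is on sys.path. It is written for Python 3, and the file name notes.txt and the two temporary directories are made up for the example.

```python
import os
import sys
import tempfile

original_cwd = os.getcwd()
data_dir = tempfile.mkdtemp()      # hypothetical directory that holds the file
other_dir = tempfile.mkdtemp()     # an unrelated, empty directory

# An absolute path is used exactly as given.
with open(os.path.join(data_dir, "notes.txt"), "w") as f:
    f.write("spam\n")

# Putting data_dir on the module search path does not help open().
sys.path.append(data_dir)

os.chdir(other_dir)
try:
    open("notes.txt").close()
    found_from_other_dir = True
except IOError:                    # file is not in the current working directory
    found_from_other_dir = False

# The same relative name works once the working directory matches.
os.chdir(data_dir)
with open("notes.txt") as f:
    text_from_data_dir = f.read()

# Restore the process state changed by this demo.
os.chdir(original_cwd)
sys.path.remove(data_dir)
```

The sys.path entry makes no difference here; only the working directory (or an absolute path) decides where open() looks.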
You can't use list methods on strings, and vice versa. In general, method calls are type-specific, but built-in functions may work on many types. For instance, the list reverse method only works on lists, but the len function works on any object with a length.
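A short sketch of that distinction, written for Python 3 with made-up sample values: len accepts several types, while reverse exists only on lists.

```python
items = [1, 2, 3]
text = "spam"

# Built-in functions such as len() accept many types.
lengths = [len(items), len(text), len((1, 2)), len({"a": 1})]

# reverse() is a list method; it changes the list in place.
items.reverse()

# Strings have no reverse method, so calling it raises AttributeError.
try:
    text.reverse()
    string_has_reverse = True
except AttributeError:
    string_has_reverse = False

# Likewise, upper() is a string method that lists do not provide.
list_has_upper = hasattr(items, "upper")
```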
Remember that you can't change an immutable object (e.g., tuple, string) in place:
T = (1, 2, 3)
T[2] = 4             # Error
Construct a new object with slicing, concatenation, and so on, and assign it back to the original variable if needed. Because Python automatically reclaims unused memory, this is not as wasteful as it may seem:
T = T[:2] + (4,) # Okay: T becomes (1, 2, 4)
for Loops Instead of while or range
When you need to step over all items in a sequence object from left to right, a simple for loop (e.g., for x in seq:) is simpler to code, and usually quicker to run, than a while- or range-based counter loop. Avoid the temptation to use range in a for unless you really have to; let Python handle the indexing for you. All three of the following loops work, but the first is usually better; in Python, simple is good.
S = "lumberjack"

for c in S: print c                   # simplest

for i in range(len(S)): print S[i]    # too much

i = 0                                 # way too much
while i < len(S): print S[i]; i += 1
In-place change operations such as the list.append() and list.sort() methods modify an object, but do not return the object that was modified (they return None); call them without assigning the result. It's not uncommon for beginners to say something like:
mylist = mylist.append(X)
to try to get the result of an append; instead, this assigns mylist to None, rather than the modified list. A more devious example of this pops up when trying to step through dictionary items in sorted-key fashion:
D = {...}
for k in D.keys().sort(): print D[k]
This almost works: the keys method builds a keys list, and the sort method orders it. But since the sort method returns None, the loop fails because it is ultimately a loop over None (a nonsequence). To code this correctly, split the method calls out into statements:
Ks = D.keys()
Ks.sort()
for k in Ks: print D[k]
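In later Python versions (2.4 and up), the built-in sorted() function returns a new sorted list, so it can be used directly in a loop header. The sketch below assumes Python 3 and a made-up dictionary D; it is offered as an alternative to the three-statement form, not as the approach this article prescribes.

```python
D = {"b": 2, "a": 1, "c": 3}       # hypothetical sample data

# sort() works in place and returns None, so its result cannot be iterated.
keys = list(D.keys())
result_of_sort = keys.sort()

# sorted() returns a new list, leaving the original iterable unchanged.
values_in_key_order = []
for k in sorted(D):
    values_in_key_order.append(D[k])
```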
In Python, an expression like 123 + 3.145 works: it automatically converts the integer to a floating-point number, and uses floating-point math. On the other hand, the following fails:
S = "42"
I = 1
X = S + I        # A type error
This is also on purpose, because it is ambiguous: should the string be converted to a number (for addition), or the number to a string (for concatenation)? In Python, we say that explicit is better than implicit (that is, EIBTI), so you must convert manually:
X = int(S) + I   # Do addition: 43
X = S + str(I)   # Do concatenation: "421"
Although fairly rare in practice, if a collection object contains a reference to itself, it's called a cyclic object. Python prints a [...] whenever it detects a cycle in the object, rather than getting stuck in an infinite loop:
>>> L = ['grail']     # Append reference back to L
>>> L.append(L)       # Generates cycle in object
>>> L
['grail', [...]]
Besides understanding that the three dots represent a cycle in the object, this case is worth knowing about because cyclic structures may cause code of your own to fall into unexpected loops if you don't anticipate them. If needed, keep a list or dictionary of items already visited, and check it to know if you have reached a cycle.
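One way to apply that advice is sketched below, for Python 3. The function name count_leaves is hypothetical and not from the book; it walks nested lists and keeps a set of the id() values of lists already visited, so it stops when it meets a list a second time.

```python
def count_leaves(obj, visited=None):
    """Count non-list items in a nested list, skipping lists already seen."""
    if visited is None:
        visited = set()            # ids of list objects visited so far
    if not isinstance(obj, list):
        return 1                   # a leaf item
    if id(obj) in visited:         # reached a cycle (or a shared list): stop
        return 0
    visited.add(id(obj))
    return sum(count_leaves(item, visited) for item in obj)

L = ['grail']
L.append(L)                        # L now contains a reference to itself
total = count_leaves(L)            # terminates instead of recursing forever
```

Without the visited check, the recursion on L would never reach a leaf and would eventually fail with a recursion error.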
Assignment creates references to objects, not copies. This is a core Python concept, which can cause problems when its behavior isn't expected. In the following example, the list object assigned to the name L is referenced both from L and from inside the list assigned to the name M. Changing L in place changes what M references, too, because there are two references to the same object:
>>> L = [1, 2, 3]         # A shared list object
>>> M = ['X', L, 'Y']     # Embed a reference to L
>>> M
['X', [1, 2, 3], 'Y']
>>> L[1] = 0              # Changes M too
>>> M
['X', [1, 0, 3], 'Y']
This effect usually becomes important only in larger programs, and shared references are normally exactly what you want. If they're not, you can avoid sharing objects by copying them explicitly; for lists, you can make a top-level copy by using an empty-limits slice:
>>> L = [1, 2, 3]
>>> M = ['X', L[:], 'Y']  # Embed a copy of L
>>> L[1] = 0              # Change only L, not M
>>> L
[1, 0, 3]
>>> M
['X', [1, 2, 3], 'Y']
Slice limits default to 0 and the length of the sequence being sliced. If both are omitted, the slice extracts every item in the sequence, and so makes a top-level copy (a new, unshared object). For dictionaries, use the dict.copy() method.
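The phrase "top-level" matters: a slice or dict.copy() copies only the outer container, so nested objects are still shared. The sketch below (Python 3, made-up values) shows this, and also uses copy.deepcopy from the standard library, which this article does not mention, for a fully independent copy.

```python
import copy

L = [1, [2, 3]]
top_copy = L[:]                    # new outer list, same nested objects
deep_copy = copy.deepcopy(L)       # nested objects are copied as well

L[0] = 99                          # top-level change: not seen by either copy
L[1][0] = 'changed'                # nested change: seen through top_copy only

D = {'a': 1, 'b': [2, 3]}
D_copy = D.copy()                  # top-level copy of a dictionary
D['a'] = 0                         # does not affect D_copy
```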
Python classifies names assigned in a function as locals by default; they live in the function's scope and exist only while the function is running. Technically, Python detects locals statically, when it compiles the def's code, rather than by noticing assignments as they happen at runtime. This can also lead to confusion if it's not understood. For example, watch what happens if you add an assignment to a variable after a reference:
>>> X = 99
>>> def func():
...     print X           # Does not yet exist
...     X = 88            # Makes X local in entire def
...
>>> func()                # Error!
You get an undefined name error, but the reason is subtle. While compiling this code, Python sees the assignment to X and decides that X will be a local name everywhere in the function. But later, when the function is actually run, the assignment hasn't yet happened when the print executes, so Python raises an undefined name error.
Really, the previous example is ambiguous: did you mean to print the global X and then create a local X, or is this a genuine programming error? If you really mean to print global X, you need to declare it in a global statement, or reference it through the enclosing module name.
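The sketch below, for Python 3, contrasts the failing pattern with the global-statement fix. The function names broken and with_global are made up for the illustration; in Python 3 the "undefined name" error surfaces as UnboundLocalError, a subclass of NameError.

```python
X = 99

def broken():
    try:
        value = X                  # X is local here (assigned below), not yet bound
    except UnboundLocalError:      # a subclass of NameError
        value = 'unbound'
    X = 88                         # this assignment makes X local in the whole def
    return value

def with_global():
    global X                       # X now refers to the module-level name
    before = X                     # reads the global X
    X = 88                         # rebinds the global X
    return before

broken_result = broken()
before = with_global()
```

After these calls the module-level X has been rebound to 88, which may or may not be what the original code intended; the global statement only resolves the ambiguity, it does not decide the intent for you.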
Default argument values are evaluated and saved once, when the def statement is run, not each time the function is called. That's usually what you want, but since defaults retain the same object between calls, you have to be mindful about changing mutable defaults. For instance, the following function uses an empty list as a default value and then changes it in place each time the function is called:
>>> def saver(x=[]):      # Saves away a list object
...     x.append(1)       # and changes it each time
...     print x
...
>>> saver([2])            # Default not used
[2, 1]
>>> saver()               # Default used
[1]
>>> saver()               # Grows on each call!
[1, 1]
>>> saver()
[1, 1, 1]
Some see this behavior as a feature — because mutable default arguments retain their state between function calls, they can serve some of the same roles as static local function variables in the C language. However, this can seem odd the first time you run into it, and there are simpler ways to retain state between calls in Python (e.g., classes).
To avoid this behavior, make copies of the default at the start of the function body with slices or methods, or move the default value expression into the function body; as long as the value resides in code that runs each time the function is called, you'll get a new object each time:
>>> def saver(x=None):
...     if x is None: x = []   # No arg passed?
...     x.append(1)            # Changes new list
...     print x
...
>>> saver([2])                 # Default not used
[2, 1]
>>> saver()                    # Doesn't grow now
[1]
>>> saver()
[1]
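For cases where retained state really is wanted, the earlier remark that classes offer a simpler route can be sketched as follows. This is a Python 3 illustration; the class name Saver and its save method are hypothetical, not code from the book.

```python
class Saver:
    """Keeps its own list between calls instead of relying on a mutable default."""

    def __init__(self):
        self.items = []            # a fresh list for each instance

    def save(self, value=1):
        self.items.append(value)   # state is retained on the instance
        return list(self.items)    # return a copy so callers cannot alter the state

saver = Saver()
first = saver.save()
second = saver.save()              # grows, but explicitly and per instance

other = Saver()
third = other.save()               # a separate instance starts empty
```

The state is visible as an attribute and each instance has its own, which avoids the surprise of a hidden list shared by every call.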
Here's a quick survey of other pitfalls we don't have space to cover in detail:
Statement order matters at the top level of a file: because running or importing a file runs its statements from top to bottom, make sure you put unnested calls to functions or classes below the definition of the function or class.
reload doesn't impact names copied with from: reload works much better with the import statement. If you use from statements, remember to rerun the from after the reload, or you'll still have old names.
The order of mixing matters in multiple inheritance: because superclasses are searched left to right, according to the order in the class header line, the leftmost class wins if the same name appears in multiple superclasses.
Empty except clauses in try statements may catch more than you expect. An except clause in a try that names no exception catches every exception, even things like genuine programming errors and the sys.exit() call.
Bunnies can be more dangerous than they seem.
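Two of the pitfalls in this survey are easy to demonstrate. The sketch below is written for Python 3, and the class names A, B, C, and D are made up. Note one assumption that differs from the Python release this article was written for: in current Python versions SystemExit derives from BaseException rather than Exception, so naming Exception in the except clause lets sys.exit() pass through.

```python
import sys

# An empty except clause intercepts even the SystemExit raised by sys.exit().
try:
    sys.exit(1)
except:                            # bare except, shown here only as the pitfall
    caught = sys.exc_info()[0]

# Naming Exception lets SystemExit propagate to an outer handler.
try:
    try:
        sys.exit(1)
    except Exception:
        passed_through = False
except SystemExit:
    passed_through = True

# Multiple inheritance: the leftmost superclass listed wins a name clash.
class A:
    def who(self):
        return 'A'

class B:
    def who(self):
        return 'B'

class C(A, B):                     # A is searched before B
    pass

class D(B, A):                     # B is searched before A
    pass

c_name = C().who()
d_name = D().who()
```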
Mark Lutz is the world leader in Python training, the author of Python's earliest and best-selling texts, and a pioneering figure in the Python community since 1992.
O'Reilly & Associates recently (in December 2003) released Learning Python, 2nd Edition.
Sample Chapter 19, "OOP: The Big Picture," is available free online.