[Update] For related and more recent results, also see the following pages:
This page documents a major performance regression that existed in Python 3.0 (3.0.0) which made most I/O operations much slower than they were in Python 2.X—up to 1,000 times slower in some cases.
This issue was largely repaired in later 3.X releases: Python 3.0.1 included minor improvements, and Python 3.1 fixed the problem more broadly, making this specific issue a moot point for most Python users today. That being said, 3.X's performance still generally lags behind that of 2.X for many other types of code not measured here; see Learning Python's benchmarking chapter for details.
This page describes the original 3.0 speed issue and the timing techniques used in testing, for historical context, and for users who might have 3.0.0 installed. The two pages listed above provide updated results to show the improvements made in later 3.X releases, but they employ testing techniques described on this page, so you'll probably want to start reading here first.
[Original post: January 2009] A crucial factor for many programmers to consider: 3.0 has major performance issues that may preclude its widespread use until a future optimizing release. While some operations run roughly as fast in 3.0 as in 2.6, others do not. In particular, file I/O is so slow in 3.0 as to be completely impractical for a large set of Python programs.
Before I get into details, I want to stress that this issue is hopefully a temporary one, and you probably don't need to care about it at all if you process small files only. This is primarily an issue for programs that spend a significant amount of their time scanning large files. However, that's common enough that the issue is sufficient cause for many programmers to avoid using the 3.X line until an optimized version is released (perhaps 3.0.2 or 3.1, given current development plans).
Unfortunately, this problem doesn't seem to be getting the priority it deserves today. Perhaps worse, broad 3.0 design decisions such as the string/unicode merge and I/O library redesign may make it difficult to ever bring 3.X up to speed with 2.X in terms of I/O performance. If you care about I/O speed, please feel free to help elevate this in the Python world.
So how bad is it? According to posted benchmarks starting to filter in, some binary I/O can be 3 to 4 times slower in 3.0, writing text files can be 5 to 8 times slower, and reading text files line-by-line can run 40 to 70 times slower in 3.0 than in 2.X (that's times, not percent!). Reading large files all-at-once can be even worse -- hundreds of times slower in 3.0 typically, and sometimes even slower than that.
This is apparently due to an I/O library rewrite (which aims to replace some of the underlying C library), changes in buffer allocation schemes, and possibly the new all-Unicode focus. The net effect can be observed in one posted test, in which a simple line-by-line iteration over a fairly large (66M) file:
for line in open('aBigFile.txt'): pass
takes roughly half a second (0.65 seconds) in Python 2.6, but between 32 and 42 seconds under Python 3.0, depending on whether the file is opened in text or binary mode. That is, 3.0 is between 48 and 66 times slower than 2.6 when reading text line-by-line.
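If you want to reproduce this sort of measurement yourself, a timing wrapper like the following sketch suffices; the file name is whatever large text file you have on hand, and time.clock is used here simply because these results come from Windows, where it gives good resolution in this Python era (time.time works elsewhere):

import time

start = time.clock()
for line in open('aBigFile.txt'):      # text mode; pass 'rb' to open for binary mode
    pass
print(time.clock() - start)            # elapsed seconds for the full scan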
Since I/O is a major time component for many programs that scan or massage data, going from half a second to 30 or 40 seconds is indeed a BIG DEAL, and a major disappointment to many Python users (myself included). Another way to look at this is that I/O time for programs that read text by lines has essentially increased from N seconds to N minutes in 3.0 -- enough to qualify as impractical for many programs.
Although reading line-by-line is probably the most common way to process text, this slowdown can get better or worse, depending on how you process files (though it's present in all normal modes). A formal timing benchmark written by Python developers shows that, compared to 2.6, 3.0 is:
Much worse, according to some reports, reading large files all at once might take up to a shocking 1,000 times longer in 3.0. For instance, one benchmark posted on comp.lang.python shows a .read() of a 17M file going from 33 msecs in 2.5 to 36.8 seconds in 3.0. That's a slowdown of 3 orders of magnitude, or 1,000X. Additionally, some tests show that:
Python 3.0 runs slower than 2.X overall too (3.0 reportedly runs the pystone benchmark 10% slower than 2.5), but its I/O speed is a critical regression. In fact, it's not an exaggeration to say that 3.0 has effectively broken Python performance for many users, at least until this is resolved.
For more background on this speed issue, see the next section, as well as the following web pages from which I gleaned some of the statistics above (try a web search for additional discussions):
Since writing the prior section, I've run some tests of my own to verify the speed problem in 3.0. Their code is available off-page, if you want to fetch it and try it on your own:
The short story is that I/O is at least as bad in 3.0 as noted by others, though non-I/O operations seem roughly as fast. In a bit more detail, the first of the files listed above times a non-I/O, CPU-intensive operation -- sequence iteration alternatives. When timing an operation like this, 2.6 is only slightly faster than 3.0:
C:\misc>C:\Python26\python timeSEQ.py
2.6 (r26:66721, Oct 2 2008, 11:35:03) [MSC v.1500 32 bit (Intel)]
forStatement        => 5.517, [-10000, -9998]...[9996, 9998]
listComprehension   => 3.958, [-10000, -9998]...[9996, 9998]
mapFunction         => 4.054, [-10000, -9998]...[9996, 9998]
generatorExpression => 4.560, [-10000, -9998]...[9996, 9998]

C:\misc>C:\Python30\python timeSEQ.py
3.0 (r30:67507, Dec 3 2008, 20:14:27) [MSC v.1500 32 bit (Intel)]
forStatement        => 5.753, [-10000, -9998]...[9996, 9998]
listComprehension   => 4.399, [-10000, -9998]...[9996, 9998]
mapFunction         => 4.492, [-10000, -9998]...[9996, 9998]
generatorExpression => 4.968, [-10000, -9998]...[9996, 9998]
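The timeSEQ.py script itself is among the files linked above; the following is only a rough sketch of the kind of test it runs, with a hypothetical operation, repetition count, and data size. The point is simply that each alternative builds the same result list with a different iteration tool, and the outer loop amplifies small per-iteration differences:

import sys, time

reps = 1000                                  # hypothetical repetition count
nums = list(range(-10000, 10000, 2))         # list() needed in 3.0, harmless in 2.6

def forStatement():
    res = []
    for x in nums:
        res.append(x + 0)
    return res

def listComprehension():
    return [x + 0 for x in nums]

def mapFunction():
    return list(map(lambda x: x + 0, nums))  # map returns an iterator in 3.0

def generatorExpression():
    return list(x + 0 for x in nums)

print(sys.version)
for test in (forStatement, listComprehension, mapFunction, generatorExpression):
    start = time.clock()
    for i in range(reps):
        res = test()
    elapsed = time.clock() - start
    print('%s => %.3f, %s...%s' % (test.__name__, elapsed, res[:2], res[-2:]))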
However, the rest of the files listed above time file I/O specifically. The last listed file summarizes the relative speeds of I/O on my Windows Vista laptop, under both 2.6 and 3.0. For a text test file, I concatenated Python 2.6's NEWS.txt file to itself multiple times; my binary test file is Python 3.0's MSI installer file for Windows; and the write tests each produce a 25M file, whether text or binary.
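For reference, the text test file can be rebuilt with a few lines like the following sketch; the source path and repeat count here are illustrative only, and any sizeable text file will do:

# build large.txt by concatenating a text file to itself several times
data = open(r'C:\Python26\NEWS.txt', 'rb').read()
out = open('large.txt', 'wb')
for i in range(8):                 # repeat until the file is a few megabytes
    out.write(data)
out.close()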
As you can see from the results file listed below, my results are in line with those reported by others and summarized above. Python 3.0 is much slower -- typically around 50 times slower for reads, and over 800 times slower in the worst case. Moreover, writing data can be up to 12 times slower than in 2.6, depending on how the data is written:
Output data sizes: 524288 50 50 26214400 26214400

[Python 2.6: large.txt, 2935401 bytes]
read_byLines_textMode      (large.txt=2.80M)  => 0.024772
read_byLines_binaryMode    (large.txt=2.80M)  => 0.015806
read_byBlocks_textMode     (large.txt=2.80M)  => 0.009921
read_byBlocks_binaryMode   (large.txt=2.80M)  => 0.001622
read_allAtOnce_textMode    (large.txt=2.80M)  => 0.011466
read_allAtOnce_binaryMode  (large.txt=2.80M)  => 0.003305

[Python 2.6: large.bin, 13168640 bytes]
read_byBlocks_binaryMode   (large.bin=12.56M) => 0.010219
read_allAtOnce_binaryMode  (large.bin=12.56M) => 0.016566

[Python 2.6: testIO.out, 26214400 bytes]
write_byLines_textMode     (testIO.out=25.00M) => 0.453313
write_byLines_binaryMode   (testIO.out=25.00M) => 1.235918
write_byBlocks_textMode    (testIO.out=25.00M) => 1.464019
write_byBlocks_binaryMode  (testIO.out=25.00M) => 0.560694
write_allAtOnce_textMode   (testIO.out=25.00M) => 0.174307
write_allAtOnce_binaryMode (testIO.out=25.00M) => 0.579587

Output data sizes: 524288 50 50 26214400 26214400

[Python 3.0: large.txt, 2935401 bytes]
read_byLines_textMode      (large.txt=2.80M)  => 1.024868
read_byLines_binaryMode    (large.txt=2.80M)  => 1.299153
read_byBlocks_textMode     (large.txt=2.80M)  => 0.539751
read_byBlocks_binaryMode   (large.txt=2.80M)  => 0.002815
read_allAtOnce_textMode    (large.txt=2.80M)  => 0.718675
read_allAtOnce_binaryMode  (large.txt=2.80M)  => 0.670614

[Python 3.0: large.bin, 13168640 bytes]
read_byBlocks_binaryMode   (large.bin=12.56M) => 0.014730
read_allAtOnce_binaryMode  (large.bin=12.56M) => 13.695692

[Python 3.0: testIO.out, 26214400 bytes]
write_byLines_textMode     (testIO.out=25.00M) => 5.409266
write_byLines_binaryMode   (testIO.out=25.00M) => 2.160819
write_byBlocks_textMode    (testIO.out=25.00M) => 4.927914
write_byBlocks_binaryMode  (testIO.out=25.00M) => 1.884943
write_allAtOnce_textMode   (testIO.out=25.00M) => 1.313119
write_allAtOnce_binaryMode (testIO.out=25.00M) => 0.603439

==Summary==
[read_byLines_textMode (large.txt=2.80M)]       => 3.0 is 41.372 times slower
[read_byLines_binaryMode (large.txt=2.80M)]     => 3.0 is 82.194 times slower
[read_byBlocks_textMode (large.txt=2.80M)]      => 3.0 is 54.405 times slower
[read_byBlocks_binaryMode (large.txt=2.80M)]    => 3.0 is 1.736 times slower
[read_allAtOnce_textMode (large.txt=2.80M)]     => 3.0 is 62.679 times slower
[read_allAtOnce_binaryMode (large.txt=2.80M)]   => 3.0 is 202.909 times slower
[read_byBlocks_binaryMode (large.bin=12.56M)]   => 3.0 is 1.441 times slower
[read_allAtOnce_binaryMode (large.bin=12.56M)]  => 3.0 is 826.735 times slower
[write_byLines_textMode (testIO.out=25.00M)]    => 3.0 is 11.933 times slower
[write_byLines_binaryMode (testIO.out=25.00M)]  => 3.0 is 1.748 times slower
[write_byBlocks_textMode (testIO.out=25.00M)]   => 3.0 is 3.366 times slower
[write_byBlocks_binaryMode (testIO.out=25.00M)] => 3.0 is 3.362 times slower
[write_allAtOnce_textMode (testIO.out=25.00M)]  => 3.0 is 7.533 times slower
[write_allAtOnce_binaryMode (testIO.out=25.00M)] => 3.0 is 1.041 times slower
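As a rough sketch of what the test names in this output imply, each case is just a timed call to a small reader or writer function along the following lines; the timer helper, block size, and the subset of calls shown here are illustrative only, and the actual scripts are available off-page as noted above:

import time

def timer(label, func, *args):
    start = time.clock()
    func(*args)
    print('%s => %.6f' % (label, time.clock() - start))

def read_byLines(name, mode):                # 'r' = text mode, 'rb' = binary mode
    for line in open(name, mode):
        pass

def read_byBlocks(name, mode, size=64 * 1024):
    f = open(name, mode)
    while f.read(size):
        pass

def read_allAtOnce(name, mode):
    open(name, mode).read()

timer('read_byLines_textMode    (large.txt)', read_byLines,   'large.txt', 'r')
timer('read_byLines_binaryMode  (large.txt)', read_byLines,   'large.txt', 'rb')
timer('read_byBlocks_binaryMode (large.txt)', read_byBlocks,  'large.txt', 'rb')
timer('read_allAtOnce_textMode  (large.txt)', read_allAtOnce, 'large.txt', 'r')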
Please study the linked source files for more details on these tests, but a few notes are in order:
Most of the tests I run are valid use cases for 3.0, as well as common 2.6 coding patterns that will be migrated to 3.0. Strictly speaking, because new 3.0 programs will likely make a binding choice between str for text data and bytes for binary data, the input tests "read_byLines_binaryMode" and "read_byBlocks_textMode", as well as the output tests "write_byLines_binaryMode" and "write_byBlocks_textMode", will probably be relatively atypical in 3.0 practice. Discounting these tests leaves the following:
[read_byLines_textMode (large.txt=2.80M)]       => 3.0 is 41.372 times slower
[read_byBlocks_binaryMode (large.txt=2.80M)]    => 3.0 is 1.736 times slower
[read_allAtOnce_textMode (large.txt=2.80M)]     => 3.0 is 62.679 times slower
[read_allAtOnce_binaryMode (large.txt=2.80M)]   => 3.0 is 202.909 times slower
[read_byBlocks_binaryMode (large.bin=12.56M)]   => 3.0 is 1.441 times slower
[read_allAtOnce_binaryMode (large.bin=12.56M)]  => 3.0 is 826.735 times slower
[write_byLines_textMode (testIO.out=25.00M)]    => 3.0 is 11.933 times slower
[write_byBlocks_binaryMode (testIO.out=25.00M)] => 3.0 is 3.362 times slower
[write_allAtOnce_textMode (testIO.out=25.00M)]  => 3.0 is 7.533 times slower
[write_allAtOnce_binaryMode (testIO.out=25.00M)] => 3.0 is 1.041 times slower
Further, the "allAtOnce" variants will not work for pathologically large files that are too big to fit in your computer's memory. If we discount these tests as well in the interest of robust programs, we're left with the following cases:
[read_byLines_textMode (large.txt=2.80M)]       => 3.0 is 41.372 times slower
[read_byBlocks_binaryMode (large.txt=2.80M)]    => 3.0 is 1.736 times slower
[read_byBlocks_binaryMode (large.bin=12.56M)]   => 3.0 is 1.441 times slower
[write_byLines_textMode (testIO.out=25.00M)]    => 3.0 is 11.933 times slower
[write_byBlocks_binaryMode (testIO.out=25.00M)] => 3.0 is 3.362 times slower
This still doesn't look great for 3.0, but at least it omits some of the more negative findings. I don't think it's valid to discount all the other cases this way, though, for two reasons:
In any event, even after discounting the cases that are arguably less common or less ideal in 3.0, it still comes out 41 times slower for reading lines from a text file, roughly 45% to 75% slower for reading binary data by blocks, and 3 to 12 times slower for writing data.
Please feel free to fetch and play with these tests on your own, varying test file sizes and other parameters. You should also test your specific file usage patterns. My tests try to capture all common and valid file processing modes for 2.6 and 3.0, but they are not necessarily universally applicable. Further, system caching, test ordering, and other factors can impact test outcomes, so always verify on your own. At the least, you can rerun these scripts in future 3.X releases to see if the issue has been resolved.
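If you do rerun the scripts under a newer release, note that the "times slower" figures in the summary above are just ratios of matching timings, which you can recompute with a small helper such as this hypothetical one:

def compare(times2x, times3x):
    # both arguments: {test_name: seconds} dicts from matching 2.X and 3.X runs
    for name in sorted(times2x):
        if name in times3x:
            ratio = times3x[name] / times2x[name]
            print('[%s] => 3.X is %.3f times slower' % (name, ratio))

compare({'read_byLines_textMode (large.txt=2.80M)': 0.024772},
        {'read_byLines_textMode (large.txt=2.80M)': 1.024868})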