For related results, see the following pages:
[July 2009] The results of Python 3.1's I/O optimization are in (as measured by my test scripts, at least). They are listed in full below. As they show, 3.1 has closed the gap radically, though 2.6 is still slightly faster in most cases, and noticeably so in some output cases. For instance, 3.1 is 4-8 times slower on some text output tests, but is actually faster than 2.6 for some binary output cases. Moreover, there are no longer any pathologically bad cases in 3.1, and the worst cases (writing text) likely stem from the overhead of encoding Unicode text when writing it to files, an unavoidable cost of 3.X's string model.
For nearly all users, the I/O speed difference in 3.1 has become minor enough that it won't matter. To them, the choice between 2.6 and 3.1 now becomes more about subjective language design issues and pragmatic software dependencies than about raw I/O speed.
Although I personally think that releasing 3.0 with such a major performance issue was a bit of a misstep, the issue is now largely solved for most Python programmers (and may be further improved over time). That is, Python 3.1 makes Python 3.X a viable and practical tool. As usual, be sure to run these and other scripts on your own machine to verify the impact on your systems.
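As a rough illustration of the encoding cost mentioned above for text output, the following sketch writes the same content once in text mode and once in binary mode. It assumes a current Python 3 rather than 3.1, and the file names and sample text are hypothetical; it is not one of the test scripts used for the results on this page.

```python
import os
import tempfile

text = "spam and eggs\n" * 1000      # a str: Unicode text in 3.X

with tempfile.TemporaryDirectory() as tmp:
    text_path = os.path.join(tmp, "text.out")      # hypothetical file names
    bin_path = os.path.join(tmp, "binary.out")

    # Text mode: write() must encode the str to bytes on the way out.
    # newline="" disables newline translation, so encoding is the only extra step.
    with open(text_path, "w", encoding="utf-8", newline="") as f:
        f.write(text)

    # Binary mode: the caller encodes once, and write() transfers bytes as-is.
    data = text.encode("utf-8")
    with open(bin_path, "wb") as f:
        f.write(data)

    # Both files should hold identical bytes; only the work done in write() differs.
    with open(text_path, "rb") as f:
        text_bytes = f.read()
    with open(bin_path, "rb") as f:
        bin_bytes = f.read()
    same_bytes = text_bytes == bin_bytes

print(same_bytes)
```

The point is only that text-mode output carries an encoding step that binary-mode output does not; how much that step costs on a given machine is what the timing tests measure.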
Below are the scripts I used to test:
For 3.1, I changed these slightly to run 3.1 instead of 3.0, and to skip trying to get the size of the nonexistent output file on the first run. I also coded an alternative timer that takes the lowest (best) time instead of the average time, to better discount system load fluctuation, but did not use it for these tests initially because its results are not completely comparable with the earlier, averaged ones. See lines labeled "# CHANGED" for changes made, and see the 3.0 test notes page for details on the input files I created for tests.
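The difference between an average-time timer and a best-time timer can be sketched as follows. This is a minimal illustration, not the code from the test scripts: the names `timer_average`, `timer_best`, and `work` are hypothetical, and `time.perf_counter` is assumed to be available (it was added to Python after 3.1).

```python
import time

def timer_average(func, *args, reps=5):
    """Return the mean elapsed time of func(*args) over reps calls."""
    start = time.perf_counter()
    for _ in range(reps):
        func(*args)
    return (time.perf_counter() - start) / reps

def timer_best(func, *args, reps=5):
    """Return the lowest elapsed time of func(*args) over reps calls."""
    best = float("inf")
    for _ in range(reps):
        start = time.perf_counter()
        func(*args)
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)      # keep only the fastest run
    return best

def work(n):
    """Hypothetical workload standing in for a file I/O test."""
    return sum(range(n))

avg = timer_average(work, 100000)
best = timer_best(work, 100000)
print("average %.6f, best %.6f" % (avg, best))
```

Taking the minimum discards runs that were slowed by other activity on the machine, while the average folds that noise into the result.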
Sequence operations and iteration are still fairly close:
C:\test\py31io>C:\python26\python timeSEQ.py
2.6.1 (r261:67517, Dec 4 2008, 16:51:00) [MSC v.1500 32 bit (Intel)]
forStatement => 4.237, [-10000, -9998]...[9996, 9998]
listComprehension => 3.072, [-10000, -9998]...[9996, 9998]
mapFunction => 3.035, [-10000, -9998]...[9996, 9998]
generatorExpression => 3.484, [-10000, -9998]...[9996, 9998]

C:\test\py31io>C:\python31\\python timeSEQ.py
3.1 (r31:73574, Jun 26 2009, 20:21:35) [MSC v.1500 32 bit (Intel)]
forStatement => 4.741, [-10000, -9998]...[9996, 9998]
listComprehension => 3.182, [-10000, -9998]...[9996, 9998]
mapFunction => 3.726, [-10000, -9998]...[9996, 9998]
generatorExpression => 4.051, [-10000, -9998]...[9996, 9998]
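For readers unfamiliar with the four alternatives timed above, here is a sketch of what such functions might look like. The actual timeSEQ.py is not shown on this page, so the operation (doubling each item) and the input range are guesses chosen only because they reproduce the result values displayed in the output; the function names are likewise hypothetical.

```python
def for_statement(values):
    result = []
    for x in values:                 # explicit loop with append
        result.append(x * 2)
    return result

def list_comprehension(values):
    return [x * 2 for x in values]

def map_function(values):
    # list() forces the result in 3.X, where map returns an iterator
    return list(map(lambda x: x * 2, values))

def generator_expression(values):
    return list(x * 2 for x in values)

values = list(range(-5000, 5000))    # assumed input, not taken from the script
tests = (for_statement, list_comprehension, map_function, generator_expression)
results = [func(values) for func in tests]

for func, res in zip(tests, results):
    print("%-20s => %s...%s" % (func.__name__, res[:2], res[-2:]))
```

All four build the same list, so any timing difference between them reflects the iteration construct rather than the work done per item.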
Full results show that file I/O has improved radically in 3.1:
C:\test\py31io>C:\python26\python timebothCMP.py
Output data sizes: 524288 50 50 26214400 26214400
[Python 2.6.1: large.txt, 2935401 bytes]
read_byLines_textMode (large.txt=2.80M) => 0.030083
read_byLines_binaryMode (large.txt=2.80M) => 0.019454
read_byBlocks_textMode (large.txt=2.80M) => 0.012025
read_byBlocks_binaryMode (large.txt=2.80M) => 0.002432
read_allAtOnce_textMode (large.txt=2.80M) => 0.013743
read_allAtOnce_binaryMode (large.txt=2.80M) => 0.003547
[Python 2.6.1: large.bin, 13168640 bytes]
read_byBlocks_binaryMode (large.bin=12.56M) => 0.011582
read_allAtOnce_binaryMode (large.bin=12.56M) => 0.021405
[Python 2.6.1: testIO.out, 26214400 bytes]
write_byLines_textMode (testIO.out=25.00M) => 0.550882
write_byLines_binaryMode (testIO.out=25.00M) => 1.518067
write_byBlocks_textMode (testIO.out=25.00M) => 1.663252
write_byBlocks_binaryMode (testIO.out=25.00M) => 0.648673
write_allAtOnce_textMode (testIO.out=25.00M) => 0.205738
write_allAtOnce_binaryMode (testIO.out=25.00M) => 0.711636

Output data sizes: 524288 50 50 26214400 26214400
[Python 3.1: large.txt, 2935401 bytes]
read_byLines_textMode (large.txt=2.80M) => 0.046228
read_byLines_binaryMode (large.txt=2.80M) => 0.019537
read_byBlocks_textMode (large.txt=2.80M) => 0.029263
read_byBlocks_binaryMode (large.txt=2.80M) => 0.003759
read_allAtOnce_textMode (large.txt=2.80M) => 0.025534
read_allAtOnce_binaryMode (large.txt=2.80M) => 0.004716
[Python 3.1: large.bin, 13168640 bytes]
read_byBlocks_binaryMode (large.bin=12.56M) => 0.015339
read_allAtOnce_binaryMode (large.bin=12.56M) => 0.028781
[Python 3.1: testIO.out, 26214400 bytes]
write_byLines_textMode (testIO.out=25.00M) => 2.632230
write_byLines_binaryMode (testIO.out=25.00M) => 0.966703
write_byBlocks_textMode (testIO.out=25.00M) => 2.098336
write_byBlocks_binaryMode (testIO.out=25.00M) => 0.532392
write_allAtOnce_textMode (testIO.out=25.00M) => 1.579205
write_allAtOnce_binaryMode (testIO.out=25.00M) => 0.774765

==Summary==
[read_byLines_textMode (large.txt=2.80M)] => 3.1 is 1.537 times slower
[read_byLines_binaryMode (large.txt=2.80M)] => 3.1 is 1.004 times slower
[read_byBlocks_textMode (large.txt=2.80M)] => 3.1 is 2.434 times slower
[read_byBlocks_binaryMode (large.txt=2.80M)] => 3.1 is 1.546 times slower
[read_allAtOnce_textMode (large.txt=2.80M)] => 3.1 is 1.858 times slower
[read_allAtOnce_binaryMode (large.txt=2.80M)] => 3.1 is 1.330 times slower
[read_byBlocks_binaryMode (large.bin=12.56M)] => 3.1 is 1.324 times slower
[read_allAtOnce_binaryMode (large.bin=12.56M)] => 3.1 is 1.345 times slower
[write_byLines_textMode (testIO.out=25.00M)] => 3.1 is 4.778 times slower
[write_byLines_binaryMode (testIO.out=25.00M)] => 3.1 is 0.637 times slower
[write_byBlocks_textMode (testIO.out=25.00M)] => 3.1 is 1.262 times slower
[write_byBlocks_binaryMode (testIO.out=25.00M)] => 3.1 is 0.821 times slower
[write_allAtOnce_textMode (testIO.out=25.00M)] => 3.1 is 7.676 times slower
[write_allAtOnce_binaryMode (testIO.out=25.00M)] => 3.1 is 1.089 times slower
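The "times slower" figures in the summary are consistent with dividing each 3.1 time by the corresponding 2.6 time. The sketch below reproduces two of them from the timings listed above; the function name `slowdown` is hypothetical and this is not the comparison script itself.

```python
def slowdown(base_secs, new_secs):
    """Ratio of the new time to the base time; below 1.0 means the new run was faster."""
    return new_secs / base_secs

# (2.6 time, 3.1 time) pairs copied from the full results above
timings = {
    "write_byLines_textMode": (0.550882, 2.632230),
    "write_byLines_binaryMode": (1.518067, 0.966703),
}

for name, (t26, t31) in timings.items():
    print("[%s] => 3.1 is %.3f times slower" % (name, slowdown(t26, t31)))
```

Note that a ratio under 1.0, such as the 0.637 reported for write_byLines_binaryMode, means 3.1 was actually faster than 2.6 on that test, despite the "times slower" wording.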
The 3.1 full results summary above compares favorably with that of both 3.0.1 and 3.0 run earlier on the same test machine (see the links to the prior version's test pages at the top and bottom of this page for more details about their test results):
# 3.0.0 results
[read_byLines_textMode (large.txt=2.80M)] => 3.0 is 44.222 times slower
[read_byLines_binaryMode (large.txt=2.80M)] => 3.0 is 83.734 times slower
[read_byBlocks_textMode (large.txt=2.80M)] => 3.0 is 55.707 times slower
[read_byBlocks_binaryMode (large.txt=2.80M)] => 3.0 is 1.910 times slower
[read_allAtOnce_textMode (large.txt=2.80M)] => 3.0 is 55.298 times slower
[read_allAtOnce_binaryMode (large.txt=2.80M)] => 3.0 is 131.986 times slower
[read_byBlocks_binaryMode (large.bin=12.56M)] => 3.0 is 1.532 times slower
[read_allAtOnce_binaryMode (large.bin=12.56M)] => 3.0 is 564.707 times slower
[write_byLines_textMode (testIO.out=25.00M)] => 3.0 is 12.212 times slower
[write_byLines_binaryMode (testIO.out=25.00M)] => 3.0 is 1.776 times slower
[write_byBlocks_textMode (testIO.out=25.00M)] => 3.0 is 3.509 times slower
[write_byBlocks_binaryMode (testIO.out=25.00M)] => 3.0 is 2.740 times slower
[write_allAtOnce_textMode (testIO.out=25.00M)] => 3.0 is 8.462 times slower
[write_allAtOnce_binaryMode (testIO.out=25.00M)] => 3.0 is 0.527 times slower

# 3.0.1 results
[read_byLines_textMode (large.txt=2.80M)] => 3.0 is 34.401 times slower
[read_byLines_binaryMode (large.txt=2.80M)] => 3.0 is 88.652 times slower
[read_byBlocks_textMode (large.txt=2.80M)] => 3.0 is 10.254 times slower
[read_byBlocks_binaryMode (large.txt=2.80M)] => 3.0 is 1.884 times slower
[read_allAtOnce_textMode (large.txt=2.80M)] => 3.0 is 6.638 times slower
[read_allAtOnce_binaryMode (large.txt=2.80M)] => 3.0 is 4.283 times slower
[read_byBlocks_binaryMode (large.bin=12.56M)] => 3.0 is 1.489 times slower
[read_allAtOnce_binaryMode (large.bin=12.56M)] => 3.0 is 10.988 times slower
[write_byLines_textMode (testIO.out=25.00M)] => 3.0 is 8.886 times slower
[write_byLines_binaryMode (testIO.out=25.00M)] => 3.0 is 2.008 times slower
[write_byBlocks_textMode (testIO.out=25.00M)] => 3.0 is 4.095 times slower
[write_byBlocks_binaryMode (testIO.out=25.00M)] => 3.0 is 4.189 times slower
[write_allAtOnce_textMode (testIO.out=25.00M)] => 3.0 is 5.197 times slower
[write_allAtOnce_binaryMode (testIO.out=25.00M)] => 3.0 is 0.820 times slower
Paring down for common and robust use cases again leaves the following:
# 3.1 results
[read_byLines_textMode (large.txt=2.80M)] => 3.1 is 1.537 times slower
[read_byBlocks_binaryMode (large.txt=2.80M)] => 3.1 is 1.546 times slower
[read_byBlocks_binaryMode (large.bin=12.56M)] => 3.1 is 1.324 times slower
[write_byLines_textMode (testIO.out=25.00M)] => 3.1 is 4.778 times slower
[write_byBlocks_binaryMode (testIO.out=25.00M)] => 3.1 is 0.821 times slower
This again compares favorably with prior release results:
# 3.0.1 results
[read_byLines_textMode (large.txt=2.80M)] => 3.0 is 34.401 times slower
[read_byBlocks_binaryMode (large.txt=2.80M)] => 3.0 is 1.884 times slower
[read_byBlocks_binaryMode (large.bin=12.56M)] => 3.0 is 1.489 times slower
[write_byLines_textMode (testIO.out=25.00M)] => 3.0 is 8.886 times slower
[write_byBlocks_binaryMode (testIO.out=25.00M)] => 3.0 is 2.740 times slower
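The pared-down cases compared above correspond to four everyday file patterns: reading and writing text by lines, and reading and writing binary data by blocks. A minimal sketch of those patterns follows, assuming a current Python 3; the function names, block size, and sample data are hypothetical and are not taken from the test scripts.

```python
import os
import tempfile

def write_by_lines_text(path, lines):
    with open(path, "w") as f:           # text mode: str in, encoded on write
        for line in lines:
            f.write(line)

def read_by_lines_text(path):
    with open(path, "r") as f:           # text mode: file iterator yields str lines
        return sum(1 for line in f)

def write_by_blocks_binary(path, block, count):
    with open(path, "wb") as f:          # binary mode: bytes written unchanged
        for _ in range(count):
            f.write(block)

def read_by_blocks_binary(path, blocksize=1024):
    total = 0
    with open(path, "rb") as f:
        while True:
            block = f.read(blocksize)    # returns b"" at end of file
            if not block:
                break
            total += len(block)
    return total

with tempfile.TemporaryDirectory() as tmp:
    text_path = os.path.join(tmp, "sample.txt")
    bin_path = os.path.join(tmp, "sample.bin")
    write_by_lines_text(text_path, ["spam\n"] * 1000)
    write_by_blocks_binary(bin_path, b"\x00" * 1024, 10)
    line_count = read_by_lines_text(text_path)
    byte_count = read_by_blocks_binary(bin_path)

print(line_count, byte_count)
```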
Finally, running with the timerBest() alternative to take the best time instead of the average time has little impact on the results in 3.1 -- a finding on timing techniques in general, not on 3.X speed:
==Summary==
[read_byLines_textMode (large.txt=2.80M)] => 3.1 is 1.566 times slower
[read_byLines_binaryMode (large.txt=2.80M)] => 3.1 is 1.022 times slower
[read_byBlocks_textMode (large.txt=2.80M)] => 3.1 is 2.898 times slower
[read_byBlocks_binaryMode (large.txt=2.80M)] => 3.1 is 1.364 times slower
[read_allAtOnce_textMode (large.txt=2.80M)] => 3.1 is 1.945 times slower
[read_allAtOnce_binaryMode (large.txt=2.80M)] => 3.1 is 0.997 times slower
[read_byBlocks_binaryMode (large.bin=12.56M)] => 3.1 is 1.295 times slower
[read_allAtOnce_binaryMode (large.bin=12.56M)] => 3.1 is 1.000 times slower
[write_byLines_textMode (testIO.out=25.00M)] => 3.1 is 4.684 times slower
[write_byLines_binaryMode (testIO.out=25.00M)] => 3.1 is 0.553 times slower
[write_byBlocks_textMode (testIO.out=25.00M)] => 3.1 is 1.250 times slower
[write_byBlocks_binaryMode (testIO.out=25.00M)] => 3.1 is 0.876 times slower
[write_allAtOnce_textMode (testIO.out=25.00M)] => 3.1 is 8.042 times slower
[write_allAtOnce_binaryMode (testIO.out=25.00M)] => 3.1 is 1.288 times slower