One of the work items that kept me occupied over the past week was writing a computational physics simulation modelling “quantum potential wells using a Monte Carlo simulation”:http://www.phys.soton.ac.uk/teach/year3/notes/ph314/projects/pjtb2.pdf.
I chose to write the simulation in “Python”:http://www.python.org instead of C - my feeling was this would be quicker to get a working simulation with, and I could always rewrite in C if it was very slow. i didn’t expect it to be - I’d read a lot of reports about Scientists using Python, so figured it would fine. I also picked Python as this would force me to learn the language, something that if I pursue “a particular route”:http://www.ses.soton.ac.uk/projects/Comp_Eng_Des/comp_eng_des.html after graduating would be very useful.
Due to some problems, the lecturer overseeing this simulation ended up writing the simulation also, in the departmental version of Basic.
At that point I realised how slow the Python code was. I didn’t expect it to be as fast as C, but was surprised by how much faster the Basic version was. It wasn’t impossibly slow, but running a million random walks of up to 300 steps was taking nearly 2 hours to complete. I tried plugging in Psyco to speed up the Python code, without much luck.
I don’t know if there were ways to get the Python code faster (short of using Pyrex), but I’m posting the code here to see what people think. Given it was the first proper Python program I’ve done I expect it to be a bit slow, but wasn’t expecting it to be this much slower.
One other point that came out was that the Python code produced a greater scatter on the results than the Basic version. I think this has to come down to the random number generator, but thoughts on this would be appreciated as well.
To the code:
* “Python code for a 1 dimensional potential well”:/physics/simulations/rand_walks_qm/1D_0.py
* “New Forest Basic code for a 1 dimensional potential well”:/physics/simulations/rand_walks_qm/1D_0.nfb (slightly more basic than mine - walks aren’t killed after X steps, but continue to arrest). You’ll also need the editor/compiler/interpreter (”Windows”:http://www.phys.soton.ac.uk/teach/year2/notes/lab/software/nfbwin.exe or “Linux”:http://www.phys.soton.ac.uk/teach/year2/notes/lab/software/nfblnx) - which in turn requires “TCL”:http://www.tcl.tk/ to be installed.
* Python code for a 2 dimensional potential well (”ground state”:/physics/simulations/rand_walks_qm/2D_0.py , “first potential”:/physics/simulations/rand_walks_qm/2D_1.py , “second potential”:/physics/simulations/rand_walks_qm/2D_2.py ). Note that this code does not relate well to the real values expected; I didn’t work out what the problem was before I had to hand in the report.
I just ran across dirtSimple’s post “The perception of speed”:http://dirtsimple.org/2004/12/perception-of-speed.html today, so it seems I’ve written this post at a time when the subject’s topical ![]()
29 comments ↓
I tried a few things and 1D_0.py went from 15s to 5s on my machine. The easiest is to put all of the code after the psyco construct into a function before the psyco construct and call it after the psyco contruct. psyco only optimizes code inside functions and methods. That block of code is a big portion of the runtime of your program so you’ll want psyco to see it.
Everything else offers smaller gains, but optimizations like these often lead to more readability as well.
You can replace the two lines that initialize “bins” with “bins = [0] * MAXSTEPS”.
Your implementation of RandomWalk.walk can be replaced with “self.directions[ 0 ] += random.choice((-1, 1))”.
Your implementation of RandomWalk.arrested can be replaced with “return self.distance() == J”.
The loop in the distance method does a bunch of unneeded lookups. Instead try “for coordinate in self.directions: sum += coordinate**2″.
Any calls to _newlist (which doesn’t appear to be used in 1D) can be replaced with “[defaultvalue] * length”.
But these are all incremental improvements. The place where Python performance really shines is in algorithmic experimentation. Once you have a certain level of proficiency in Python, you will save enough development time over lower level languages, that you’ll have time to experiment with alternate algorithms. Instead of getting 10% and 20% speed improvements, this can lead to 5x or 10x improvements or more.
Couple of comments — the previous commenter pointed out some good ideas, and let me add:
Something that is good for pretty much any programming language is storing intermediate results and then referencing the variable rather than calling the function again. You are using len() in a lot of tight loops, and while the call is fast, it adds up.
Look into simplifying the math — sqrt is an expensive call, and unnecessary if all you are doing is comparing. Square the number that you are going to compare against the calculated distance, and then eliminate the call to sqrt.
Learn list comprehensions. Python loves them and optimizes the heck out of them. If you are coding the construct “for i in range(len(foo)): foo[i] = something(foo[i])” a lot, try using “[something(x) for x in foo]”
I’ll repeat myself — Python loves list comprehensions. Learn them, use them, love them.
Learn the standard library — the random.choice() method, mentioned by the previous caller, can replace almost your entire walk() method.
Before attempting to optimize, use the Python profiler to find out where the slowness actually is. It may surprise you. It usually surprises me.
Hmmm. I did a direct basic to python translation of the professors code, and wanted to try sending it to you, but I couldn’t find your address anywhere.
Send me an email if you’d like to see it.
Thanks David. For your use and others, my email address is
peter
#not this
@
#not this
mapledesign.co.uk
Try using
dirn = random.random() < 0.5
instead of
dirn = random.randint(0,1).
The random() call is much faster than randint(), for some reason.
This, combined with the sorts of changes Jimmy Retzlaff advised, actually makes the python script speedy on my system.
First, some style comments.
* It’s a good practice to avoid using the names of builtins as variable or function names. In particular, your filter function shadows the builtin of that name, and the sum variable in RandomWalk.distance.
* Declaring directions in the class definition (above the __init__) is not necessary.
* You call random.seed() every time you create a RandomWalk object — this is usually a bad idea. You should do this once at the beginning of the program.
* RandomWalk.distance() can be written simpler:
s = 0
for d in self.directions:
s += d**2
return s
where I return distance**2 instead, and change the compare in RandomWalk.walk() to compare against the (precomputed) J**2.
* Inlining RandomWalk.distance() into RandomWalk.arrested gave me a 10% speed-up.
* Your other programs for the 2D well are all similiar. I find it’s much easier, instead of maintaining several programs that differ by little bits, to make a module that encompasses most of the functionality, and subclass portions as necessary. This is of course a general software practice, and hopefully obvious, but I’ve see *a lot* of copy-and-pasted scientific code!!
* I’d pass J as a parameter to RandomWalk.
* Try moving the loop that walks the walker into the RandomWalk object. I’d probably make it a generator, returning True or False for each step. That won’t necessarily make it faster (now), but it does allow you do optimizations in one place in the loop, like inlining the arrested() test.
To make this even faster, I would seriously consider using Pyrex. I’d probably use the GSL library for the random-number generator, and make sure I’m calling C routines instead of Python ones (C’s sqrt instead of Python’s sqrt, for instance). A quick implementation of 1D_0.py gave a 4x speedup. More work speeds it up to 25x the original.
That’s all I got. Hope it helps.
If you’re using Python 2.4, try rewriting the ‘arrested’ method as:
return sqrt(sum(x*x for x in self.directions))==J
Or better yet:
return sum(x*x for x in self.directions)==J2
(where J2 = J*J)
This should give you an enormous speedup. You should also get another small speedup by computing ‘range(MAXSTEPS)’ *once*, and keeping it in a constant, rather than computing it 20,000 times as you are currently doing. There are other inefficient things being done in your code, but these two are probably the most immediate in terms of bang for the buck.
Also, the Basic program isn’t doing the same thing as your program is doing — it’s doing several things differently, including the fact that it’s not doing any floating point math and doesn’t make any function calls, so of course it’s going to be faster! This is an apples-to-oranges comparison. If you wrote a straightforward translation of the Basic program, it would be a hell of a lot faster than your Python program, and would probably produce similar output as well.
Scientists who use Python use “SciPy”:http://www.scipy.org/
I rewrote the ‘interesting’ part of the fastwalk using SciPy (mostly Numeric). The 2D case runs in 14 seconds, compared to 40 seconds with your original 1D code. This is n-dimensional. Further questions, try mierle with an at gmail.com.
“fastwalk.py”:http://www.ecf.utoronto.ca/~mierle/fastwalk.py
You just have to learn to think in terms of vectorizing. Think, “How can I randomly walk the drunkards, all at the same time?” Check out my code for more details.
(Reposting from an email copy. Linefeeds for the code examples may cause the code to be wrapped in strange ways)
Saw your comments referred to on the Python Daily URL
So as best I can tell from your first code sample the problem is related to doing statistics on a histogram of how many steps it takes a random walk to go a certain distance.
First of all, I’m not sure how you could have read about scientific use of Python without also reading about using it with one of the two array packages available for it. For most cases,Python is not nearly as useful for scientific programming without using one of these if you want to write all your code in Python (it still is very useful for scripting codes written in C, C++, and Fortran). As you can see, large numbers of loops over simple operations will kill you.
So run, don’t walk, to either numarray or Numeric (aka numpy). You goal should be to do as much in array manipulations as possible rather than explicit loops. Some problems lend themselves well to that, some don’t. This looks like it lends itself moderately well (it will still be slower than corresponding C code by a factor of at least a few) but should be way faster than the way you did it. The sample I show below is wasteful in that it will compute the random walk for all steps, not just the “arrest”. I’m guessing that as long as the average number for the arrest is large enough, it isn’t too wasteful. There are other ways of coding this that are a little more complex that will approach C speed. Coding things using array manipulations require thinking about the problem in a different way. At some point, the contortions to do that aren’t worth it, but this doesn’t appear to be one of those cases.
So here is an alternate way of coding your example (note that it is a lot shorter too) using numarray (why numarray? Because we wrote it :-). The results don’t seem to agree quite with yours. I probably have an off by 2 error in what you call y at least but that doesn’t account for all of it. This version runs about 10 times faster on my computer as compared to the non psyco version, 3 seconds vs 32 seconds (though I’m not folding the the startup time of starting python and importing numarray). It can be made faster still by performing the calculation on the ensemble of drunkards (as a 1-d array), and only propagating the calculation for those that haven’t been arrested (each iteration results in a smaller ensemble).
You can also look at tools like weave in scipy for speeding up numerical computations (it can convert expressions to C). It depends on how close to C you want to get.
from numarray import *
from numarray.random_array import randint
from numarray.nd_image import histogram
MAXSTEPS = 300
NUM_DRUNKARDS = 20000
J = 12
DIRECTIONS = 1
def rw():
# This will use a lot of memory if maxsteps * num_drunkards is
large.
# In that case just loop over a large ensemble. The overhead of looping
# over large computations is negligible.
rd = randint(0,2, (NUM_DRUNKARDS, MAXSTEPS))
# turn into -1, +1 values
rd = 2*rd - 1
# accumulate for net distance along axis (not hard to generalize
# to n dimensions)
dist = abs(add.accumulate(rd, axis=1))
# select cases where arrest occurs
arrests = where(logical_or.reduce(dist >= J, axis=1))
print “number of arrests = “, len(arrests[0]), ” out of “,
NUM_DRUNKARDS, ” trials”
# Now construct histogram
# First determine location of first arrest
arrestdist = dist[arrests] # selecting only cases with arrests
arreststep = (arrestdist >= J).argmax() # index location of first
dist=j
arresthist = histogram(arreststep,0,MAXSTEPS, MAXSTEPS+1)
# Not quite following what was meant by survivor and the filter, but
# this seems to be the equivalent calculation (but my results don’t
agree)
survivors = arresthist[1::2].astype(Float32) # only even values can
be nonzero
x = survivors.copy()
x[where(x == 0.)] = e
x = log(x) # you did want natural logs, right?
y = arange(len(x))*2
yerr = 0.*survivors
nzs = where(survivors > 0)
yerr[nzs] = 1./sqrt(survivors[nzs])
for i in arange(len(y)): # instead of to a file as the example does
print y[i], x[i], yerr[i]
I’ve somewhat shortiefied/speedified the 1d example.
see source here http://codeflow.org/files/1d_0
Python kinda sorta slow
The ‘Python code for a 1 dimensional potential well’ measures nothing more as that python function calls are slow.
Use the python profiler to see this fact. Remove unused code and clean up all for higher readability.
Read:
http://trific.ath.cx/resources/python/optimization/
http://www.python.org/doc/essays/list2str.html
http://manatee.mojam.com/~skip/python/fastpython.html
http://www.python.org/doc/faq/programming.html
Part 1.1.5 My program is too slow. How do I speed it up?
Re. a previous comment:
“”"Learn list comprehensions. Python loves them and optimizes the heck out of them. If you are coding the construct “for i in range(len(foo)): foo[i] = something(foo[i])” a lot, try using “[something(x) for x in foo]”
“”"
I don’t believe this is true at all. I have always heard that list comprehensions are simply syntactic sugar for the equivalent for loop, with maybe some allocation-time savings if the size of the list can be guessed. I tried your specific example and measured NO benefit to using list comprehensions for that case. I’d love to be proven wrong, though.
Python is much slower than C, and compiled BASIC with the most popular BASIC compiler is relatively close to C in speed. You only notice the difference rarely, maybe 5% of all programs where it makes sufficient difference that you can even tell how slow it is. This program is one of those. No big deal — even if you notice a speed difference, it doesn’t mean that Python is not the preferred tool. Suppose a program like this runs in 1 minute if written in some bad language and takes 20 minutes if done in Python. Python costs you 19 minutes of run time, but CPU time on a desktop is the closest thing to free that we have. The most expensive scenario is if you spend the 19 minutes watching the program run, and it costs you 19 minutes of people time. Didn’t python save you 19 minutes writing it?
Do the math. Figure out, based on what the total lifetime run time of this program will be, how much the total run time goes up from using python. (I figure that research calculations like this are not re-run many times.) Try to guess how much people time you will save writing and testing this program in python. Figure what the advantage is in time saved of staying with 1 high-productivity language as much as possible.
Al:
Python of course is slower on a per instruction level then C, period. If you’re not happy with that, use pyrex. If you’re not happy with that use swig/boost-python. If you’re still unhappy read along.
The benefit of python is not onyl implementation time alone, that’s wrong arguing.
The benefit of clear, concise, uncluttered and short code is time-savings on implementation and maintenance, less bugs and greater speed.
Any sufficently large problem is hard to express in C or Assembler alone. Thus projects implement half of a lisp-parser to express the domain problems. By that time they’ve bought in slowness of a parser and uglyness of a low-level language coupled with a half-bred language. Congrats to that.
What you have to realize is that a big factor for speed is algorithmic intelligence. And when your algorithms are expressed as clearly and shortly as possible, it is much easier to archive this goal.
You’re calling random.seed() every time you instantiate a drunkard. Unless you’re running python 2.4 and up, and know that your OS’s source of random numbers is good, you should only call seed() once. Mac OS X’s source is bad (probably the system time), leading to very un-random results. This may have something to do with your scatter mismatch.
Another simple idea that results in a big speed-up is to simply get rid of the RandomWalk class. In the code as given, the RandomWalk class does nothing more than inefficiently maintain a single variable as a list. Replacing the RandomWalk object with an integer variable and in-lining the walk and arrested method result in a run-time of about 1.5s, compared to 15s for the original program. Including psyco gets it to under a second.
As mentioned before, refactoring the code a bit to make it work with psyco will speed it up quite a bit. Move the psyco import into the body of the module, rather than in the if __name__ conditional, and move the main loop into its own function. This should give a 300% improvement in speed.
Also, move some of the array initilizations into list comprehensions, replace conditionals with inline funtions where possible, avoid randint() (use random() instead), and dereference array variables once when used multiple times in a loop. Those changes will give you another 30% or so improvement.
How fast was the BASIC version?
None of the original code (Python) seems available when I click on the links. Internal server error.
Thanks Kirby, this server executed Python unlike my old hosting provider. The files are now accessible again.
If math speed is your primary concern, a dynamic high-level script language like Python is not a good choice. You can optimize until your blue in the face. Why? Make a Python C module out of the critical code and let Python do the rest.
People keep claiming that Python is slow, but more productive than other languages. This is not true; I mean, it is true that Python is slow, but not true that it is more productive than other languages. For instance, Clean is very productive (if you use things like list comprehension, and array comprehension). The same can be said about OCAML. Notwithstanding, these languages are very fast, almost as fast as C, and faster in a few cases. They are faster than Fortran, for instance. Clean is very good to detect errors too. Errors are detected during compilation, which is much better. Besides this, the claim that Python programs are concise is greatly exagerated; for instance, the program mandelbrot, in the Computer Language Benchmarks Game, takes 32 lines in Python, and 31 lines in Clean; therefore, the programs have roughly the same size. But the program in Clean is 65 times faster. The recursive program has also the same size in both languages, but it is 177 times faster in Clean. Python beats Clean only in the amount of memory used. In fact , Python uses much more memory than Clean; however, there are experts that claim that languages should use little memory; if they are right, then Python will be defeated in all items.
i am not really a programmer, but i think python is really slow. zope/pone is written in python. i think Google Apps is written in python and Google Apps is not fast.
facebook and mediawiki are written in PHP and they are fast!
ASP is sloooow
Phython is slooow
Ruby is sloow
Java is slow
PHP is okay
C/C++ is superfast!
Of course Python is slow, and it will remain substantially slower than compiled languages, no matter how you tweak the code. Also notice how all the above suggestions detract from Python’s purported strength - productivity. Instead of focusing on your algorithm, they urge you to try all kinds of tweaks and focus on implementation details.
Numeric Python is great for its scope but not so much for its speed.
Hi Mike
Of course, Python, being an interpreted language, will always be slower than compiled languages. Everybody knows that.
The PROBLEM is Python is slower than PHP.
Just go to any Python website and you will know.
An example is:
http://www2.ljworld.com/
And this site is created by the creators of Django!
And it is not just this Python site that is slow. There are many many Python sites which are very slow. And please don’t say that it could be the web hosting or the server which is slow — because when so many Python sites are slower than PHP sites, it couldn’t be the web hosting.
Also, Zope/Plone is even slower.
Python is slow. Very slow.
Sure, Python is definitely a more advanced and powerful language than PHP. But what’s the point if it is slow. And contrary to what Python fans like to say, Python is not an easy language to program in. And it is definitely not easier to write Python web applications as compared to PHP.
Python sucks. Python is slow.
@Ronpaulite, this isn’t the forum for abuse, this is for arguments. Saying “Python sucks. Python is slow” isn’t useful.
Save it for the political forums. In this case providing evidence is how you have an argument.
For web development there is plenty of evidence that Django/Python is very fast. Now, if you’d said that Django/Python ate up alot of server memory, you’d be on firmer ground since mod_python is notorious for consuming memory (mod_wsgi and particularly nginx/fast-cgi are much improved)
The good about Python is that it is one of the current sweet spots between simplicity and conciseness, being a mainstream language, and having a large library (Batteries included). A lot of functional languages may be better, but they aren’t mainstream yet.
But it is slow even for an interpreted language; also, the real Good Thing(tm) about a good JIT compiler is that manually inlining function becomes totally useless. Said another way, Python sucks for fast code because it forces you to trade maintainability for performance.
@ron_paulite, for your one-but-last post:
> i think Google Apps is written in python and Google Apps is not fast.
I bet they use their own toolkit, GWT, which generates server-side Java and client-side Javascript. And the client-side is what was slow until the new buzz about Google Chrome/V8, TraceMonkey and so on.
> ASP is sloooow
Yep, it doesn’t even have bytecode, it interprets the source.
> PHP is okay
> Java is slow
That’s complete bullshit. PHP is interpreted, Java can get close to C speed. There are benchmarks arguing it’s actually faster, and while they were refuted, Java is still quite optimized.
The only problem of Java is horribly slow startup, bigger memory impact, and the fact that the main graphic library, Swing, is written in pure Java.
If you think to EJB, also, they’re simply quite more bloated than anything you can do in PHP. You’d need to compare PHP to JSP for speed - I’ve not seen benchmarks on this, but I still expect PHP to be slower.
I do not find Python easy to learn. And I find it slow. You guess you have an easy language to do the work; and you find it is slow and you must add a lot of things (numpy, scipy…) if you want a minimum of velocity. Slow and complicated. Python is a lie.
For scientific work, Fortran or C must be used.
Sorry for my English.
Bye.
@Blaisorblade
Ok, i got a little carried away. yes, php isn’t fast. but it is a very simple language to earn — and some complain a very messy language.
python isn’t easy to pick up. yes, it is very powerful. i spent several days reading and learning it. yes, python is definitley more powerful and elegant than PHP, with its OOP, lists, lambda function, etc., etc.. very exciting and promising! but when i tried out the sample scripts (web apps), i felt ripped off - what? all the elegance and power, and you get this kind of speed? is python meant to be a joke?!
Leave a Comment