Erlang Fractal Benchmark
While looking at a simple fractal benchmark that showed up on the
programming Reddit, I noticed that there wasn’t an Erlang version. Below
is one I wrote last night. Erlang fares rather well. One thing mildly
surprised me: it runs slightly faster in an Erlang shell within Emacs than
in both Apple’s Terminal and iTerm on Mac OS X. Within Emacs it runs in
1.09000 (runtime) 1.14100 (wall clock) seconds. In both Terminal and iTerm
it runs in around 1.11000 (runtime) 1.16600 (wall clock) seconds. Perhaps
screen I/O isn’t as fast in the terminal programs.

Two caveats: first, these
numbers were generated on my 2.33 GHz Intel MacBook Pro; I don’t know what
the original benchmarks used. Also, I only ran the code a handful of times
and picked a “typical” time to report. A better test would have been to run
the code hundreds or thousands of times and average the values.

This post
also says a bit about intuition vs. measuring. I discuss some code
modifications and their expected and actual effects below.

Another thing to
note: the author of the fractal benchmark page says that he hasn’t bothered
to optimize the code for each language he tested. I don’t know if using
lists:map/2 or extracting iter_value/5 and using guard clauses would
disqualify this version in his opinion.
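
The second caveat above (averaging many runs instead of picking a “typical” one) could be handled with a small harness along these lines. This is only a sketch: the module name bench_avg and the function average/2 are made up for illustration, and it assumes the “runtime” and “wall clock” figures come from statistics(runtime) and statistics(wall_clock), which the post does not actually confirm.

```erlang
-module(bench_avg).
-export([average/2]).

%% Run the zero-argument fun Fun N times and return the mean
%% {Runtime, WallClock} per run, both in milliseconds.
%% (Hypothetical helper, not part of the original benchmark.)
average(Fun, N) when is_function(Fun, 0), is_integer(N), N > 0 ->
    {RunTotal, WallTotal} = loop(Fun, N, 0, 0),
    {RunTotal / N, WallTotal / N}.

loop(_Fun, 0, RunAcc, WallAcc) ->
    {RunAcc, WallAcc};
loop(Fun, N, RunAcc, WallAcc) ->
    %% Each statistics/1 call resets its "time since last call" counter.
    statistics(runtime),
    statistics(wall_clock),
    Fun(),
    {_, Run} = statistics(runtime),
    {_, Wall} = statistics(wall_clock),
    loop(Fun, N - 1, RunAcc + Run, WallAcc + Wall).
```

Something like bench_avg:average(fun() -> Module:Function() end, 1000), with the benchmark’s entry point substituted in, would then give averaged figures in milliseconds.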
You might have noticed that ZI * ZI and ZR * ZR are calculated twice:
once in the body of the last clause and once in the second guard clause. The
guard clause has to be executed every time, which means that running the
last, most frequently executed clause executes the multiplications twice. I
tried pre-calculating those values and adding them as parameters, which
turns iter_value/5 into iter_value/7.
Did it help? It did indeed. Execution time in Emacs went down to 0.890000
(runtime) 0.919000 (wall clock) seconds. Here are the modified versions of
iterate and iter_value:
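
As a rough illustration of the idea (a sketch, not necessarily the exact code from the post), passing the squares along as parameters might look like the following. The module name mandel_sketch, the argument order, the variable names, and the BAILOUT and MAX_ITERATIONS values are all assumptions made for the example.

```erlang
-module(mandel_sketch).
-export([iterate/2]).

%% Assumed constants; the post does not state them.
-define(BAILOUT, 16.0).
-define(MAX_ITERATIONS, 1000).

%% Returns 0 if the point never escapes, otherwise the iteration
%% count at which it escaped.
iterate(X, Y) ->
    CR = Y - 0.5,
    CI = X,
    iter_value(CR, CI, 0.0, 0.0, 0.0, 0.0, 0).

%% iter_value(CR, CI, ZR, ZI, ZR2, ZI2, I)
%% ZR2 and ZI2 carry ZR * ZR and ZI * ZI, so the guard in the second
%% clause and the body of the last clause share one multiplication each.
iter_value(_CR, _CI, _ZR, _ZI, _ZR2, _ZI2, I) when I > ?MAX_ITERATIONS ->
    0;
iter_value(_CR, _CI, _ZR, _ZI, ZR2, ZI2, I) when ZR2 + ZI2 > ?BAILOUT ->
    I;
iter_value(CR, CI, ZR, ZI, ZR2, ZI2, I) ->
    NewZR = ZR2 - ZI2 + CR,
    NewZI = 2.0 * ZR * ZI + CI,
    %% Square the new values once, here, and hand them to the next call.
    iter_value(CR, CI, NewZR, NewZI, NewZR * NewZR, NewZI * NewZI, I + 1).
```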
One final thing I tried was commenting out the calls to io:format. In my
experience, screen I/O usually slows things down quite a bit. (In the case
statement in plot/2, I had to replace them with void statements instead
of simply commenting them out.) The result: execution time went down to
0.830000 (runtime) 0.839000 (wall clock) seconds within Emacs. In iTerm,
execution time was only slightly slower than that. So the time decreased,
but not nearly as much as it did when I removed the extra multiplications.
I’m surprised. Is multiplication that expensive in Erlang, or is I/O well
optimized, or was my instinct wrong? Come to think of it, io:format is
only called once per coordinate; the multiplications happen thousands of
times for each.

A distributed version of this algorithm is certainly possible
(say, one process for every X,Y coordinate). My gut tells me that it would
run slower because the calculation for an individual coordinate is
relatively small and the message passing overhead and gathering and
coordination of the results would outweigh the benefits. My intuition was
wrong about the effects of I/O, though. I’d have to try it to make sure.
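
For anyone who wants to try the process-per-coordinate idea, a minimal parallel map is one way to start. This is a sketch under assumptions: pmap_sketch and pmap/2 are hypothetical names, it spawns local processes on one node instead of distributing across machines, and it makes no claim about whether the result would be faster.

```erlang
-module(pmap_sketch).
-export([pmap/2]).

%% Apply Fun to every element of List, one process per element,
%% and return the results in the original order.
pmap(Fun, List) ->
    Parent = self(),
    Refs = [begin
                Ref = make_ref(),
                %% Each worker sends its result back tagged with its own Ref.
                spawn_link(fun() -> Parent ! {Ref, Fun(Item)} end),
                Ref
            end || Item <- List],
    %% Ref is already bound here, so each receive waits for the
    %% matching worker; that keeps the results in order.
    [receive {Ref, Result} -> Result end || Ref <- Refs].
```

Calling pmap/2 with a fun that computes one coordinate, over a list of {X, Y} tuples, would give one process per coordinate; timing that against the sequential version is the measurement the paragraph above calls for.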
Additions: In the comments, Ulf Wiger suggested that I add is_float
guards for all the function parameters that are floats. Doing this to
iter_value/7 reduced execution time by almost half to 0.450000 (runtime)
0.455000 (wall clock) seconds, a huge savings. (Does anybody know why adding
these guard clauses speeds up execution? I would imagine that extra checking
would slow it down.) Here’s the new code for iter_value/7:
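
A guarded iter_value/7 might look roughly like this. It is a sketch, not necessarily the exact code from the post: the module name mandel_guarded, the argument order, and the constants are assumptions. A plausible explanation for the speedup (not verified here) is that the guards tell the compiler the operands are floats, so it can emit float-specific arithmetic instead of generic, type-checked arithmetic.

```erlang
-module(mandel_guarded).
-export([iterate/2]).

%% Assumed constants; the post does not state them.
-define(BAILOUT, 16.0).
-define(MAX_ITERATIONS, 1000).

iterate(X, Y) when is_float(X), is_float(Y) ->
    iter_value(Y - 0.5, X, 0.0, 0.0, 0.0, 0.0, 0).

%% iter_value(CR, CI, ZR, ZI, ZR2, ZI2, I) with is_float guards on
%% every float parameter that the clause actually uses.
iter_value(_CR, _CI, _ZR, _ZI, _ZR2, _ZI2, I) when I > ?MAX_ITERATIONS ->
    0;
iter_value(_CR, _CI, _ZR, _ZI, ZR2, ZI2, I)
  when is_float(ZR2), is_float(ZI2), ZR2 + ZI2 > ?BAILOUT ->
    I;
iter_value(CR, CI, ZR, ZI, ZR2, ZI2, I)
  when is_float(CR), is_float(CI), is_float(ZR), is_float(ZI),
       is_float(ZR2), is_float(ZI2) ->
    NewZR = ZR2 - ZI2 + CR,
    NewZI = 2.0 * ZR * ZI + CI,
    iter_value(CR, CI, NewZR, NewZI, NewZR * NewZR, NewZI * NewZI, I + 1).
```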
Ulf and others also suggested that I avoid printing each character separately. Doing so did not seem to change execution time at all. Here’s what I did:
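
One way to avoid per-character output is to build each row as a string and print it with a single call. The sketch below assumes a row arrives as a list of iteration counts, with 0 meaning the point never escaped and is drawn as an asterisk; row_output, render_row/1, and print_rows/1 are hypothetical names, and this is not necessarily the exact code from the post.

```erlang
-module(row_output).
-export([render_row/1, print_rows/1]).

%% Turn a list of iteration counts into one printable string.
render_row(Counts) ->
    [char_for(C) || C <- Counts].

%% 0 means "never escaped" in this sketch, drawn as an asterisk.
char_for(0) -> $*;
char_for(_) -> $\s.

%% One io:format/2 call per row instead of one per character.
print_rows(Rows) ->
    lists:foreach(fun(Counts) ->
                          io:format("~s~n", [render_row(Counts)])
                  end, Rows).
```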
I also tried commenting out the io:format/2 call above. Execution time
went up about 1/100 of a second.