Stop making stupid benchmarks

By Caleb Gardner

Written on: 2024-11-26

Updated on: 2025-01-12

Edit: Between writing this and fixing my markdown converter, ThePrimeagen and Casey Muratori did a fantastic breakdown of this particular benchmark and how it's not actually benchmarking for loops; it's benchmarking the modulo operation. You can find the video here. If you don't want to watch the nearly hour-and-a-half video, just know that Casey Muratori was able to improve the performance of the C code by 3.5x fairly easily.

Edit 2: The same person made another micro benchmark on the Levenshtein distance, and it had a HUGE issue that caused Fortran to be significantly faster than it should have been. See Casey Muratori show why, once again, these benchmarks are garbage. Yes, It's Really Just That Bad.

I've recently gotten into the bad habit of looking at software dev twitter (no, I'm never calling it X) and have been constantly annoyed at the number of artificial benchmarks people share. The latest one to draw my ire (and spawn this post) is a bad "benchmark" that's basically just 1 BILLION iterations of a for loop.

More languages, more insights!

A few interesting takeaways:

* Java and Kotlin are quick! Possible explanation: Google is heavily invested in performance here.
* Js is really fast as far as interpreted / jit languages go.
* Python is quite slow without things like PyPy. pic.twitter.com/GIshus2UXO
— Ben Dicken (@BenjDicken) November 25, 2024

Now, I cannot speak to most of the languages shown, but I have significant experience in Go and have spent a not insignificant amount of time optimizing Go code (in particular my squashfs library). The second I opened up the code for this "benchmark" I knew that whoever had written it had never tried to write optimized Go code. First, let's start with the results without any changes. For simplicity I'll only show the results of C and Go.

C = 1.29s
C = 1.29s
C = 1.29s

Go = 1.51s
Go = 1.51s
Go = 1.51s

This is fairly expected: it's in line with the post and with what is logical. Go's structure is fairly low level and similar to C, but it is garbage collected, meaning it will be slower in real-world applications. Now let's look at the results of my optimized code:

C = 1.29s
C = 1.29s
C = 1.29s

Go = 1.29s
Go = 1.30s
Go = 1.29s

Suddenly, C's lead is gone! What black magic is this??? Well, if you actually look at the original code and you know Go, you'll probably notice it immediately: the "benchmark" is using int. That's right, my optimizations boiled down to changing every int to int32. I'm honestly a bit surprised it basically ties C, but I suspect that, since this isn't a real-world benchmark, the garbage collector never actually has to do anything, meaning Go's primary disadvantage is non-existent.
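To make the change concrete, here is a minimal sketch of that kind of edit. It is not the original benchmark code: the names loopInt and loopInt32 and the loop sizes are made up for illustration, and the modulo in the loop body is there only because that is what the benchmark turned out to be measuring. On 64-bit platforms Go's int is 64 bits wide, so the second function does the same work on 32-bit values.

```go
package main

import "fmt"

// loopInt is a hypothetical stand-in for the benchmark's hot loop,
// written with Go's default int (64 bits wide on amd64 and arm64).
func loopInt(outer, inner, u int) int {
	var sum int
	for i := 0; i < outer; i++ {
		for j := 0; j < inner; j++ {
			sum += j % u // the modulo is the expensive part of the body
		}
	}
	return sum
}

// loopInt32 is the same loop with every int replaced by int32.
// Nothing else changes.
func loopInt32(outer, inner, u int32) int32 {
	var sum int32
	for i := int32(0); i < outer; i++ {
		for j := int32(0); j < inner; j++ {
			sum += j % u
		}
	}
	return sum
}

func main() {
	// Small sizes so the result fits comfortably in an int32.
	fmt.Println(loopInt(1000, 1000, 7), loopInt32(1000, 1000, 7))
}
```

Both functions return the same value for inputs this small. The int32 version only stays correct while the running sum fits in 32 bits, which is an assumption you would need to check before making the same swap in real code.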

My gaps in knowledge

Let me be clear: I am no expert. I do not actually know why int32 is faster than int, I just know it is (I have theories, but that's all they are). Though I know many of the other languages, I haven't ever done any research on how to optimize them. It's possible all the other languages are perfectly optimized, but the fact that such a simple optimization was overlooked invalidates the entire test in my mind.

The Point

Let me be clear, benchmarks are important and useful, but the most useful benchmarks I've seen are between code of the same language as it removes a lot of the compiler magic and skill issues. Funnily enough, my benchmark between the code using int and int32 is a useful benchmark. The problem arises when you try to benchmark between fundamentally different languages (or even frameworks), but do not give them all the same amount of time and attention. As an example, if I were to write the C code for this test we'd probably see Go with a lead, not because Go is faster, but because I know how to write optimized Go.
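A same-language comparison like that can be sketched with Go's standard testing package. This is an illustration under assumptions, not the code behind the timings above: sumInt, sumInt32, the divisor variable, and the loop size are all hypothetical. The divisor lives in a package-level variable so the compiler cannot treat it as a constant and replace the modulo with cheaper arithmetic.

```go
package main

import (
	"fmt"
	"testing"
)

// divisor is a package-level variable (not a constant) so the compiler
// cannot fold the modulo away. sink keeps the results alive.
var (
	divisor = 7
	sink    int64
)

// sumInt adds up j % u using the default int type.
func sumInt(n, u int) int {
	var sum int
	for j := 0; j < n; j++ {
		sum += j % u
	}
	return sum
}

// sumInt32 is the same loop using int32 throughout.
func sumInt32(n, u int32) int32 {
	var sum int32
	for j := int32(0); j < n; j++ {
		sum += j % u
	}
	return sum
}

func main() {
	const n = 1_000_000 // hypothetical size; the sum still fits in an int32

	// testing.Benchmark picks b.N itself and reports the time per call.
	rInt := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			sink += int64(sumInt(n, divisor))
		}
	})
	rInt32 := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			sink += int64(sumInt32(n, int32(divisor)))
		}
	})

	fmt.Printf("int:   %d ns/op\n", rInt.NsPerOp())
	fmt.Printf("int32: %d ns/op\n", rInt32.NsPerOp())
}
```

The numbers will vary by CPU and Go version, so no output is shown here. Because both functions are compiled by the same compiler with the same flags, any difference comes down to the one thing that changed.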

The real world is messy, and between DB calls, API requests, and IO, the actual performance gains or losses from any particular language become a lot more complex and will largely depend on your needs. The vast majority of the time, optimizing your existing code is a far better use of time than rewriting it in a different language. The only times I'd actually recommend switching languages are when you've already optimized and are still running into performance constraints, or when you want to learn. Let me be clear: Ben Dicken is a better engineer than me, but that doesn't mean he can't be misled and mislead others.

About the author:

Caleb Gardner

I love anything to do with computers, from building them to programming them. It's been a passion since I was a child. My first foray into programming was on my Casio fx-9750GII graphing calculator in 5th grade after reading the user manual. Somehow, it would take me years to realize that I was programming.

Darkstorm.tech

A nerd doing nerd things