Program size matters. Kind of
If you’ve been programming for a while, chances are that you’ve come across the following question: does the length of variable names affect the code’s runtime performance in any way?
At first, that may seem like a silly question. After all, who in the world would choose ambiguous three-letter variable names instead of more descriptive and readable identifiers, except for old-school C developers? However, this is a legitimate question that opens up the possibility of learning some interesting concepts you may find useful in your software developer journey.
When source code gets compiled into machine code, the concept of a variable starts to lose its meaning. All that’s left of objects, classes, functions, and other high-level concepts are memory addresses, jump instructions, and simple operations. In particular, variables are translated into memory addresses and offsets.
This article will explore how the length of variable names can affect a program’s performance.
To test this, I wrote two scripts: short_name.js and long_name.js. The former uses short one-letter variable names while the latter uses 16-kilobyte-long names. If there’s any speed difference, this large gap should make it noticeable.
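As a minimal sketch of how such a pair could be set up, the snippet below builds the same function twice, once around a one-letter name and once around a 16-kilobyte name. The helper buildSum, the summing loop, and the repeated-letter name are assumptions made for illustration, not the exact code behind the numbers in this article.

```javascript
// Hypothetical helper: builds the same summing function around a chosen
// variable name, so the two variants differ only in identifier length.
function buildSum(name) {
  const source = `
    let ${name} = 0;
    for (let i = 0; i < n; i++) {
      ${name} += i;
    }
    return ${name};
  `;
  // new Function compiles the source text into a callable taking `n`.
  return { source, fn: new Function("n", source) };
}

const shortVariant = buildSum("l");
// 16 * 1024 characters, mirroring the 16-kilobyte names described above.
const longVariant = buildSum("l".repeat(16 * 1024));

// Both variants compute the same sum; only the source size differs.
console.log(shortVariant.fn(1000) === longVariant.fn(1000));
console.log(shortVariant.source.length, longVariant.source.length);
```

Generating the long name programmatically keeps the example small while still producing a source text that is tens of kilobytes long.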
For the benchmark, I used the benchmark library for its high-resolution timers and statistically significant results. Note that here I replaced the 16-KB names with shorter ones so that you don’t have to download uselessly large files.
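The measurement idea can also be approximated with Node’s built-in timer alone. This is a rough stand-in, not the benchmark library’s API: opsPerSecond, sumShort, and sumLong are hypothetical names, and the sketch does none of the statistical analysis the library performs.

```javascript
// Hypothetical stand-in for a benchmark run: calls `fn` repeatedly for
// roughly `durationMs` milliseconds and reports calls per second.
function opsPerSecond(fn, durationMs = 200) {
  const budgetNs = BigInt(durationMs) * 1_000_000n;
  const start = process.hrtime.bigint();
  let calls = 0;
  let elapsedNs = 0n;
  do {
    fn();
    calls++;
    elapsedNs = process.hrtime.bigint() - start;
  } while (elapsedNs < budgetNs);
  return (calls * 1e9) / Number(elapsedNs);
}

// Two functions that differ only in the length of one variable name.
function sumShort(n) {
  let l = 0;
  for (let i = 0; i < n; i++) l += i;
  return l;
}

function sumLong(n) {
  let aDeliberatelyLongAndDescriptiveAccumulatorName = 0;
  for (let i = 0; i < n; i++) aDeliberatelyLongAndDescriptiveAccumulatorName += i;
  return aDeliberatelyLongAndDescriptiveAccumulatorName;
}

console.log("Short name:", Math.round(opsPerSecond(() => sumShort(1e5))), "ops/sec");
console.log("Long name:", Math.round(opsPerSecond(() => sumLong(1e5))), "ops/sec");
```

Single runs like this are noisy, which is one reason to prefer a library that samples many runs and reports a margin of error.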
And here are the benchmark results:
Short name x 523 ops/sec ±0.95% (89 runs sampled)
Long name x 521 ops/sec ±1.00% (88 runs sampled)
The fastest option is Short name,Long name
To generate the V8 bytecode, I used bytenode:
npx bytenode -c short_name.js
npx bytenode -c long_name.js
This generates two bytecode files: short_name.jsc and long_name.jsc. Now, let’s print the file sizes with du to see if there’s any difference. After all, the variable names should still be there.
du short_name.jsc long_name.jsc -h --apparent-size
840 short_name.jsc
As you can see, there’s a huge difference in size: the latter is around 1,000 times larger. We should find the variables if we inspect the bytecode with a hex editor (I used the VSCode built-in one).
And indeed, the variable names are still there. However, symbol names (variables, functions…) are stored only once in the bytecode and then referenced using offsets and addresses instead of their ASCII names. As a result, long_name.jsc is about a megabyte in size: the length of the long l variable. Also, the
i in the bytecode. I guess that’s an optimization.
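The hex-editor check can also be approximated in code. The sketch below counts how many times a name occurs in a buffer; countOccurrences is a hypothetical helper, and the sample bytes are made up for the demonstration rather than taken from real bytenode output. It also assumes the name is stored as plain ASCII, as described above.

```javascript
// Hypothetical helper: counts non-overlapping occurrences of `name`
// (encoded as ASCII bytes) inside `buffer`.
function countOccurrences(buffer, name) {
  const needle = Buffer.from(name, "ascii");
  if (needle.length === 0) return 0; // an empty name can never be "found"
  let count = 0;
  let position = buffer.indexOf(needle);
  while (position !== -1) {
    count++;
    position = buffer.indexOf(needle, position + needle.length);
  }
  return count;
}

// Made-up bytes standing in for a compiled file: the name appears once,
// surrounded by unrelated binary data.
const sample = Buffer.concat([
  Buffer.from([0x00, 0x01, 0xc0, 0xde]),
  Buffer.from("veryLongVariableName", "ascii"),
  Buffer.from([0xff, 0x00, 0x10]),
]);

console.log(countOccurrences(sample, "veryLongVariableName")); // 1

// For a real file, the buffer would come from disk instead, for example:
// const bytes = require("fs").readFileSync("long_name.jsc");
// console.log(bytes.length, countOccurrences(bytes, someVariableName));
```

A count of one for a name that the source uses several times would be consistent with the name being stored once and referenced by offset everywhere else.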
If we call the function at the end of short_name.js and long_name.js and perform the speed test using the time utility, we should see a difference:
time node long_name.js
real 0m0,132s
sys 0m0,021s
time node short_name.js
real 0m0,058s
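And indeed, there’s a difference: in this run, long_name.js took about 0.13 seconds of wall-clock time against about 0.06 seconds for short_name.js.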
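One way to look at the load-time cost in isolation is to time only the compilation step. The sketch below uses Node’s built-in vm.Script to compile a source string without running it; compileTimeMs, makeSource, and the generated sources are assumptions for illustration, and the absolute numbers will vary by machine and Node version.

```javascript
const vm = require("vm");

// Hypothetical helper: returns the milliseconds spent compiling `source`.
function compileTimeMs(source) {
  const start = process.hrtime.bigint();
  new vm.Script(source); // parses and compiles, but does not run the code
  return Number(process.hrtime.bigint() - start) / 1e6;
}

// Same program text, differing only in the length of the variable name.
function makeSource(name) {
  return `let ${name} = 0; for (let i = 0; i < 10; i++) { ${name} += i; }`;
}

const shortSource = makeSource("l");
const longSource = makeSource("l".repeat(16 * 1024)); // 16-KB name, used three times

console.log("short:", compileTimeMs(shortSource).toFixed(3), "ms");
console.log("long: ", compileTimeMs(longSource).toFixed(3), "ms");
```

Repeating each measurement several times and raising the name length makes any gap easier to see, since a single compilation of a small script is dominated by noise.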
As for the previous benchmark, as I mentioned, there’s no performance difference because once loaded, the symbols are accessed via offsets and addresses like array elements, so there’s no reference to their names.
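The idea of resolving names to offsets once, ahead of execution, can be illustrated with a toy example. This is not how V8 lays out its bytecode; compile, run, and the tiny instruction format are invented here purely to show why name length stops mattering once the name-to-slot mapping has been built.

```javascript
// Toy "compiler": replaces each variable name with a numeric slot index.
// The name-to-slot lookup happens once, here, and never again.
function compile(instructions) {
  const slots = new Map(); // name -> slot index
  const slotOf = (name) => {
    if (!slots.has(name)) slots.set(name, slots.size);
    return slots.get(name);
  };
  const code = instructions.map(([op, name, value]) => [op, slotOf(name), value]);
  return { code, slotCount: slots.size };
}

// Toy "interpreter": works only with slot indices, like array elements.
function run({ code, slotCount }) {
  const memory = new Array(slotCount).fill(0);
  for (const [op, slot, value] of code) {
    if (op === "set") memory[slot] = value;
    else if (op === "add") memory[slot] += value;
  }
  return memory;
}

const longName = "x".repeat(16 * 1024);
const withShort = compile([["set", "l", 1], ["add", "l", 41]]);
const withLong = compile([["set", longName, 1], ["add", longName, 41]]);

// The compiled code is identical: only slot numbers remain.
console.log(JSON.stringify(withShort.code) === JSON.stringify(withLong.code)); // true
console.log(run(withLong)); // [ 42 ]
```

In this toy model the long name costs something only inside compile; run never sees it, which mirrors the benchmark result above.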
To wrap up, we’ve seen how program size affects your code’s performance: a larger file takes longer to load. However, the examples in this article are extreme cases you probably won’t ever encounter during your career. Also, once the program is loaded, there won’t be any performance difference because of the way symbols are handled in the bytecode.
Moreover, you should never trade source code readability for smaller file sizes. An appropriately descriptive variable name is always better than a three-consonant word like the ones old-school C developers used back when they worked with teletypewriters and paper (and some still do).
If you’re particularly keen on keeping the file size as small as possible, for example, for faster transfer over the internet, you can always take advantage of minification. This is the process of removing all unnecessary characters from the source code without changing its functionality, which lowers bandwidth usage and decreases a website’s loading time.
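To make the idea concrete, here is a deliberately naive sketch. The function toyMinify is hypothetical and only strips line comments and collapses whitespace; it assumes every statement ends with a semicolon and that no string or regular expression literal contains "//" or significant whitespace. Real minifiers parse the code first and can also shorten identifiers safely, so treat this as an illustration of the principle only.

```javascript
// Toy minifier: strips // line comments and collapses whitespace.
// Not safe for general code; see the assumptions stated above.
function toyMinify(source) {
  return source
    .replace(/\/\/.*$/gm, "") // drop everything from // to the end of the line
    .replace(/\s+/g, " ")     // collapse runs of spaces and newlines
    .trim();
}

const original = `
// Adds the integers below n.
function sum(n) {
  let total = 0; // running total
  for (let i = 0; i < n; i++) {
    total += i;
  }
  return total;
}
`;

const minified = toyMinify(original);
console.log(original.length, "->", minified.length, "characters");

// The behavior is unchanged: both versions define the same function.
const sumFromMinified = new Function(`${minified}; return sum;`)();
console.log(sumFromMinified(10)); // 45
```

The readable source stays in the repository, and only the minified output is shipped, so readability and file size do not have to compete.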
I hope you enjoyed this article. If you have anything to add, please share your thoughts in a comment. Thanks for reading!