🔲 Javascript Array: performance tips
Use Javascript arrays in a better way
Javascript - Array - V8 - Engine - 2020-07-23Hey guys, this post is inspired by a V8's team blog post called "Elements kind in V8".
Writing Javascript code holds a bunch of advantages. One of them is the fact that you don't need to worry about types and allocations.
// create a new array
const array = []
// add a string in it
array.push("yo")
// add a number in it
array.push(3)
// add another array
array.push(["nest"])
// who is complaining? NOBODY!
What does it cost? You could tell me: "It doesn't matter! Javascript engine handles that for us". It's true, but roughly, Javascript is interpreted by a runtime engine (called V8). Behind this engine, you'll find another language (C++). So we can imagine that behind the scene the engine creates a data structure that holds the array’s data, allocate and free memory when it's necessary.
Arrays behind the scene
Disclaimer: all the following stuff’s are true for V8 engine.
Just from your browser's console, you can figure out that Arrays are nothing else than Objects.
console.log(typeof [])
// -> print "object"
Objects
can be seen as properties that map to values. From an Array
point of view, those properties are just integers. V8 optimizes objects that got integer as properties name. They are not stored as other properties do.
[...] V8 chooses to store them separately from non-numeric properties for optimization purposes. [...] > Mathias Bynens
Internally, V8 calls those simple numeric properties indices and the values, Elements.
There are two ideas about Arrays
that every developer should keep in mind writing javascript. The first is the kind of Elements in the Array.
Elements kinds
Regarding the type of Elements
you put in the Array
, it handles operations differently.
At the engine level you'll distincts different elements kinds:
- Integer (called small integer internally)
- Double (big integers, float,
Nan
,Infinity
) - Elements (all the rest that cannot be represented as integer or double)
(We mention three here, but there are many others)
You can switch for one kind to another at runtime:
// Note: all the enums here are V8 internal,
// you'll never deal with, in your coder journey
const myArray = [1]
myArray.push(2) // from PACKED_SMI_ELEMENTS enum
myArray.push(2.3) // to PACKED_DOUBLE_ELEMENTS enum
myArray.push("js") // to PACKED_ELEMENTS enum
But when you switch from PACKED_SMI_ELEMENTS
one to a PACKED_DOUBLE_ELEMENTS
one you can't backward, then you lose all optimizations related to your previous elements kind.
The second idea that a Javascript developer should keep in mind: the Javascript engine distinguishes PACKED
and HOLEY
Arrays.
HOLEY Arrays
In the previous code sample, I mention a few enums used in the V8 engine like PACKED_SMI_ELEMENTS
. The term "PACKED" describes a dense array, an array without holes.
How can you create holes in an array?
const newArray = [1, 2]
newArray[5] = 7 // now indices [2] to [4] are holes,
[...] V8 makes this distinction because operations on packed Arrays can be optimized more aggressively than operations on holey Arrays. For packed Arrays, most operations can be performed efficiently. [...]
From an engine perspective, by creating holes, you'll switch the Element's kind:
const newArray = [1, 2] // kind: PACKED_SMI_ELEMENTS
newArray[5] = 7 // kind: HOLEY_SMI_ELEMENTS
Each Elements
kind we've seen come in two versions: a "PACKED
" one and a "HOLEY
" one. As Elements
kind, switching from PACKED
to HOLEY
is downward only.
So when you're switching, you're losing optimizations from the previous kind (again).
Performance tips
I list here different performance tips regarding these two points:
1 - Avoid element kind transition
Try to stick to one kind as much as possible when you use Arrays. For example, you should know that -0
, Nan
& Infinity
are store as DOUBLE:
const array = [3, 2, 1, +0] // PACKED_SMI_ELEMENTS
array.push(-0) // PACKED_DOUBLE_ELEMENTS
const array2 = [1, 2, 3] // PACKED_SMI_ELEMENTS
array2.push(NaN, Infinity) // PACKED_DOUBLE_ELEMENTS
Solution: by normalizing -0
 and blocking NaN
 and Infinity
 when initializing the values you'll stick to the  PACKED_SMI_ELEMENTS
 kind.
2 - Avoid array-like object
Let's write a bit of code:
const arrayLike = {} // Initialize the array-like
arrayLike.length = 0 // set length
arrayLike[0] = "a"
arrayLike.length++
arrayLike[1] = "b"
arrayLike.length++
arrayLike[2] = "c"
arrayLike.length++
What we got here? We try to create an array manually. How? We create an object we initialize a length property then, we happen key values pairs within, where keys are integer and values are string. And it looks like an array.
Now let's call the forEach method from Array.prototype
to the arrayLike
:
Array.prototype.forEach.call(arrayLike, (value, index) => {
console.log(`Array - ${index}: ${value}`)
})
Solution: Use builtins Arrays. This code works! But it will be slower than the forEach called an a proper array.
3 - Avoid polymorphism
In the case, you don't use builtins array methods (if you really have to), be careful that your code makes operations on a single element kind.
// A function that loop on an Array and call a function on each item
const customForEach = (array, callback) => {
for (let index = 0; index < array.length; ++index) {
const item = array[index];
callback(item, index);
}
};
// a basic log function
const log = (item, index) => console.log(`${index} - ${item}`
customForEach(['a', 'b', 'c'], log)
What's happen here? V8 caches* that customForEach
function is called with PACKED_ELEMENTS
. If we call forEach function a second time, V8 will check the element kind first:
- If it's the same than what is stored in the cache: V8 will reuse some generated code from the previous call
- If it's not: V8 had to make extra work
_ V8 cache is called IC or Inline Cache_
customForEach([1.1, 2.2, 3.3], log)
// The function is called with `PACKED_DOUBLE_ELEMENTS`
// V8 add a new kind in the cache
// and it marks array.length & array[index] as polymorphic
For every
customForEach
call from now on, yet another elements kind check is needed to re-use the generated code for `PACKEDSMIELEMENTS`. This comes at a performance cost.
Solution:
- If you have to use polymorphism, try to stick to builtins Arrays, they are optimized for.
- Avoid polymorphism ^^
4 - Avoid holes
In the introduction, we saw that if an Array is marked as HOLEY
you can go back or switching to a PACKED
Array.
const newArray = new Array(4) // marked as HOLEY_SMI_ELEMENTS
// -> [empty × 4]
As this Array is marked as HOLEY
it cannot change for ever, even if we fill the array manually:
newArray.push(1)
newArray.push(2)
newArray.push(3)
newArray.push(4)
// -> [1, 2, 3, 4]
Behind the scene newArray
stay HOLEY_SMI_ELEMENTS
even there are no more holes within.
Solution:
const goodWay = [] // init empty array
goodWay.push(1) // push what you need :)
goodWay.push(2)
goodWay.push(3)
I keep this one as the last example as the original post mentioned that for "real-world coding patterns" this performance tips "is usually too small to matter or even be measurable".
Why should we care about the engine?
As Javascript is a memory managed programming language, we could use it without caring about this. The first time I realized that It could be nice to consider is when I read this post from Fodor Indutny which mentions the fact that we should "Avoid Polymorphism", or "Cache and Reuse".
In fact, if you're starting your learning path to be a developer, you could think that it's not for you. You'll probably think that learning a popular framework and use some popular node packages will be more efficient, and it's probably true.
My point is: with this kind of knowledge about Javascript you could get a strong foundation earlier. Then, coding on a daily basis with these tips will be easier when you start your career than after 4 years of coding for example.
Hope you find this post relevant. Feel free to hit me on twitter if you want to.