tonygo_
NUIT

🔲 Javascript Array: performance tips

Use Javascript arrays in a better way

Javascript - Array - V8 - Engine - 2020-07-23

Hey guys, this post is inspired by a V8's team blog post called "Elements kind in V8".

Writing Javascript code holds a bunch of advantages. One of them is the fact that you don't need to worry about types and allocations.

// create a new array
const array = []

// add a string in it
array.push("yo")

// add a number in it
array.push(3)

// add another array
array.push(["nest"])

// who is complaining? NOBODY!

What does it cost? You could tell me: "It doesn't matter! Javascript engine handles that for us". It's true, but roughly, Javascript is interpreted by a runtime engine (called V8). Behind this engine, you'll find another language (C++). So we can imagine that behind the scene the engine creates a data structure that holds the array’s data, allocate and free memory when it's necessary.

Arrays behind the scene

Disclaimer: all the following stuff’s are true for V8 engine.

Just from your browser's console, you can figure out that Arrays are nothing else than Objects.

console.log(typeof [])
// -> print "object"

Objects can be seen as properties that map to values. From an Array point of view, those properties are just integers. V8 optimizes objects that got integer as properties name. They are not stored as other properties do.

[...] V8 chooses to store them separately from non-numeric properties for optimization purposes. [...] > Mathias Bynens

Internally, V8 calls those simple numeric properties indices and the values, Elements.

There are two ideas about Arrays that every developer should keep in mind writing javascript. The first is the kind of Elements in the Array.

Elements kinds

Regarding the type of Elements you put in the Array, it handles operations differently.

At the engine level you'll distincts different elements kinds:

  • Integer (called small integer internally)
  • Double (big integers, float, Nan, Infinity)
  • Elements (all the rest that cannot be represented as integer or double)

(We mention three here, but there are many others)

You can switch for one kind to another at runtime:

// Note: all the enums here are V8 internal,
// you'll never deal with, in your coder journey
const myArray = [1]

myArray.push(2) // from PACKED_SMI_ELEMENTS enum

myArray.push(2.3) // to PACKED_DOUBLE_ELEMENTS enum

myArray.push("js") // to PACKED_ELEMENTS enum

But when you switch from PACKED_SMI_ELEMENTS one to a PACKED_DOUBLE_ELEMENTS one you can't backward, then you lose all optimizations related to your previous elements kind.

The second idea that a Javascript developer should keep in mind: the Javascript engine distinguishes PACKED and HOLEY Arrays.

HOLEY Arrays

In the previous code sample, I mention a few enums used in the V8 engine like PACKED_SMI_ELEMENTS. The term "PACKED" describes a dense array, an array without holes.

How can you create holes in an array?

const newArray = [1, 2]

newArray[5] = 7 // now indices [2] to [4] are holes,

[...] V8 makes this distinction because operations on packed Arrays can be optimized more aggressively than operations on holey Arrays. For packed Arrays, most operations can be performed efficiently. [...]

From an engine perspective, by creating holes, you'll switch the Element's kind:

const newArray = [1, 2] // kind: PACKED_SMI_ELEMENTS

newArray[5] = 7 // kind: HOLEY_SMI_ELEMENTS

Each Elements kind we've seen come in two versions: a "PACKED" one and a "HOLEY" one. As Elements kind, switching from PACKED to HOLEY is downward only.

V8 lattice (Credits: v8.dev)

So when you're switching, you're losing optimizations from the previous kind (again).

Performance tips

I list here different performance tips regarding these two points:

1 - Avoid element kind transition

Try to stick to one kind as much as possible when you use Arrays. For example, you should know that -0, Nan & Infinity are store as DOUBLE:

const array = [3, 2, 1, +0] // PACKED_SMI_ELEMENTS

array.push(-0) // PACKED_DOUBLE_ELEMENTS

const array2 = [1, 2, 3] // PACKED_SMI_ELEMENTS

array2.push(NaN, Infinity) // PACKED_DOUBLE_ELEMENTS

Solution: by normalizing -0 and blocking NaN and Infinity when initializing the values you'll stick to the  PACKED_SMI_ELEMENTS kind.

2 - Avoid array-like object

Let's write a bit of code:

const arrayLike = {} // Initialize the array-like
arrayLike.length = 0 // set length

arrayLike[0] = "a"
arrayLike.length++

arrayLike[1] = "b"
arrayLike.length++

arrayLike[2] = "c"
arrayLike.length++

What we got here? We try to create an array manually. How? We create an object we initialize a length property then, we happen key values pairs within, where keys are integer and values are string. And it looks like an array.

Now let's call the forEach method from Array.prototype to the arrayLike:

Array.prototype.forEach.call(arrayLike, (value, index) => {
  console.log(`Array - ${index}: ${value}`)
})

Solution: Use builtins Arrays. This code works! But it will be slower than the forEach called an a proper array.

3 - Avoid polymorphism

In the case, you don't use builtins array methods (if you really have to), be careful that your code makes operations on a single element kind.

// A function that loop on an Array and call a function on each item
const customForEach = (array, callback) => {
  for (let index = 0; index < array.length; ++index) {
    const item = array[index];
    callback(item, index);
  }
};

// a basic log function
const log = (item, index) => console.log(`${index} - ${item}`

customForEach(['a', 'b', 'c'], log)

What's happen here? V8 caches* that customForEach function is called with PACKED_ELEMENTS. If we call forEach function a second time, V8 will check the element kind first:

  • If it's the same than what is stored in the cache: V8 will reuse some generated code from the previous call
  • If it's not: V8 had to make extra work

_ V8 cache is called IC or Inline Cache_

customForEach([1.1, 2.2, 3.3], log)
// The function is called with `PACKED_DOUBLE_ELEMENTS`
// V8 add a new kind in the cache
// and it marks array.length & array[index] as polymorphic

For every customForEach call from now on, yet another elements kind check is needed to re-use the generated code for `PACKEDSMIELEMENTS`. This comes at a performance cost.

Solution:

  • If you have to use polymorphism, try to stick to builtins Arrays, they are optimized for.
  • Avoid polymorphism ^^

4 - Avoid holes

In the introduction, we saw that if an Array is marked as HOLEY you can go back or switching to a PACKED Array.

const newArray = new Array(4) // marked as HOLEY_SMI_ELEMENTS
// -> [empty × 4]

As this Array is marked as HOLEY it cannot change for ever, even if we fill the array manually:

newArray.push(1)
newArray.push(2)
newArray.push(3)
newArray.push(4)
// -> [1, 2, 3, 4]

Behind the scene newArray stay HOLEY_SMI_ELEMENTS even there are no more holes within.

Solution:

const goodWay = [] // init empty array

goodWay.push(1) // push what you need :)
goodWay.push(2)
goodWay.push(3)

I keep this one as the last example as the original post mentioned that for "real-world coding patterns" this performance tips "is usually too small to matter or even be measurable".

Why should we care about the engine?

As Javascript is a memory managed programming language, we could use it without caring about this. The first time I realized that It could be nice to consider is when I read this post from Fodor Indutny which mentions the fact that we should "Avoid Polymorphism", or "Cache and Reuse".

In fact, if you're starting your learning path to be a developer, you could think that it's not for you. You'll probably think that learning a popular framework and use some popular node packages will be more efficient, and it's probably true.

My point is: with this kind of knowledge about Javascript you could get a strong foundation earlier. Then, coding on a daily basis with these tips will be easier when you start your career than after 4 years of coding for example.

Hope you find this post relevant. Feel free to hit me on twitter if you want to.

Back to articles
tony-gorez

Hi 👋 ! My name is Tony, Im a software developer. As software engineering is about learning again and again, I decided to put all the stuff that I'll learn here !