{ Arrays Revisited. }

Objectives

By the end of this chapter, you should be able to:

  • Describe how an array is stored in memory
  • Analyze the runtime of array operations such as push, pop, find, remove, and element access

Arrays: Under the Hood

An array is a popular data structure that is used frequently and incorporated into many programming languages. But how is data in an array stored on your machine? What is an array good at, and what is it not so good at? Before learning about linked lists, let's first explore arrays in more detail to help better understand how they compare to other data structures.

How Are Arrays Stored in Memory

Every piece of data in your program needs to be stored somewhere so that your program can access that data. The location of each piece of data in your machine is referred to as the data's memory address. You can think of memory addresses as bins to put our values in. Each slot in memory has a memory address, and each slot can hold data.

One of the most important defining characteristics about arrays is that they are stored in memory contiguously. This means that adjacent elements in the array have adjacent memory addresses.

Let's look at a simple example. Suppose you have an array like the following:

var arr =  [5,12,2,30,8];

We can think of memory addresses as defined by some numerical id. The fact that the array is stored contiguously means that it will be stored in memory like this (the top row corresponds to memory addresses):

3506 3507 3508 3509 3510 3511 3512 3513 3514 3515 3516
.... 5 12 2 30 8 ....

You can consider memory access to be constant time in your program, so every time you look up an variable, the process is constant time. But what about an array? An array saves the memory address of where the array starts. In other words, the program knows where index O of the array is stored in memory. And since arrays are contiguous, accessing an element by an index is also constant time because looking up a value at an index is just doing some simple math with memory addresses.

For example, suppose you want to access the element at index 3 in the array. In this case, your computer knows that the element at index 0 is stored at memory address 3507; therefore, the element you're looking for must be at address 3507 + 3 = 3510.

No matter what element you're trying to access, your computer can always start at the beginning of the array, shift the memory address by some amount based on the index, and then access the element you requested. These are all constant-time operations.

TAKE AWAY: All array access is constant time, or O(1).

For example, arr[0] and arr[4] are both constant time operations that are very similar to simple variable lookups.

How Do Arrays Grow?

Let's look at our array example again:

var arr =  [5,12,2,30,8];

Storage in memory:

3506 3507 3508 3509 3510 3511 3512 3513 3514 3515 3516
.... 5 12 2 30 8 ....

When memory is allocated to store an array, it is often stored on the heap. However, there are many other variables that are also being placed on the heap. Numbers, objects, strings, and other data types may all be on the heap as well. All of these different types of data will be stored in memory alongside the array we care about. So when the array is created, your JavaScript interpreter must decide how large to make the array's contiguous space in memory. Let's say in this example, the interpreter chose to allocate 9 memory slots. This means that as long as our array has fewer that nine elements in it, it can live in memory starting at address 3507.

Now, let's say the program does the following:

arr.push(3); // 6
arr.push(13); // 7
arr.push(14); // 8
arr.push(50); // 9
arr.push(35); // 10 -- longer than the allocated space in memory!

Everything is fine, until the line arr.push(35). At this point the JavaScript interpreter has run out of allocated heap space. So in order to complete the push, the interpreter must allocate more contiguous space (more than 9 slots), copy over every element of the array to the new memory addresses, and then put the new element (35 in this case) into the array.

So even though array access is constant time, the push operation can be O(n) if we run out of available space at the existing starting point in memory. Sometimes this operation is referred to as being amortized O(1). This is because JavaScript is setting aside a certain amount of space when the array is created, so that up to a point the push operation will be constant. But if you exhaust that available space in memory, the entire array will need to be copied somewhere else.

Big O Runtime of Common Operations

Now that we know how an array is stored, we can fully analyze the runtime of an array:

Operation Runtime
Accessing an element with an index O(1)
push / unshift O(n)
pop O(1)
find O(n)
remove (using splice) O(n)

Find is O(n) because in the worst case all elements in the array must be iterated over to find the element you are looking for.

Remove is also O(n) because an array must remain contiguous, so if an element is removed, all elements after it must be shifted down 1, which is a O(n) operation.

When you're ready, move on to Singly Linked Lists

Continue

Creative Commons License