Performance of Numpy Array vs Python List

Cory Gough

Jul 22, 2019·4 min read

Most of us have been told numpy arrays have superior performance over python lists, but do you know why?

The following graph plots the time taken to generate two random arrays/lists and add them together element-wise. We can clearly see that the NumPy curve barely moves between 100 elements and 10,000,000 elements, while the pure Python curve grows quickly with list size. To begin understanding why, we must first take a quick look at how RAM works and how arrays are stored in RAM compared to lists.

More questions on how this graph was generated? Refer to the code.
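For readers who want to reproduce the comparison themselves, here is a minimal sketch of such a benchmark (not the author's original code) using `timeit` to time an element-wise add in NumPy versus a plain Python list comprehension:

```python
import timeit
import numpy as np

n = 1_000_000

# NumPy: the element-wise add of two arrays runs in a single C loop.
a = np.random.rand(n)
b = np.random.rand(n)
numpy_time = timeit.timeit(lambda: a + b, number=10)

# Pure Python: each addition goes through the interpreter and
# dereferences a pointer for every element of both lists.
xs = list(a)
ys = list(b)
list_time = timeit.timeit(
    lambda: [x + y for x, y in zip(xs, ys)], number=10
)

print(f"numpy: {numpy_time:.4f}s  list: {list_time:.4f}s")
```

On a typical machine the list version is well over an order of magnitude slower at this size; varying `n` and plotting the two times reproduces the shape of the graph above.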

RAM is our computer's working memory. All of your processes, from streaming YouTube to reading this article, reside in RAM. You could visualize RAM as a bookshelf in your house. Each book is a memory location that stores a value such as an int, a double, or a reference to another location in memory (aka a pointer), but more on this later. Each byte has an address that allows for quick access to any piece of data, hence the name Random Access Memory.

NumPy arrays store one defined type of data, and the number of elements is given up front. This is necessary because they are stored as one contiguous block of memory. It's like a set of encyclopedias, all in alphabetical order on your bookshelf. This image represents an array of 1-byte ints with length 5. Knowing the starting memory location of the array, we can simply add the index (times the element size) to immediately jump to any element.
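You can inspect this layout directly from NumPy. A small sketch, mirroring the 1-byte, length-5 array in the image:

```python
import numpy as np

# A fixed-dtype array: every element is exactly one byte,
# laid out back-to-back in a single contiguous block.
arr = np.array([3, 1, 4, 1, 5], dtype=np.int8)

print(arr.itemsize)               # bytes per element -> 1
print(arr.nbytes)                 # total bytes: 5 elements * 1 byte -> 5
print(arr.flags['C_CONTIGUOUS'])  # stored contiguously -> True
```

Because the block is contiguous, indexing element `i` is just address arithmetic: the start address plus `i * arr.itemsize`.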

Refer to the python documentation.

Python lists are not stored like this. You don't have to specify how many elements you will append to your list, nor do you have to define a data type. This is possible because Python lists are actually arrays of pointers. A pointer is a reference to an object in memory. When you retrieve the first element of your list, Python takes two steps: first, retrieve the pointer; second, go to the memory location the pointer refers to and finally get the object you want. This extra lookup in memory is a major contributor to the slower performance of lists.
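A quick sketch makes the pointer-array picture concrete. The list below holds three objects of different types, which only works because the list itself stores references, and `sys.getsizeof` on the list counts just the pointer slots, not the objects they point to:

```python
import sys

# Mixed types are fine: the list stores pointers to objects,
# not the objects themselves.
items = [42, "hello", 3.14]

# id() reports where each referenced object actually lives;
# the addresses are scattered, not contiguous.
print([id(obj) for obj in items])

# Size of the list structure itself (header + pointer slots),
# excluding the int, str, and float objects it references.
print(sys.getsizeof(items))
```

This is exactly the two-step lookup described above: follow the pointer slot in the list, then fetch the object from wherever it lives.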

Array being copied to CPU Cache.

It gets worse. Your CPU has a small amount of local memory known as a cache. To avoid making repetitive round trips to RAM, the CPU retrieves and stores blocks of contiguous memory in the cache. Imagine your nightstand is the cache and your bookshelf is RAM. When you go to get the first book in your set of encyclopedias, you decide to take them all back to your nightstand cache. After all, it is convenient because they are all next to each other. Now you can quickly access all of the books without having to get up! With lists, we do not get this benefit. The data is scattered all throughout your RAM and referenced by pointers. Each time we need a piece of data, we must get the pointer and then go look it up. It is as if our encyclopedias were scattered and we had to run around our mansion home's library looking for books!
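The cache effect shows up even in a simple sum. A rough sketch: summing a contiguous NumPy array streams whole cache lines through the CPU, while summing the equivalent Python list chases a pointer before every addition:

```python
import timeit
import numpy as np

n = 1_000_000
arr = np.arange(n, dtype=np.float64)
lst = arr.tolist()  # same values, stored as an array of pointers

# Contiguous block: sequential, cache-friendly access in C.
arr_time = timeit.timeit(lambda: arr.sum(), number=20)

# Pointer array: one dereference per element before each add.
lst_time = timeit.timeit(lambda: sum(lst), number=20)

print(f"array sum: {arr_time:.4f}s  list sum: {lst_time:.4f}s")
```

The exact ratio depends on your hardware, but the array sum is consistently the faster of the two.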

The topic of memory and how each data structure is stored can get very dense very quickly, but hopefully this high-level explanation gives you a mental image of what is happening under the Python hood. The main takeaway is understanding the pros and cons of arrays versus lists and when to use each. Simply put, use arrays for large sets of data that are all of the same type. If you find the topic interesting, try researching how this can be done even faster on a GPU!
