Saturday, 18 September 2021

Chapter 20 // Exercise 20 - Principles & Practice Using C++

In this exercise I am using Visual Studio 2019 and a modified version of the std_lib_facilities header found here.

Chapter 20 // Exercise 20

Run a small timing experiment to compare the cost of using vector and list. You can find an explanation of how to time a program in section 26.6.1. Generate N random int values in the range [0:N). As each int is generated, insert it into a vector<int> (which grows by one element each time). 

Keep the vector sorted; that is, a value is inserted after every previous value that is less than or equal to the new value and before every previous value that is larger than the new value. Now do the same experiment using a list<int> to hold to the ints. For which N is the list faster than the vector? Try to explain your result. This experiment was first suggested by John Bentley.


I really enjoyed this exercise; I love performance related programming (that's why it's my job). There's nothing quite like watching numbers go smaller.

All the results are in seconds.

1,000,000 ints:
Vector<int> { 31.82, 31, 31, 31, 31 }
List<int>{ 10+ minutes, I then gave up and cancelled it because I'm lazy }

100,000 ints:
Vector<int> { 0.816,  0.818, 0.817, 0.816, 0.818 }
List<int> { cancelled at 6+ minutes }

10,000 ints:
Vector<int> { 0.057, 0.054, 0.054, 0.054, 0.054 }
List<int> { 7, 7, 7, 7,  }

1000 ints:
Vector<int> { 0.005, 0.005, 0.005, 0.005, 0.005 }
List<int> { 0.086, 0.085, 0.085, 0.085, 0.085 }

100 ints:
Vector<int> { 0.0007644, 0.000715, 0.0005206, 0.0005139,  0.0000.5122 }
List<int> { 0.001, 0.001, 0.001, 0.001, 0.001 }

For inserting into the sorted vector/list I used this:

For the first experiment I was expecting the list to be slower but not that slow. It got to 3 minutes on the first test and I walked off to go do the hoovering. To be honest I was always expecting the vector to be faster due to the fact that elements in a vector are stored next to each other in memory so you get a nice cache win when iterating through vectors. With a list, the elements are all over the place so it will probably be destroying the cache.

It's even more interesting because the vector has to bump everything along with each insertion and increase it's memory allocation.

I was interested though on how to make the list faster. Especially because he wants to know which N the list is faster? I quickly implemented my own insert and that was surprising. At 1000 elements, the list was faster than the vector by about 0.8 seconds. At 10,000 the vector took 15 seconds and the list 7.8. After that I stopped testing because it got into the minutes for both containers.

So the moral of the story is; don't re-invent the wheel, use std::upper_bounds and a vector.

If you'd like to know more about the cache, please check out this excellent talk from Scott Meyers:

And with that I am FINALLY done with Chapter 20. I started it on April 7th and I procrastinated for 5 whole months.

No comments:

Post a Comment