Set vs. Vector: The Epic Battle

I was bored one day and started thinking about implementing something or other with C++ and found myself pondering on which container type to use: std::set or std::vector. As everyone who uses C++ knows, there's never a clear-cut simple answer to this question - so I decided to run some tests to help me decide.

For my hypothetical application, the purpose of the container in question was to track a set of values which would have to be tested frequently to determine if specific ones were present in the containter. My instincts told me that I should use a std::set for this purpose as that seems the best fit given the requirements. After pondering my idea some more, I realized that it would be highly unlikely for this container to ever have more than about 5 elements stored within it at a time - so I got to thinking: is std::set really the type I'd want in this case? The answer turns out to most likely be no. However it was somewhat surprising just how far you can go with a std::vector before the brute force searching becomes a problem.

The code I used for my test is available here as well as an ultra simple build script if you happen to use gcc.
My test machine was an Apple G4 PowerBook 1.67Ghz with 2GB of RAM running MacOS X 10.4.3 and using gcc version 4.0.1 (Apple Computer, Inc. build 5247). Your numbers will be different and the actual times don't really mean anything outside of the context of my machine - but the difference between the times is what is important. My build script generates versions for gcc optimization levels -O0(none) to -O3(max) as well as -Os (size). Results:

10 elements per container

test-O0

10 elements in each container.
Running 100 tests with 1000 cycles each.
Average per test for SET:       20.95 ms [winner]
Average per test for VECTOR:    31.39 ms
test-O1
10 elements in each container.
Running 100 tests with 1000 cycles each.
Average per test for SET:       0.98 ms
Average per test for VECTOR:    0.5 ms [winner]
test-O2
10 elements in each container.
Running 100 tests with 1000 cycles each.
Average per test for SET:       1.47 ms
Average per test for VECTOR:    0.63 ms [winner]
test-O3
10 elements in each container.
Running 100 tests with 1000 cycles each.
Average per test for SET:       1.14 ms
Average per test for VECTOR:    0.52 ms [winner]
test-Os
10 elements in each container.
Running 100 tests with 1000 cycles each.
Average per test for SET:       3.03 ms [winner]
Average per test for VECTOR:    3.13 ms

40 elements per container

test-O0

40 elements in each container.
Running 100 tests with 1000 cycles each.
Average per test for SET:       86.58 ms [winner]
Average per test for VECTOR:    341.27 ms
test-O1
40 elements in each container.
Running 100 tests with 1000 cycles each.
Average per test for SET:       3.87 ms [winner]
Average per test for VECTOR:    4.18 ms
test-O2
40 elements in each container.
Running 100 tests with 1000 cycles each.
Average per test for SET:       4.17 ms
Average per test for VECTOR:    3.83 ms [winner]
test-O3
40 elements in each container.
Running 100 tests with 1000 cycles each.
Average per test for SET:       4.17 ms
Average per test for VECTOR:    3.81 ms [winner]
test-Os
40 elements in each container.
Running 100 tests with 1000 cycles each.
Average per test for SET:       13.07 ms [winner]
Average per test for VECTOR:    33.79 ms

50 elements per container

test-O0

50 elements in each container.
Running 100 tests with 1000 cycles each.
Average per test for SET:       109.53 ms [winner]
Average per test for VECTOR:    513.1 ms
test-O1
50 elements in each container.
Running 100 tests with 1000 cycles each.
Average per test for SET:       5.19 ms [winner]
Average per test for VECTOR:    5.71 ms
test-O2
50 elements in each container.
Running 100 tests with 1000 cycles each.
Average per test for SET:       5.1 ms [winner]
Average per test for VECTOR:    6.2 ms
test-O3
50 elements in each container.
Running 100 tests with 1000 cycles each.
Average per test for SET:       5.42 ms [winner]
Average per test for VECTOR:    6.07 ms
test-Os
50 elements in each container.
Running 100 tests with 1000 cycles each.
Average per test for SET:       16.94 ms [winner]
Average per test for VECTOR:    50.96 ms

In conclusion, it appears that as long as optimization is used, there's a sweet spot around 40 or so elements where std::vector is actually faster than std::set for looking up integers. Of course there are many variables that factor into the resulting speed of these containers. Everything from the compiler and CPU used to the optimization levels to the data type to the STL implementation itself are going to influence these numbers, but it does appear that in some situations when performance is important and not many elements will be used, brute force linear searching may be faster than expected and should be considered if warrented.

back