Insertion Performance Benchmarks

2013-07-05 - One of my favourite data structures in C is the ordered vector (or array, whatever you call them). Incredibly simple to implement, very low memory overhead, and can provide O(log n) lookup with a simple binary search. However, ordered vectors have one very weak point: insertion and deletion of items is O(n). For small n that doesn’t really matter, but if the number of items in the list can grow a bit, you may run into performance issues. If you’re not careful, this could even turn your ordered vector into an attack vector (apologies for the terrible pun).

My goal with this benchmark is to get a feeling on how, exactly, insertion performance behaves with an ordered vector. What values of n are “small”? And how much worse does insertion performance get compared to more complex data structures?

For comparison, I chose the B-tree and hash table implementations from klib (from commit fff70758, to be precise). My goal wasn’t to benchmark the performance of different implementations, so I simply chose two implementations that I suspect are among the fastest. The vector implementation in the benchmarks is my own creation: vec.h from the Globster code base.

Source code: ins-bench.c

Best case & worst case

For a start, I decided to benchmark the best and worst case performance of inserting elements into a vector. The best case happens when inserting all items at the end of the vector, the worst case when inserting them in front. The B-tree and hash table benchmarks provided for comparison have all items inserted in order.

I’m cheating here with the vector implementation, because all elements are inserted in the list without first finding out the position with a binary search. Actual performance will be thus be a bit worse, depending on whether the final application needs that binary search or whether it can assume its input to be already sorted.

Gnuplot script: (The awk(ward) part can likely be done natively in gnuplot as well, but I was too lazy to figure out how)

set terminal png size 1000, 1500
set output "bench.png"
set logscale xy
set xlabel "number of items"
set ylabel "average time per insert (ms)"
set grid mxtics xtics mytics ytics
plot "< awk '{print $1, $2/$1*1000}' bench-vec" title 'vector, worst case',\
     "< awk '{print $1, $2/$1*1000}' bench-best" title 'vector, best case',\
     "< awk '{print $1, $2/$1*1000}' bench-hash" title 'khash',\
     "< awk '{print $1, $2/$1*1000}' bench-btree" title 'kbtree'

Average case

For the second benchmark I inserted values created with rand(), which should be a more accurate simulation of some real-world applications. This time I’m not cheating with the vector implementation, a binary search is performed in order to insert the items in the correct location.

set terminal png size 1000, 1500
set output "bench-rand.png"
set logscale xy
set xlabel "number of items"
set ylabel "average time per insert (ms)"
set grid mxtics xtics mytics ytics
plot "< awk '{print $1, $2/$1*1000}' rand-vec" title 'vector',\
     "< awk '{print $1, $2/$1*1000}' rand-hash" title 'khash',\
     "< awk '{print $1, $2/$1*1000}' rand-btree" title 'kbtree'

Benchmarking setup

All benchmarks were performed on a 3 GHz Core Duo E8400 with a 6 MiB cache. Compiled with the Gentoo-provided gcc 4.6.3 at -O3, linked against glibc 2.15, and run on a Linux 3.8.13-gentoo kernel. Boring details, but somehow good to document.