Saturday, October 19, 2013

Performance tuning

You can spend a lot of time optimizing code. It's a rabbit hole that can stop you from ever reaching a working product. I almost got completely sucked down one when I noticed that simple changes made big differences in the performance of my B-tree. The latest commit has the ability to compare parameterizations of the B-tree based on insertion count and node size (order). Here's the performance of the current head on my i7 laptop:

[test] Random insertion 1000000 elements (order 16)
insertions took 5666420 us (176 K/sec)
iteration took 1216899 us (821 K/sec)
fetch took 3534220 us (282 K/sec)
[pass]: depth: 5 nodes: 7538 leaves: 83356

The "fetch" operation above is just fetching every node in the tree. For random elements I just re-seed the RNG and fetch in the same order I inserted them in. 

One method I use a lot is this one: 

func nodeFind (node *Node, value uint64) int {
pos := len(node.Values)
for i, k := range node.Values {
if value < k.Key {
pos = i;
break
}
}
return pos
}

This simply finds a key in my array of key/value pairs. It wasn't always written this way. This is how it used to be:

func nodeFind (node *Node, value *Pair) int {
pos := len(node.Values)
for i, k := range node.Values {
if value.Key < k.Key {
pos = i;
break
}
}
return pos
}

And this was the performance of that implementation: 

[test] Random insertion 1000000 elements (order 16)
insertions took 5715310 us (174 K/sec)
iteration took 1210612 us (826 K/sec)
fetch took 4331910 us (230 K/sec)
[pass]: depth: 5 nodes: 7485 leaves: 83217

Notice that fetches are much slower 230 K/s versus 280 K/s is 18% slower. The time spent in nodeFind dominates the fetch operation (that's basically all fetch does). 

No comments:

Post a Comment