Creative Compulsive: Performance tuning

You can spend a lot of time optimizing code. It's a rabbit hole that can stop you from ever reaching a working product. I almost got completely sucked down one when I noticed that simple changes made big differences in the performance of my B-tree. The latest commit has the ability to compare parameterizations of the B-tree based on insertion count and node size (order). Here's the performance of the current head on my i7 laptop:

[test] Random insertion 1000000 elements (order 16)

insertions took 5666420 us (176 K/sec)

iteration took 1216899 us (821 K/sec)

fetch took 3534220 us (282 K/sec)

[pass]: depth: 5 nodes: 7538 leaves: 83356

The "fetch" operation above is just fetching every node in the tree. For random elements I just re-seed the RNG and fetch in the same order I inserted them in.

One method I use a lot is this one:

func nodeFind (node *Node, value uint64) int {

pos := len(node.Values)

for i, k := range node.Values {

if value < k.Key {

pos = i;

break

}

return pos

}

This simply finds a key in my array of key/value pairs. It wasn't always written this way. This is how it used to be:

func nodeFind (node *Node, value *Pair) int {

pos := len(node.Values)

for i, k := range node.Values {

if value.Key < k.Key {

pos = i;

break

}

return pos

}

And this was the performance of that implementation:

[test] Random insertion 1000000 elements (order 16)

insertions took 5715310 us (174 K/sec)

iteration took 1210612 us (826 K/sec)

fetch took 4331910 us (230 K/sec)

[pass]: depth: 5 nodes: 7485 leaves: 83217

Notice that fetches are much slower 230 K/s versus 280 K/s is 18% slower. The time spent in nodeFind dominates the fetch operation (that's basically all fetch does).

Creative Compulsive

Saturday, October 19, 2013

Performance tuning

1 comment:

About Me