Tuesday, February 17, 2009

A bug of the non-obvious kind

Just when you think you know your language, it hits you by surprise. This harmless-looking little example has a bad bug:

boost::char_separator<char> sep(":", "", boost::keep_empty_tokens);
typedef boost::tokenizer<boost::char_separator<char> > tokenizer;

std::string str("a=foo:bar:baz");
tokenizer tokens(str.substr(2), sep);
for (tokenizer::iterator i = tokens.begin(); i != tokens.end(); ++i)
    std::cout << "Token: '" << *i << "'" << std::endl;

Spotted it? It's right there where the tokens constructor gets passed a temporary object. boost::tokenizer keeps only iterators into the string it is given, not a copy. The temporary returned by str.substr(2) is destroyed as soon as the declaration finishes, so the tokenizer is left pointing at freed memory and the loop runs into undefined behavior. Looking back, it makes perfect sense, but unless you know what you are looking for it's pretty damn hard to spot, especially since the documentation doesn't exactly make this fact obvious. It also didn't help that the code used to work fine on my machine; it only barfed on a user's machine with 64-bit Linux. The correct fix is to keep the string alive in a named variable:

std::string tmp = str.substr(2);
tokenizer tokens(tmp, sep);
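
If you write a class like this yourself, you can turn the mistake into a compile error instead of a crash on somebody else's machine. Here is a minimal sketch of the idea, assuming a compiler with C++11 deleted functions. SplitView is a made-up stand-in for illustration and not part of Boost; it splits on a single character and keeps empty tokens, roughly like the separator setup above.

```cpp
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical stand-in (not a Boost class): like the tokenizer above, it
// remembers only where the input lives instead of copying it.
class SplitView {
public:
    // Binding to a named string is fine as long as it outlives the view.
    SplitView(const std::string& input, char sep)
        : first_(input.begin()), last_(input.end()), sep_(sep) {}

    // Reject temporaries at compile time instead of dangling at run time.
    SplitView(std::string&&, char) = delete;

    // Split on sep_, keeping empty tokens.
    std::vector<std::string> tokens() const {
        std::vector<std::string> out;
        std::string current;
        for (std::string::const_iterator it = first_; it != last_; ++it) {
            if (*it == sep_) {
                out.push_back(current);
                current.clear();
            } else {
                current.push_back(*it);
            }
        }
        out.push_back(current);  // the final token, possibly empty
        return out;
    }

private:
    std::string::const_iterator first_;
    std::string::const_iterator last_;
    char sep_;
};

// Passing a temporary such as str.substr(2) selects the deleted overload.
static_assert(!std::is_constructible<SplitView, std::string, char>::value,
              "temporaries must be rejected");
static_assert(std::is_constructible<SplitView, const std::string&, char>::value,
              "named strings are accepted");
```

With this in place, SplitView view(str.substr(2), ':'); refuses to compile, while the two-line version with a named tmp string works as before. The trade-off is that the deleted overload also rejects temporaries that would have been safe, such as a view created and used up within a single expression.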
