Several approaches to parse XML files

05 Jul 2012

Recently I had the chance to parse XML in C++ using several libraries, and I would like to note down what I found. There are many XML parsing libraries out there, each with its own pros and cons; you just have to pick the one that best suits your use case.

The first library I tried was Boost property tree (Boost.PropertyTree), which is only available from Boost 1.41 onward. It is well suited to parsing small, simply structured XML files such as configuration files. Boost property tree is still powerful enough for more complex parsing if you really want it, but its interface is not intuitive or convenient for complicated tasks, since it was not designed solely for parsing XML data.

Here is a great blog post (containing some code examples) about using boost property tree.

A friend of mine introduced me to TinyXML, yet another XML parser. I have never tried it, but I presume it is an easy-to-use DOM parser.

Finally, I settled on the XSD XML data binding approach, which I had never heard of in school. In this approach, an XSD compiler generates an object model interface (C++ code) for your XML data from its XML Schema Definition file. Every element and attribute in your XML data becomes a data type or class in the generated C++ interface, and the interface provides getter and setter methods for each of them, so you access or modify the XML document through that interface.
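
To illustrate the idea, here is a hand-written sketch of roughly what such a bound class could look like for a tiny, hypothetical schema with a `person` element. This is not the output of any actual XSD compiler; the class name, member names and the `describe` helper are assumptions made purely to show the getter/setter style. Real generated code is much larger and also takes care of parsing and serialization.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hand-written approximation of a data-binding class for a hypothetical schema:
//   <person id="..."><name>...</name><email>...</email>*</person>
class Person {
public:
    Person(int id, const std::string& name) : id_(id), name_(name) {}

    // Attribute "id": getter and setter share the attribute's name.
    int id() const { return id_; }
    void id(int value) { id_ = value; }

    // Required element "name".
    const std::string& name() const { return name_; }
    void name(const std::string& value) { name_ = value; }

    // Repeated element "email", exposed as a sequence.
    const std::vector<std::string>& email() const { return emails_; }
    std::vector<std::string>& email() { return emails_; }

private:
    int id_;
    std::string name_;
    std::vector<std::string> emails_;
};

// Client code works with typed members instead of string paths and tree nodes.
std::string describe(const Person& p) {
    std::ostringstream out;
    out << p.name() << " (#" << p.id() << "), " << p.email().size() << " email(s)";
    return out.str();
}
```

The appeal is that a misspelled element name or a wrong type becomes a compile-time error rather than a runtime surprise, which is hard to get with a generic tree or DOM interface.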

The static data binding approach is very schema dependent: if the schema of your XML data changes in the future, your code has to be updated against a newly generated interface. This is the tutorial for using the generated interface, and this is the manual of the command line tool for generating the C++ interface. There are also some code examples of using the generated interface.