XML / JSON parsing trick #speed

Lately I had to look into performance issues in one project. I rummaged through NewRelic a bit and noticed that the most time-consuming actions were the ones that made external API requests. At first glance I thought the external API was responsible for the mess, but the breakdown says otherwise: in a controller request that lasts about 1 second overall, the external API response takes 48 ms. All the rest is XML parsing.

After a quick examination it turned out that the project uses an old XML parser that is even slower than Polish trains.

I ran a tiny benchmark of the existing XML parsers (gems and engines) on a 4800-line XML document, and the results (XML parsed directly to a Hash) look like this:

As you can see, in this case Ox is much faster than the other parsers. I also benchmarked the existing JSON parsing engines, this time only through the 'multi_json' gem, which resulted in:

The Oj parser seems to be the fastest one for JSON.
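
If you want to repeat the measurement on your own documents, here is a minimal sketch of how such a benchmark could be set up. It is not the exact script behind the results mentioned above: the helper names (time_parsers, xml_parsers, json_parsers), the lists of backends and the run count are my assumptions, and backends whose gems are not installed are simply skipped.

```ruby
require 'benchmark'

# Time each parser (any callable taking the document string) over `runs`
# iterations. Returns { name => seconds }, sorted fastest first.
def time_parsers(document, parsers, runs: 10)
  timings = parsers.map do |name, parser|
    [name, Benchmark.realtime { runs.times { parser.call(document) } }]
  end
  timings.sort_by { |_, seconds| seconds }.to_h
end

# One callable per MultiXml backend that loads on this machine.
# The backend list is an assumption; adjust it to the gems you have.
def xml_parsers(backends = %i[ox libxml nokogiri rexml])
  require 'multi_xml'
  backends.each_with_object({}) do |backend, parsers|
    begin
      MultiXml.parser = backend # fails when the backend gem is missing
    rescue LoadError, StandardError
      next # skip backends that cannot be loaded
    end
    parsers[backend] = lambda do |xml|
      # Re-select the backend on every call so one backend's setting cannot
      # leak into another's run; this adds a small constant cost per call.
      MultiXml.parser = backend
      MultiXml.parse(xml)
    end
  end
rescue LoadError
  {} # multi_xml itself is not installed
end

# One callable per multi_json adapter that can actually parse here.
def json_parsers(adapters = %i[oj yajl json_gem ok_json])
  require 'multi_json'
  adapters.each_with_object({}) do |adapter, parsers|
    begin
      MultiJson.load('{}', adapter: adapter) # probe the adapter once
    rescue LoadError, StandardError
      next # adapter gem missing or unusable
    end
    parsers[adapter] = ->(json) { MultiJson.load(json, adapter: adapter) }
  end
rescue LoadError
  {} # multi_json itself is not installed
end
```

With a sample document on disk, time_parsers(File.read('sample.xml'), xml_parsers) returns the timings per backend, and time_parsers(File.read('sample.json'), json_parsers, runs: 100) does the same for JSON. Both file names are placeholders. Use a real response from your API as the sample, since the ranking may change with document size and shape.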

Keep in mind that some XML parsers parse a document differently, for example:


MultiXml will transform it to "xml:test"=>{"currency"=>"dollars", "__content__"=>"300"}

and XmlSimple to "test"=>[{"currency"=>"dollars", "content"=>"300"}]
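
This matters when you swap one parser for another, because the code that reads the resulting Hash has to cope with the new shape. Below is a small sketch of one way to hide the difference. The element_text helper is hypothetical (it is not part of either gem) and it only covers the two shapes quoted above.

```ruby
# The two result shapes quoted above, written out as Ruby hashes.
multi_xml_result  = { 'xml:test' => { 'currency' => 'dollars', '__content__' => '300' } }
xml_simple_result = { 'test' => [{ 'currency' => 'dollars', 'content' => '300' }] }

# Hypothetical helper: return the text of an element regardless of which
# of the two shapes it arrived in.
def element_text(node)
  node = node.first if node.is_a?(Array) # XmlSimple-style: element wrapped in an array
  return node unless node.is_a?(Hash)    # plain string: nothing to unwrap
  node['__content__'] || node['content'] # MultiXml-style key first, then XmlSimple-style
end

puts element_text(multi_xml_result['xml:test']) # prints 300
puts element_text(xml_simple_result['test'])    # prints 300
```

Note that the root key differs too ("xml:test" versus "test"), so the lookup itself still has to be adjusted by hand.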

Hope you found the trick useful!

On 04.07.2013 in dev