Web development with passion
Handling big amounts of data between a client application and a REST service can sometimes be tricky, especially when things should be responsive. I did a small benchmark between the built-in tools in PHP: DOMDocument and XmlReader.
DOMDocument is maybe the most used method for XML handling and creation in PHP. The approach used in DOMDocument, is to load nodes into a DOM, which then can be queried and/or represented as a string.
Here's a sample script I used for benchmarking:
<?php
$start = microtime(true);
header('Cache-Control: no-cache, must-revalidate');
header('Expires: Mon, 26 Jul 1997 05:00:00 GMT');
header('Content-type: text/xml');
$doc = new DOMDocument('1.0', 'utf-8');
$doc->formatOutput = true;
$products = $doc->createElement('Products');
for ($i = 0; $i < 1000000; $i++) {
$element = $doc->createElement('Product');
$element->setAttribute('Id', $i);
$element->setAttribute('Value', md5(uniqid()));
$products->appendChild($element);
}
$doc->appendChild($products);
echo $doc->saveXml();
$stop = microtime(true);
$seconds = $stop - $start;
echo "Start: " . $start . PHP_EOL;
echo "Stop: " . $stop . PHP_EOL;
echo "Seconds: " . $seconds . PHP_EOL;
echo "Memory peak: " . memory_get_peak_usage() / 1048576 . 'MB' . PHP_EOL;
?>
This script was served via Apache and downloaded with wget.
Here are some stats:
XmlWriter is not that well known amongst PHP-developers. It's basically a wrapper for libxml's xmlWriter API. With XmlWriter you can (in theory, at least) flush the contents to the output-pipe before you have even processed all data on your server. That way you can start processing the data clientside before the download is even complete. The processing and 'editing' capabilities are a bit more limited than with DOMDocument, and you need to handle the document hierarchy yourself.
Here's a sample I wrote:
<?php
$start = microtime(true);
header('Cache-Control: no-cache, must-revalidate');
header('Expires: Mon, 26 Jul 1997 05:00:00 GMT');
header('Content-type: text/xml');
$writer = new XmlWriter();
$writer->openURI('php://output');
$writer->startDocument('1.0', 'utf-8');
$writer->startElement('Products');
$writer->setIndent(true);
for ($i = 0; $i < 1000000; $i++) {
$writer->startElement('Product');
$writer->writeAttribute('Id', $i);
$writer->writeAttribute('Value', md5(uniqid()));
$writer->endElement();
}
$writer->endElement();
$writer->flush();
$stop = microtime(true);
$seconds = $stop - $start;
echo "Start: " . $start . PHP_EOL;
echo "Stop: " . $stop . PHP_EOL;
echo "Seconds: " . $seconds . PHP_EOL;
echo "Memory peak: " . memory_get_peak_usage() / 1048576 . 'MB' . PHP_EOL;
?>
This script was as well served via Apache and requested with wget.
Here are some stats on XmlWriter performance:
So to sum things up. If the capabilities XmlWriter offers are enough for you, you should definitely be using it. With one million 'product' items, the times to process the data and get the contents to your client will be split in half comparing to DOMDocument. And what's most important: you will be consuming only a fraction of the memory, which then leads to better performance and scalability.