Working with large XML in PHP
Task: write a class that generates an XML document with the following structure and saves it to disk:
<?xml version="1.0" encoding="UTF-8" ?>
<root>
    <serverType>server</serverType>
    <entry>
        <id>{{fileId}}</id>
        <type>text</type>
        <name>{{fileName}}</name>
        <text>{{fileContent}}</text>
    </entry>
</root>
Where:
{{fileId}} - unique id (must be generated)
{{fileName}} - file name
{{fileContent}} - file contents
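We implement a streaming approach: the input file is read in fixed-size blocks and written out with XMLWriter, so the whole file is never held in memory: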
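<?php
class XmlGenerator {
    private $filePath;
    private $writer;

    public function __construct($filePath) {
        $this->filePath = $filePath;
        $this->writer = new XMLWriter();
    }

    public function generateXml() {
        if (!file_exists($this->filePath)) {
            throw new Exception("File not found: " . $this->filePath);
        }
        $handle = fopen($this->filePath, 'rb');
        if ($handle === false) {
            throw new Exception("Cannot open file: " . $this->filePath);
        }
        $outputFilePath = __DIR__ . '/output.xml'; // Path where the XML file is saved
        if (!$this->writer->openUri($outputFilePath)) {
            fclose($handle);
            throw new Exception("Cannot open output: " . $outputFilePath);
        }
        $this->writer->setIndent(true);
        $this->writer->startDocument('1.0', 'UTF-8');
        $this->writer->startElement('root');
        $this->writer->writeElement('serverType', 'server');
        $fileId = uniqid();
        $this->writer->startElement('entry');
        $this->writer->writeElement('id', $fileId);
        $this->writer->writeElement('type', 'text');
        $this->writer->writeElement('name', basename($this->filePath));
        // Read the input in fixed-size blocks so it is never loaded whole.
        // Note: each block becomes its own <text> element, so a large file
        // yields many <text> elements inside <entry>; the reader selects one by index.
        while (!feof($handle)) {
            $chunk = fread($handle, 8192); // Block size in bytes
            if ($chunk === false || $chunk === '') {
                break; // Read error or nothing left to write
            }
            $this->writer->writeElement('text', $chunk);
            $this->writer->flush(); // Push buffered XML to the output file
        }
        fclose($handle);
        $this->writer->endElement(); // </entry>
        $this->writer->endElement(); // </root>
        $this->writer->endDocument();
        $this->writer->flush();
        return $outputFilePath;
    }
}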
$filePath = __DIR__ . '/4gb.txt';
$generator = new XmlGenerator($filePath);
try {
    $outputFilePath = $generator->generateXml();
    echo "The XML file was successfully created and saved to: $outputFilePath";
} catch (Exception $e) {
    echo "Error: " . $e->getMessage();
}
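The target structure at the top of the article shows a single <text> element, while the class above writes one <text> element per block. A minimal sketch of a variant that keeps the streaming read but emits a single <text> element could look like the following. The function name writeSingleTextEntry and its parameters are hypothetical, and the sketch assumes the input is UTF-8 (or ASCII) text; a multi-byte character split across two reads is not handled specially.

```php
<?php
// Hypothetical helper, not part of the original class: streams $inputPath
// into one <text> element instead of one element per block.
function writeSingleTextEntry(string $inputPath, string $outputPath, int $chunkSize = 8192): void
{
    $in = fopen($inputPath, 'rb');
    if ($in === false) {
        throw new RuntimeException("Cannot open input: $inputPath");
    }

    $writer = new XMLWriter();
    if (!$writer->openUri($outputPath)) {
        fclose($in);
        throw new RuntimeException("Cannot open output: $outputPath");
    }

    $writer->startDocument('1.0', 'UTF-8');
    $writer->startElement('root');
    $writer->writeElement('serverType', 'server');
    $writer->startElement('entry');
    $writer->writeElement('id', uniqid());
    $writer->writeElement('type', 'text');
    $writer->writeElement('name', basename($inputPath));

    // Open <text> once, then append each block as escaped character data.
    $writer->startElement('text');
    while (($chunk = fread($in, $chunkSize)) !== false && $chunk !== '') {
        $writer->text($chunk);
        $writer->flush(); // write out what has been buffered so far
    }
    fclose($in);

    $writer->endElement(); // </text>
    $writer->endElement(); // </entry>
    $writer->endElement(); // </root>
    $writer->endDocument();
    $writer->flush();
}
```

One trade-off to keep in mind: a single multi-gigabyte text node is harder to read back. libxml2 limits the size of one text node by default, so a reader would likely need the LIBXML_PARSEHUGE option, and XMLReader::readString() would return the whole content as one string. Writing one <text> element per block, as the class above does, avoids that at the cost of departing from the stated structure.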
A class that reads the large XML file:
<?php
class Reader {
    private $filePath;
    private $reader;

    public function __construct($filePath) {
        $this->filePath = $filePath;
        $this->reader = new XMLReader();
    }

    // Returns the content of the $index-th <text> element (1-based),
    // or null if the file cannot be opened or the element does not exist.
    public function readTextElement($index) {
        if (!$this->reader->open($this->filePath)) {
            return null;
        }
        $fileContent = null;
        $textCount = 0;
        while ($this->reader->read()) {
            if ($this->reader->nodeType === XMLReader::ELEMENT) {
                if ($this->reader->name === 'text') {
                    $textCount++;
                    if ($textCount === $index) {
                        // readString() returns the text content of the current element
                        $fileContent = $this->reader->readString();
                        break;
                    }
                }
            } elseif ($this->reader->nodeType === XMLReader::END_ELEMENT && $this->reader->name === 'entry') {
                break; // End of the entry: no more <text> elements to check
            }
        }
        $this->reader->close();
        return $fileContent;
    }
}
$filePath = __DIR__ . '/output.xml';
$xmlReader = new Reader($filePath);
$fileContent = $xmlReader->readTextElement(524278);
if ($fileContent !== null) {
    echo $fileContent;
} else {
    echo "Item not found";
}
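Reading one element by index works for spot checks. To walk through every <text> element without holding them all in memory, a generator is one option. The sketch below is an illustration only: the function name iterateTextChunks is hypothetical, and it assumes the file contains one <text> element per block, as the generator class writes it.

```php
<?php
// Hypothetical helper: lazily yields the content of each <text> element in turn.
function iterateTextChunks(string $xmlPath): Generator
{
    $reader = new XMLReader();
    if (!$reader->open($xmlPath)) {
        throw new RuntimeException("Cannot open XML file: $xmlPath");
    }
    try {
        while ($reader->read()) {
            if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === 'text') {
                // Only the current element's text is in memory at this point.
                yield $reader->readString();
            }
        }
    } finally {
        $reader->close();
    }
}
```

A caller could, for example, restore the original file by appending each yielded block to an output handle with fwrite() inside a foreach loop.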
Tested with a 4 GB input file.
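To check that memory use stays flat as inputs grow, one simple approach is to record elapsed time and peak memory around a run. The helper below is a sketch under that assumption. The name runAndReport is hypothetical, and the workload shown is a placeholder to be replaced with the generator or reader call.

```php
<?php
// Hypothetical helper: runs $job and reports elapsed time and peak memory.
function runAndReport(callable $job): array
{
    $start = microtime(true);
    $job();
    return [
        'seconds'    => microtime(true) - $start,
        // Peak memory allocated by PHP for the whole process so far, in bytes.
        'peak_bytes' => memory_get_peak_usage(true),
    ];
}

$stats = runAndReport(function (): void {
    // Placeholder workload; replace with e.g. $generator->generateXml().
    $sum = 0;
    for ($i = 0; $i < 1000; $i++) {
        $sum += $i;
    }
});

printf("%.3f s, peak %.1f MB\n", $stats['seconds'], $stats['peak_bytes'] / 1048576);
```

If the streaming approach holds, the reported peak should stay roughly constant when the input size increases, while the elapsed time grows with it.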
Lead Backend Developer
1 year ago: Hi, did you test larger files to check how the script behaves under high load? For example, a 100 GB file.