Any missing XML features in PHP?

PHP 5.1 is just around the corner (at least RC 1) and some new, but not really groundbreaking XML features come with it (see my ApacheCon slides for more information).

Personally, I consider the XML support in PHP 5 quite feature complete, and I don't have any pending TODOs for it. But maybe you have some ideas about what could be improved or newly implemented. This is your chance, and I'm really interested in hearing from you. Just leave a comment below. Maybe I or someone else will find the time to actually implement it for PHP 5.x :)

Comments [34]

Jonas, 27.07.2005 12:50 CET

Is there already a built-in serialize function which puts the data into an XML file?

I have already used some PHP-based XML/serialize classes, but I would prefer a built-in method. And of course it would be great if I could use the serialized data structure within, e.g., Java-based applications...

chregu, 27.07.2005 12:54 CET

Jonas: You mean arrays and such? No. There isn't. And I don't see any advantages in doing that native compared to the PHP approaches out there.
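
As an illustration of the user-land approach mentioned here, the following is a minimal sketch that walks a nested array and builds a DOM tree from it. The function name array_to_xml and the element names "data" and "item" are made up for this example; it assumes all string keys are valid XML element names.

```php
<?php
// Hypothetical helper: recursively append array entries as child elements.
// Integer keys have no valid element name, so they become <item> elements.
function array_to_xml(array $data, DOMNode $parent, DOMDocument $doc): void
{
    foreach ($data as $key => $value) {
        $name  = is_int($key) ? 'item' : $key;
        $child = $parent->appendChild($doc->createElement($name));
        if (is_array($value)) {
            array_to_xml($value, $child, $doc);
        } else {
            // createTextNode() escapes &, < and > when the document is saved
            $child->appendChild($doc->createTextNode((string) $value));
        }
    }
}

$doc  = new DOMDocument('1.0', 'UTF-8');
$root = $doc->appendChild($doc->createElement('data'));
array_to_xml(['name' => 'Jonas', 'tags' => ['php', 'xml']], $root, $doc);

$serialized = $doc->saveXML();
echo $serialized;
```

The result is plain XML, so a Java application could read it with any XML parser, provided both sides agree on the element naming.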

Hannes, 27.07.2005 12:55 CET

What about XML DSIG?

chregu, 27.07.2005 13:03 CET

Hannes: Noted. But no idea when and if I'm gonna implement that. Anyone want to step forward? :)

doof, 27.07.2005 13:07 CET

Hello, I'm French, so excuse my poor English.

I am currently developing a template system that uses PHP as its language, and I wanted to use DOMDocument for the core manipulation of interactive elements like forms or XUL elements.

But it's a template system where some templates include others, so it mostly deals with fragments of documents. And the thing that totally blocks me is that saveXML and saveHTML only save a full document with doctype and <html> and nodes !

If I could specify the node I want to save, it would be perfect: saveXML('block1') to save the fragment of the document whose id is 'block1', for example.

I hope you understand me, best regards.

chregu, 27.07.2005 13:10 CET

doof: saveXML takes a node as an argument, if you only want to output that node.

see the docs for more details :)
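
A minimal sketch of what that looks like, using a made-up document and looking the fragment up by its id attribute via XPath (the markup and the id value are assumptions for illustration):

```php
<?php
$dom = new DOMDocument();
$dom->loadXML('<page><div id="block1"><p>Hello</p></div><div id="block2"/></page>');

// Find the element whose id attribute is "block1".
$xpath = new DOMXPath($dom);
$node  = $xpath->query('//*[@id="block1"]')->item(0);

// Passing a node serializes only that subtree, without XML declaration or doctype.
$fragment = $dom->saveXML($node);
echo $fragment;
```

XPath is used here instead of getElementById() because the latter only works when the id attribute is declared as an ID type. Later PHP versions (5.3.6 and up) accept the same optional node argument for saveHTML().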

Harry Fuecks, 27.07.2005 13:17 CET

Dragging the barrel - DTD / XML Schema / Relax NG inference?

http://www.mono-project.com/XML_Schema_Inference

doof, 27.07.2005 13:19 CET

hush !
"string saveXML ( [DOMNode node] )"

how is it possible that I never saw this !!??
maybe because it's missing from saveHTML and I thought it was the same for saveXML.

great, thank you very much !

chregu, 27.07.2005 13:47 CET

Harry: Nice, but as long as libxml2 doesn't support it, we won't either. I don't see the resources in the PHP community to implement such a beast. We're already happy that libxml2 now has quite decent XML Schema support :)

chregu, 27.07.2005 13:51 CET

Harry: One more thing. Trang can already do something similar. From their site:
***
Trang converts between different schema languages for XML. It supports the following languages:

* RELAX NG (XML syntax)
* RELAX NG compact syntax
* XML 1.0 DTDs
* W3C XML Schema

A schema written in any of the supported schema languages can be converted into any of the other supported schema languages, except that W3C XML Schema is supported for output only, not for input.

Trang can also infer a schema from one or more example XML documents.

***

It's in Java, but really useful for coming up with schemas very quickly.

David, 27.07.2005 14:40 CET

It would be handy if one could pass a file handle/stream resource to simplexml_load_file() and/or DOMDocument::load(). The particular thing I'd like to do is open a remote URL, examine the response headers (e.g. Content-Length) and then conditionally pass the file handle to one of the XML parsing functions to load the data from the file handle. That way, there's no intermediate string holding the whole document and eating up memory.

Alan Knowles, 27.07.2005 15:44 CET

Doing some (not all) of the simplexml ideas would be nice
//shortcut for get/setAttribute
$node->attributes['some:attribute'] = "fred";
// shortcut for getElementsByTagName()
$tds = $node->children[''];
// shortcut for getElementById()
$fred = $node->children['fred'];

using array accessors for append/remove..
$node->children[] = new DomElement(....)
// deleting all..
$node->children = array();

probably needs more thought.. but would make code a bit tighter, without losing the clarity.

chregu, 27.07.2005 15:55 CET

David: You can already provide user-land php streams to load/save (and everything else file-related in all xml functions). Meaning something like
$dom->load("mystream://foo/bar.html");
will work.
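
A rough sketch of such a user-land stream, under some assumptions: the class name MemoryXmlStream and its static registry are invented for this example, and the wrapper simply serves strings from memory. A real wrapper for David's case would open the remote connection in stream_open(), inspect the headers there, and refuse to open if they look wrong.

```php
<?php
// Hypothetical wrapper class: serves XML documents held in a static array.
class MemoryXmlStream
{
    public $context;                 // populated by PHP when the wrapper is used
    public static $documents = [];   // full URL => XML string
    private $data = '';
    private $pos  = 0;

    public function stream_open($path, $mode, $options, &$opened_path)
    {
        if (!isset(self::$documents[$path])) {
            return false;            // unknown URL: refuse to open
        }
        $this->data = self::$documents[$path];
        $this->pos  = 0;
        return true;
    }

    public function stream_read($count)
    {
        $chunk = (string) substr($this->data, $this->pos, $count);
        $this->pos += strlen($chunk);
        return $chunk;
    }

    public function stream_eof()
    {
        return $this->pos >= strlen($this->data);
    }

    public function stream_stat()
    {
        return ['size' => strlen($this->data)];
    }

    // The XML functions stat the URL before opening it, so this must succeed.
    public function url_stat($path, $flags)
    {
        return isset(self::$documents[$path])
            ? ['size' => strlen(self::$documents[$path])]
            : false;
    }
}

stream_wrapper_register('mystream', 'MemoryXmlStream');
MemoryXmlStream::$documents['mystream://foo/bar.html'] = '<html><body><p>Hi</p></body></html>';

$dom = new DOMDocument();
$dom->load('mystream://foo/bar.html');
$rootName = $dom->documentElement->nodeName;
echo $rootName, "\n";
```

The parser pulls data through stream_read() in chunks, so the wrapper never has to hand over the whole document as one string.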

Alan: I hate too much magic :) Especially the append/delete methods smell like "too much magic"..

And
$node->attributes['some:attribute']
isn't much shorter than
$node->getAttribute('attribute')

(2 chars to be precise). the other 2? well, why not, except that getElementById is a problem case by itself (outside html...). And don't get me started about SimpleXMLs handling of namespaces :)

I can't implement that anyway, don't know enough about ZE2 inner workings...

thies, 27.07.2005 15:57 CET

there's one problem with the xml-stuff in PHP - i don't want UTF8!

my db character set is 8859, my output is 8859 _and_ my XML-files are 8859, but simplexml always returns UTF8. that is zero pain for all the 7Bit people as UTF8 and ASCII are the same for plain text, but it's not true for us poor german souls:
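<?php
// This file must itself be saved as ISO-8859-1 so that the bytes match the declaration.
// XML needs a root element to parse; <a> is assumed here.
$x = '<?xml version="1.0" encoding="ISO-8859-1"?><a>Hülfe, ich wöll kein UTFß</a>';
$a = simplexml_load_string($x);
print_r($a); // the text comes back as UTF-8, not ISO-8859-1
?>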

i want a way to tell simplexml to return 8859 instead of UTF8. with the "old" xml-extension you can say:

$p = xml_parser_create('ISO-8859-1'); and the parser will return 8859, this is not possible with simplexml. so if my app runs in 8859 i cannot use the sexy simplexml interface - that sucks!

dewaard, 27.07.2005 16:02 CET

Porting JDOM (http://www.jdom.org) would be nice ;)

chregu, 27.07.2005 16:03 CET

thies: in PHP 5.5/6 everything will be UTF-16 anyway :)

Until then, you have to iconv your output.

We discussed this (if we should allow to define an output encoding for all functions, not just for save()) and decided not to do it for the time being. Maybe something for the unicode-php, 'cause we have to change stuff regarding output encoding anyway there.
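
The iconv workaround could look roughly like this. The sample document and its root element are assumptions; byte escapes are used so the example does not depend on the encoding of the source file.

```php
<?php
// "\xFC" is the single ISO-8859-1 byte for "ü".
$latin1Xml = '<?xml version="1.0" encoding="ISO-8859-1"?><a>H' . "\xFC" . 'lfe</a>';

$a      = simplexml_load_string($latin1Xml);
$utf8   = (string) $a;                          // SimpleXML hands back UTF-8
$latin1 = iconv('UTF-8', 'ISO-8859-1', $utf8);  // convert back for an 8859 application

echo bin2hex($utf8), "\n";
echo bin2hex($latin1), "\n";
```

Note that iconv() returns false for characters that have no ISO-8859-1 equivalent, so real code should check the return value.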

chregu, 27.07.2005 16:08 CET

dewaard: Quick look on their webpage. It's something like SimpleXML with write support, right? Would be a nice addition, yes :)

thies, 27.07.2005 18:02 CET

i don't use unicode, do you?

so - to cut a long sentence short: don't use the new xml-stuff until you can afford to port your app to be unicode-safe.

rasmus once told me on irc:
12:29 < Rasmus> if you don't like 7-bits, we are going to make it hurt

can you feel my pain? and i suspect that 5.5/6.0 won't cure me..

chregu, 27.07.2005 18:06 CET

thies: Yes, I use unicode/utf-8 throughout my applications, so it doesn't hurt me too much right now.
But 5.5/6.0 will hurt a lot of people. It's not even utf-8 (for performance reasons).

Michael Rolli, 28.07.2005 21:17 CET

I actually don't miss anything so far. Happy with your work! ;-) Thanks for it BTW!!!
What I'm really missing is libxslt2 featuring XSLT 2.0. There are some marvelous things in it I can't live without anymore. In a new project I used it (oxygen with Saxon 8B).
Especially the new date formatting possibilities help a lot (no need to use PHP inside XSLTs) ;-) Besides, I use <xslt:result-document> in that project.
I think XSLT 2.0 is very well thought out with regard to the needs people encountered using XSLT 1.0. And that's great.

Daniel, 18.08.2005 12:03 CET

The libxml/libxslt libraries need an effort to get full XML Schema support finished and later to implement new standards like XSLT 2.0, XPath 2.0, ... and they need to be updated regularly in the PHP 5 distribution.

chregu, 18.08.2005 12:08 CET

Daniel: You should say that to the libxml2 maintainers. Nothing we can do about that. But AFAIK the chances are pretty slim, that they will include xslt2 support soon.

And we don't include the libxml2 libraries, we just take the ones installed on the system. So it's the task of the sys-admin to have the latest libxml2 libraries installed, not ours :)

matt, 05.10.2005 20:36 CET

Wrt XML Schema support, a way to have the validator insert missing default values (attributes or elements) would be useful (the way you can with DTD's using DTDATTR)..
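
For reference, the DTD mechanism mentioned here is the LIBXML_DTDATTR option, which asks libxml2 to add DTD-declared default attributes to the tree. The sketch below uses a made-up document and only shows the DTD behaviour, not a Schema equivalent.

```php
<?php
$xml = <<<'XML'
<?xml version="1.0"?>
<!DOCTYPE doc [
  <!ELEMENT doc EMPTY>
  <!ATTLIST doc lang CDATA "en">
]>
<doc/>
XML;

$dom = new DOMDocument();
// LIBXML_DTDATTR: populate attributes that the DTD declares with a default value.
$dom->loadXML($xml, LIBXML_DTDATTR);

$lang = $dom->documentElement->getAttribute('lang');
echo $lang, "\n";
```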

KnisterPeter, 07.12.2005 20:54 CET

I really miss SAX support for XSLT in PHP5.
With Sablotron one could do so, but now only DOM is available for XSLT which is really memory consuming.

chregu, 07.12.2005 21:46 CET

Sablotron had SAX support? That's completely new to me. How did that work?

AFAIK, sablotron also only just builds an internal DOM tree (not based on libxml2, but still a DOM tree). But if it really supported it, I'm more than eager to learn about it.

And did you measure memory consumption? Is it really higher than with Sablotron?

But besides all that, it's libxslt's fault anyway :) We're just building the interface on top of it.

KnisterPeter, 07.12.2005 21:51 CET

As documented in the php manual here http://de.php.net/manual/de/function.xslt-set-sax-handler.php one could set sax handlers for a xslt transformation.

I haven't looked at the real memory consumption, because this is not really what matters to me (ok, it does, if I'm honest :))
But the most important part for me is that it does give a really nice interface for transformation (I have something like cocoon in mind which does all work with sax).

Doesn't libxslt support sax handlers?

KnisterPeter, 07.12.2005 21:54 CET

Just to add this one for the above comment:
The memory consumption of the library itself is a secondary point: if I'm forced to use DOM, it doesn't matter whether the library improves its memory handling, because I have to load the whole document anyway.
With SAX I get the improvement as soon as the library implements it. If the library builds a DOM internally, then it is not perfect, but it would be possible to gain the benefit if it gets better. :)

See what I mean?

chregu, 07.12.2005 22:02 CET

If I'm understanding it correctly, Sablotron just called SAX handlers on the output; you still had to get the whole document into memory on the input. And I'm pretty sure that Sablotron also builds a whole document tree on the input side. So no real memory gain there. But I'm not really sure what exactly it does on the output side... Would have to look at the sources.

Libxslt is based on libxml2, so libxslt will never improve a lot in memory consumption as it heavily relies on libxml2 and its dom-tree..

By the way, my benchmarks back then showed that libxslt is approx. twice as fast as sablotron (that was maybe 2 years ago). But I didn't test the memory consumption, it doesn't matter anyway unless your documents are really large (which mine usually are not, meaning less than 1 MB)

And last but not least: You heard about popoon? :) (it's not sax, but dom based, but anyway ...)

KnisterPeter, 07.12.2005 22:08 CET

I guess you are right about libxslt and libxml2. Anyway, I'll then have to build a wrapper around the XsltProcessor to get a SAX interface :)
This would be a bit slower than relying on a C implementation but should work.

I've heard about popoon and have already checked out the sources. :) It's nice, but not cocoon compatible. I want a system that reads an unmodified cocoon sitemap (forrest or lenya sitemap) and builds up that page.
But I don't want to use Java but PHP, for simpler prototyping. If my system is compatible with the Java stuff, I have a really good prototype engine. Currently that's the whole reason.

Roger Espinosa, 02.03.2006 19:50 CET

Is there any work being done to support the xsltproc "do XInclude processing on document input" feature? The include options seem directed toward the DOM handling, but not the XSLT Processor itself...

Mike, 30.05.2006 07:34 CET

what about some xml-diff?
if not, any solutions out there that work with PHP?

Lukas, 20.07.2006 10:08 CET

I think support for XML Schema manipulation would be very useful. I'm thinking of something with functionality similar to the classes available on the .NET platform in the System.Xml.Schema namespace.

kay, 31.10.2006 23:13 CET

Hi all,
one big question to you specialists on the list:

i often have the following case within xsl-transformation:

within an xml file ("an" here because of eXtensible ?-)) i want to conditional reference an "outside value" within xsl template traversal:

<fielddef name="assetid" access="readonly"/>


<field name="assetid" value="something protected"/>




then i need to do xsl-traversal the records:
<...>
<xsl:variable name="name" select="@name"/>
<xsl:if test="/app/fielddef[@name]='$name'>....
<...>

and this does NOT work (compile error)
xpath-exp can't be based on "variable strings".
it has to be a predefined constant.
That is terribly annoying, as the only solution i found is doing this with a 2pass xsl:
i create a temp xsl via xsl transformation - the complete output document which has additional <xsl:...> with "constant" xsl-expressions from the output of the "variable" xsl output from 1st pass.
then i run the 2nd pass with the result of pass1 against the original xml data.

you get headache during troubleshooting ;-)


even more its all more complex as i do xslt also on the ajax client side, currently with sarissa xml/xsl.

So guys, do you know another way to use indirect xpath expressions within xslt?

thank you so much!
Kay - germany
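
For what it's worth, XSLT 1.0 does allow variable references inside predicates; what it cannot do is evaluate a whole XPath expression held in a string. In the test above, $name is quoted as a string literal and compared outside the predicate. A sketch of the single-pass form, assuming an <app> root (taken from the XPath above) and a hypothetical <record> wrapper, since the original wrapper elements did not survive in the comment:

```php
<?php
$xml = '<app>'
     . '<fielddef name="assetid" access="readonly"/>'
     . '<record><field name="assetid" value="something protected"/></record>'
     . '</app>';

// Nowdoc (<<<'XSL') so that PHP does not try to interpolate $name.
$xsl = <<<'XSL'
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <xsl:for-each select="//field">
      <xsl:variable name="name" select="@name"/>
      <!-- the variable is referenced unquoted, inside the predicate -->
      <xsl:if test="/app/fielddef[@name = $name]/@access = 'readonly'">
        <xsl:value-of select="$name"/>
        <xsl:text> is read-only</xsl:text>
      </xsl:if>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
XSL;

$doc = new DOMDocument();
$doc->loadXML($xml);
$style = new DOMDocument();
$style->loadXML($xsl);

$proc = new XSLTProcessor();
$proc->importStylesheet($style);
$result = $proc->transformToXml($doc);
echo $result, "\n";
```

If many lookups are needed, xsl:key with the key() function is the usual alternative to repeating the predicate.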

Rhett, 21.05.2008 13:38 CET

simpleXML needs removeChild(). It's a pain to have to switch to DOM and add [4] lines of code to make that happen.
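
A sketch of the DOM detour described here, on a made-up document, together with the shorter unset() form that SimpleXML also accepts:

```php
<?php
$xml = simplexml_load_string('<list><item id="1"/><item id="2"/><item id="3"/></list>');

// Variant 1: switch to DOM for the removal; both views share the same tree.
$node = dom_import_simplexml($xml->item[0]);
$node->parentNode->removeChild($node);

// Variant 2: unset() on the indexed child removes it directly.
unset($xml->item[0]);   // after variant 1 this is the former second item

$remaining = count($xml->item);
$lastId    = (string) $xml->item[0]['id'];
echo $remaining, ' item left, id=', $lastId, "\n";
```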
