Content storage done right

Jackalope and PHPCR have been a recurring topic on this blog. Back in 2009 we here at Liip began exploring the possibility of integrating Jackrabbit, the reference implementation of the Java Content Repository (JCR) specification, with PHP. The vision was twofold: first, we wanted to make it possible to interact directly with content stored in Adobe CQ (called Day Communiqué at the time) or Magnolia. Second, we felt it would be a great asset to the PHP CMS world to be able to leverage all the power of JCR from PHP, hence PHPCR. The initial attempts used the Zend Java Bridge to communicate directly from PHP to Java. Eventually, however, we realized that it would be more feasible to use the native HTTP API provided by Jackrabbit. But things only really took off when the Symfony CMF initiative decided to adopt our work. Now, four years later, we finally have the first stable releases of PHPCR, Jackalope and the Hibernate-inspired object mapper PHPCR ODM.

It is PHP

The gut reaction of many PHP developers when faced with Java is usually a fear of factory factories, endless XML configuration and deep class structures. This sentiment is perhaps best illustrated by the famous saying: Java is a DSL for taking large XML files and converting them to stack traces. At the same time, perceptions have changed. Lucene is the basis for Solr and ElasticSearch, the go-to full-text search engines for most PHP developers. In fact, pretty much every PHP CMS defers to one of these solutions when it comes to searching larger data sets. By the way, Lucene is integrated into Jackrabbit right out of the box. Furthermore, many PHP developers have realized that there is in fact value in decoupled architectures and design patterns, even though these do tend to result in more complex class structures. The benefits are reusability, testability and the fact that each unit of code is much more approachable on its own.

Of course this does not mean that the PHP platform has no value beyond its ubiquity, which in October 2013 meant that PHP was used on 80% of all domains. The conclusion is that there is no need to jump ship from PHP, but there is value in bringing outside ideas to the platform.

Obviously, porting JCR to PHPCR cannot be done as a one-to-one mapping. PHPCR therefore leverages the fact that PHP provides associative arrays where JCR has to rely on more unwieldy object structures. And most importantly for making PHPCR relevant in the real world, there is also an implementation written purely in PHP that connects to an RDBMS using Doctrine DBAL, i.e. it is possible to use PHPCR without running any Java at all.

It is community

Throughout this entire effort Liip has done a significant portion of the work. However, people from the community have always been involved as well, and this was the goal from the very start. At this point one can therefore state with confidence that this is a community effort and that development is no longer driven entirely by Liip. This includes reporting bugs, but also fixing bugs, improving or adding features, writing tests and of course also documentation. I want to briefly highlight some key contributors:

  • Karsten for his initial work on the PHPCR interfaces which we adopted
  • Uwe and Johannes for becoming the first non-Liipers to make significant improvements to Jackalope and PHPCR ODM
  • Benjamin for laying the groundwork for Jackalope Doctrine DBAL
  • Dan for his work on the PHPCR ODM Query Builder

What is next?

This release of PHPCR provides compatibility with JCR 2.1. Jackalope, the reference implementation of PHPCR, integrates with Jackrabbit and Doctrine DBAL. The next steps consist of completing some of the optional features and further improving performance. For example, it would be interesting to make it possible to use Solr/ElasticSearch in combination with the Doctrine DBAL implementation. Other features we are looking forward to are improved logging and caching capabilities. We are also looking forward to work picking up again on the MongoDB implementation, and we are keeping an eye on the next major version of Jackrabbit, code-named Oak. In fact, we have already tested compatibility with the current releases together with the Adobe engineers. But above all we are looking forward to people adding PHPCR to their applications wherever they feel they can benefit from a storage solution that provides unstructured content in a tree structure with support for node types, binaries, versioning and full-text search. Case in point: we are looking forward to the imminent release of Symfony CMF.

Comments [4]

Hannes Gassert, 12.10.2013 09:52 CEST

Congratulations! Looking forward to seeing this have an impact out there in the market!

Bruce Weirdan, 14.10.2013 03:12 CEST

Are there any benchmarks available for PHPCR with either DBAL or Jackalope backends?

Lukas, 14.10.2013 08:17 CEST

No, unfortunately there haven't been any benchmarks in a while, so there is no data of any relevance available. In general I would assume that lookups by path and UUID should perform quite well with both Doctrine DBAL and Jackrabbit; however, Doctrine DBAL will scale miserably for search queries. I guess up to 100 documents it should be OK, maybe even 1000. I doubt that the performance would be acceptable beyond that. For Jackrabbit you can find a more detailed analysis of the search performance here http://blog.liip.ch/archive/2012/06/26/jackrabbit-and-its-two-sql-languages-some-findings.html. There is also some more information here http://wiki.apache.org/jackrabbit/Performance. Note especially the 10k child node performance issue. With PHPCR I would try to keep things even smaller, let's say 1k children, because on a node read we read the entire list of child node names.

A detailed benchmarking would be great though!

Hyh, 06.12.2013 11:42 CEST

So I'll be waiting for some new information.
