Jackalope observation and import repository dumps

Daniel Barsotti and I reviewed the observation API of PHPCR and decided to implement only the retrieval of the observation journal. The journal records all add, remove and update operations that happened on a PHPCR repository. You can filter it by event type, path, node type and other criteria. This way, PHPCR can almost act as a message queue (but only almost, as there is no guaranteed delivery of messages).

use PHPCR\Observation\EventInterface; // Contains the constants for event types

// Get the observation manager
$workspace = $session->getWorkspace();
$observationManager = $workspace->getObservationManager();

// Get the unfiltered event journal and go through its content
$journal = $observationManager->getEventJournal();
$journal->skipTo(strtotime('-1 day')); // Skip all events older than one day
foreach ($journal as $event) {
    // Do something with $event (it's a Jackalope\Observation\Event instance)
    echo $event->getType() . ' - ' . $event->getPath() . PHP_EOL;
}

// You can filter the event journal on several criteria.
// Here we are only interested in added nodes and added properties.
$journal = $observationManager->getEventJournal(
    EventInterface::NODE_ADDED | EventInterface::PROPERTY_ADDED
);

foreach ($journal as $event) {
    // Do something with $event
    echo $event->getType() . ' - ' . $event->getPath() . PHP_EOL;
}

The PHPCR / JCR standard also defines how to attach event listeners. The problem is that in any situation with concurrent repository access, the implementation needs to poll for events in order to trigger the listeners. We assume that the JCR use case was a long-running Java application that updates some local state by polling for events in a separate thread. In PHP, you typically have neither long-running processes nor a thread system that could do the polling. We decided that the journal covers the most important use case: a cronjob that looks for specific events to act upon, instead of searching the whole repository on each run to determine whether something changed that it needs to handle.
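As a rough sketch of that cronjob idea, the helper below walks a journal and hands each event to a handler registered for its type. The function name dispatchJournal and the handler map are hypothetical and not part of PHPCR or Jackalope. The helper only assumes that events expose getType() and getDate() as defined by the PHPCR event interface, and that getDate() returns an integer.

```php
<?php

/**
 * Dispatch journal events to per-type handlers.
 * Hypothetical helper for illustration, not part of PHPCR or Jackalope.
 *
 * @param iterable   $journal  yields objects exposing getType() and getDate()
 * @param callable[] $handlers event type => callable that receives the event
 * @param int        $since    date of the newest event seen by a previous run
 *
 * @return int date of the newest event seen, to be stored for the next run
 */
function dispatchJournal(iterable $journal, array $handlers, int $since = 0): int
{
    $latest = $since;
    foreach ($journal as $event) {
        $type = $event->getType();
        if (isset($handlers[$type])) {
            // Only event types with a registered handler are acted upon
            $handlers[$type]($event);
        }
        // Remember the newest date so the next run can skip what was handled
        $latest = max($latest, $event->getDate());
    }

    return $latest;
}
```

In a cron script, you would pass the result of $observationManager->getEventJournal() as $journal, use the EventInterface constants as keys of the handler map, and call skipTo() with the value stored by the previous run before dispatching. Where that value is stored between runs (a file, a database row) is up to the application.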

Additionally, we implemented Session::importXML to import XML data into the repository. You can import both JCR system view documents, which are an exact dump of the repository, and general XML documents, where elements are translated to PHPCR nodes and attributes to PHPCR properties.
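A minimal sketch of how an import could look is shown below. The wrapper function importDump and its default parent path are hypothetical. The call itself assumes the PHPCR signature importXML($parentAbsPath, $uri, $uuidBehavior) and the IMPORT_UUID_CREATE_NEW constant from PHPCR\ImportUUIDBehaviorInterface, so check both against the PHPCR version you use.

```php
<?php

use PHPCR\ImportUUIDBehaviorInterface;
use PHPCR\SessionInterface;

/**
 * Import an XML file below the given parent node and persist the result.
 * Hypothetical wrapper for illustration, not part of PHPCR or Jackalope.
 */
function importDump(SessionInterface $session, string $file, string $parentPath = '/'): void
{
    // Assumption: incoming referenceable nodes get fresh UUIDs instead of colliding
    $session->importXML(
        $parentPath,
        $file,
        ImportUUIDBehaviorInterface::IMPORT_UUID_CREATE_NEW
    );

    // Changes made through the session stay transient until they are saved
    $session->save();
}
```

The same call is meant to work for a system view dump and for a general XML document. The other constants of ImportUUIDBehaviorInterface control what happens when an imported UUID already exists in the workspace.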

Comments [2]

Lukas, 28.03.2012 12:13 CET

Yeah for the listener thing we would likely want to look towards some message queue. Not sure if there is a decent library that abstracts this .. but I fear it might be on us to define the necessary interfaces etc .. for most users I would expect ZeroMQ will likely be the tool of choice, but I wouldn't want to force that on people.

Maybe Alvaro, our resident MQ guru, can give us some hints here.

david, 28.03.2012 12:23 CET

for normal php applications, i don't see the use case of the non-listener model. unless you plan to do long-running php processes and have a threaded php engine with server sockets, it will be polling in some sort or another, whatever you do.

and when you write a separate application (symfony command for example) then you can just as well loop through the journal and dispatch instead of attach listeners.
