Adobe CQ5 Developer Training

I just spent the past week in a developer training course for Adobe Communiqué 5.4 – a content management system on steroids. I thought I’d jot down some of my thoughts while they’re fresh in my mind.

CQ5 is a Java-based CMS built around the JSR-283 (Java Content Repository) spec, which essentially defines a sophisticated object database that is indexed by Lucene for easy searching and cross-referencing of objects. CQ5’s JCR implementation is called CRX, but there is also an open source reference implementation named Apache Jackrabbit if you have an allergy to commercial software.

It is not entirely correct to call the JCR an object database as it isn’t used to store Java objects directly – but the fact that it defines a tree of nodes and that all content is stored and accessed in a hierarchical fashion makes its use very similar to that of an object database. As such, it is natural to draw comparisons with Zope and its object database, the ZODB.

JCR vs ZODB

Zope, a Python-based application framework, is radically different from the traditional relational database model of web application development. The ability to store Python objects directly in the database and have them indexed solved many development problems, but it also created a few problems that would make maintenance of an ever-changing web application more difficult. Namely:

  1. When you make changes to a class, it can break all of the existing objects of that class in the database (you need to run a migration).
  2. If you try to load an object whose class definition can’t be found the system barfs.

This problem of class versions, managing upgrades of content types, etc., was the single biggest problem with developing on Zope – and while I’m sure that there are best practices to work around it, I believe that the JCR solution of storing content nodes but not actual objects is a much cleaner way of handling content.
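The second failure mode is easy to reproduce with Python's pickle module, which is what the ZODB uses to serialize objects. This is only a rough sketch and not ZODB code: the module name `toy_content_types` and the `Article` class are made up for illustration, and a plain dict dumped as JSON stands in for a JCR-style content node.

```python
import json
import pickle
import sys
import types

# Hypothetical module standing in for an installed product that defines a content class.
toy_module = types.ModuleType("toy_content_types")
sys.modules["toy_content_types"] = toy_module


class Article:
    def __init__(self, title):
        self.title = title


# Register the class under the toy module so pickle records "toy_content_types.Article".
Article.__module__ = "toy_content_types"
Article.__qualname__ = "Article"
toy_module.Article = Article

# Object-database style: the stored bytes refer to the class by name.
stored_object = pickle.dumps(Article("Hello"))
# Node-and-properties style: the stored text is plain data.
stored_node = json.dumps({"type": "article", "title": "Hello"})

# Simulate uninstalling the product: the class definition disappears.
del toy_module.Article

try:
    pickle.loads(stored_object)
    object_loaded = True
except AttributeError:
    # pickle cannot find the class it needs to rebuild the object.
    object_loaded = False

# The plain node still loads without any class being present.
node = json.loads(stored_node)
```

The pickled object is unreadable once its class is gone, while the node data can still be loaded and inspected.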

The JCR stores a tree of content nodes, each of which has properties and its own child nodes. These structures translate well to different formats like XML (so you can dump entire branches of the repository as XML) and JSON – not so with a pure object database like the ZODB, whose structures can be far more complex and include dependencies on classes. Data in the JCR can always be browsed independent of the component libraries which may be loaded into the system. You can browse the repository using WebDAV, the web-based content explorer that is built into CRX (the JCR implementation that is packaged with CQ5), or using CRXDE (the Eclipse-based development environment that is freely available to developers).
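To make the "tree of nodes with properties" idea concrete, here is a toy model in Python. It is not the JCR API; the `Node` class and the node and property names are invented for illustration. It only shows why such a structure dumps so naturally to JSON and XML.

```python
import json
import xml.etree.ElementTree as ET


class Node:
    """Toy content node: a name, a dict of properties, and child nodes."""

    def __init__(self, name, **properties):
        self.name = name
        self.properties = properties
        self.children = []

    def add(self, child):
        self.children.append(child)
        return child

    def to_dict(self):
        # Properties and children share one mapping, keyed by name.
        data = dict(self.properties)
        for child in self.children:
            data[child.name] = child.to_dict()
        return data

    def to_xml(self):
        # Properties become attributes, children become nested elements.
        attributes = {key: str(value) for key, value in self.properties.items()}
        element = ET.Element(self.name, attributes)
        for child in self.children:
            element.append(child.to_xml())
        return element


site = Node("site", title="My Site")
home = site.add(Node("home", title="Welcome"))
home.add(Node("photo", caption="Sunset"))

as_json = json.dumps(site.to_dict(), sort_keys=True)
as_xml = ET.tostring(site.to_xml(), encoding="unicode")
```

Both dumps can be read back with nothing but a generic JSON or XML parser, which is the point: no class definitions are needed to browse the content.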

You can still define custom node types for your repository but this would merely dictate the name of the node type and perhaps which properties are required.

So, at first glance, this seems like a very stable base upon which to build web applications.

The Stack

The CQ5 stack looks like this:

  • WCM – The web content management layer, consisting of a bunch of flashy UI components built using the ExtJS JavaScript library (this part is proprietary).
  • Sling – HTTP server that makes it easy to read and write from the repository using HTTP requests. Very slick (this part is open source).
  • CRX – The content repository itself. Handles all permissions, storage, replication, etc… This part is proprietary. It performs the same function as Apache Jackrabbit, but includes a number of enterprise level improvements including a more powerful security model (I am told).

Author & Publish Deployment Instances

The recommended deployment is to have separate author and publish environments each running their own stack, and use the built-in replication feature to propagate authors’ changes to the publish instance whenever a piece of content is activated. This functionality, luckily, has been streamlined to hide most of the complexity. Workflow is built-in to allow you to activate each piece of content individually. Activation automatically triggers replication to the publish instance(s). This model seems to be very well suited to websites with few authors and many public viewers. It is scalable also, as you can add as many publish instances as you want to share the load.
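As a mental model of the author/publish split, here is a small Python sketch. The class names, method names, and paths are all invented, and it ignores everything that real replication has to deal with (queues, failures, permissions). It only shows that edits stay on the author instance until a specific piece of content is activated.

```python
class PublishInstance:
    """Toy publish instance: holds only content that has been activated."""

    def __init__(self, name):
        self.name = name
        self.content = {}


class AuthorInstance:
    """Toy author instance: edits stay local until a path is activated."""

    def __init__(self, publish_instances):
        self.content = {}
        self.publish_instances = publish_instances

    def save(self, path, body):
        self.content[path] = body

    def activate(self, path):
        # Activation replicates that one piece of content
        # to every publish instance.
        for publisher in self.publish_instances:
            publisher.content[path] = self.content[path]


publishers = [PublishInstance("publish1"), PublishInstance("publish2")]
author = AuthorInstance(publishers)

author.save("/content/site/news", "Draft text")
author.save("/content/site/about", "About us")
author.activate("/content/site/about")
```

After this runs, both publish instances have the "about" page but neither has the unactivated "news" draft. Adding capacity is just a matter of adding another entry to the `publishers` list.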

This standard flow of control (replicating changes from the author instance to the publish instances) leads me to wonder about cases where you do want the public to be able to interact with your site (e.g. through comments). We didn’t get into this scenario very much in the training, but, as I understand it, any content posted to a publish instance will go into an “outbox” for that instance, be replicated to the author instance, and await approval. Once approved, it will be re-replicated back to the publish instances.

Security Model

The security model is quite different than that of most systems. Rather than having security attached to content types (because there are no content types) like with a relational database, or defining a large set of permissions corresponding to each possible action in the system as Zope does, security is 100% attached to the nodes themselves. Each node in the JCR includes an ACL (access control list) which maps only a small set of permissions to each user. There are only a few possible permissions that can be assigned or denied on each node. Basically it boils down to permission to read, write, delete, create new, set permissions, and get permissions on a node level. If there are no permissions assigned to a user on a particular node, then it will use permissions from the node’s parent.
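The parent fallback described above can be sketched in a few lines of Python. This is a toy model, not the real JCR access control API: the `Node` class, the permission names, and the user names are invented, and two details are my own assumptions, namely that lookup happens per permission and that a permission not set anywhere up to the root is denied.

```python
class Node:
    """Toy node with a parent pointer and a per-user access control list."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        # user -> {permission: True (allow) or False (deny)}
        self.acl = {}

    def child(self, name):
        return Node(name, parent=self)

    def is_allowed(self, user, permission):
        # Walk up from this node until some node says something about
        # this user and permission; the nearest entry wins.
        node = self
        while node is not None:
            entry = node.acl.get(user, {})
            if permission in entry:
                return entry[permission]
            node = node.parent
        # Assumption: nothing set anywhere up to the root means deny.
        return False


root = Node("content")
root.acl["alice"] = {"read": True, "write": True}

site = root.child("site")
private = site.child("private")
private.acl["alice"] = {"write": False}
page = private.child("page")
```

Here `alice` can write to `site` (inherited from the root), cannot write to `page` (the deny on `private` is nearer), and can still read `page`. Where a node sits in the tree decides what applies to it.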

One implication of this security model is that you must pay attention to the content hierarchy when developing applications. You cannot treat this like a relational database!

This is important. I suspect that many developers coming from a relational database background will be tempted to try to merge the best of both worlds and create pseudo-content types in the system. After all, all properties in the JCR are indexed, so you could easily just add a property called ‘contentType’ to your nodes to identify them as a particular content type, then build functionality that allows users to add instances of this content type. You could then create view templates that aggregate these content types to treat them as a table. You could do this, but you must be aware that you don’t have the same level of control that you have in a relational database system over what a user can do with your content types.

If you are querying the repository solely based on a property on a node – and not based on the path – then you may be surprised by the results that you obtain. At the very least, the JCR security model, despite appearing to be simple, is actually far more difficult to implement than its relational cousin when you are trying to imitate the functionality of a relational database. You cannot control what properties are added to every node in the repository, so querying based on property values may produce undesirable results. Instead you have to fully embrace the hierarchical model of data and step very carefully when you try to import concepts from other paradigms, as they could cause you to inadvertently introduce security holes.
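Here is a small Python illustration of the difference between querying on a property alone and querying within a path. The repository is just a dict of made-up paths and properties, and the two helper functions are hypothetical; a real repository would use its query language for this.

```python
# Toy repository: path -> properties. All paths and values are invented.
repository = {
    "/content/site/articles/a1": {"contentType": "article", "title": "Launch"},
    "/content/site/articles/a2": {"contentType": "article", "title": "Update"},
    "/content/site/comments/c1": {"contentType": "article", "title": "Spoofed"},
    "/home/users/mallory/notes": {"contentType": "article", "title": "Junk"},
}


def find_by_property(repo, name, value):
    """Match on a property alone, wherever the node lives."""
    return sorted(
        path for path, props in repo.items() if props.get(name) == value
    )


def find_under_path(repo, root, name, value):
    """Match on the property, but only below one subtree."""
    prefix = root.rstrip("/") + "/"
    return sorted(
        path
        for path, props in repo.items()
        if path.startswith(prefix) and props.get(name) == value
    )


everywhere = find_by_property(repository, "contentType", "article")
scoped = find_under_path(
    repository, "/content/site/articles", "contentType", "article"
)
```

The property-only query returns all four nodes, including the two that live in places where anyone with write access could have set `contentType` themselves. The path-scoped query returns only the two nodes under the subtree whose permissions you actually manage.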

Custom Content Types (Sort of)

While CQ doesn’t have custom content types, it does allow you to map content nodes to a set of rendering scripts, which produces something very much like a content type. By setting the “sling:resourceType” property on a node to the path to a “component” that you develop, you can dictate where CQ looks for scripts that are used to render the node when requests are made. Components can be either “page” components, which represent an entire page, or regular components, which are included inside a page.

You can register page components to show up in the list of types of pages that can be added by authors when they add a new page to the system. Similarly, you can register your regular components to show up in the “sidekick” (i.e. component palette) for authors when they are editing a page, so that they can be dragged onto a page. You can define which types of components are allowed to be parents or children of other components, and you can define which parts of the site are allowed to have a particular component type added.

The Component Hierarchy

You can also define a “resourceSuperType” for components to allow them to inherit from other components in the system. This is handy for code reuse as there are hundreds or thousands of existing components that can be overridden or extended. We ran through several exercises creating and extending components. I’m satisfied that this process is not difficult and quite powerful.
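The way a node's resource type and a component's super type combine can be modelled roughly like this in Python. It is a simplification of what Sling actually does when it picks a script, and the component paths and script names in the registry are only illustrative.

```python
# Toy component registry: component path -> its scripts and optional super type.
# Paths and script names are illustrative assumptions.
components = {
    "foundation/components/page": {
        "scripts": {"head", "body"},
        "resourceSuperType": None,
    },
    "myapp/components/homepage": {
        "scripts": {"body"},  # overrides only the body script
        "resourceSuperType": "foundation/components/page",
    },
}


def resolve_script(resource_type, script):
    """Return the component path that supplies `script`, walking up super types."""
    current = resource_type
    while current is not None:
        component = components[current]
        if script in component["scripts"]:
            return current
        current = component["resourceSuperType"]
    return None


# A content node only points at its component; it holds no rendering code.
node = {"sling:resourceType": "myapp/components/homepage"}

body_source = resolve_script(node["sling:resourceType"], "body")
head_source = resolve_script(node["sling:resourceType"], "head")
```

The `body` script comes from the custom component, while `head` falls through to the inherited one. That is the whole trick behind extending an existing component: supply only the scripts you want to change.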

Component Dialogs

A component without a dialog is really a lame duck. Users (especially authors) need to be able to interact with your components. E.g. if you create a photo album component, you need to allow your user to add photos to it. Adding dialogs is not difficult but I suspect that the development process is slated for improvements and more automation for future releases. The dialog forms are created entirely by creating appropriately named subtrees under your component’s node. E.g. you would create a child node of a particular type named “dialog”, which contains a child node named “items”, which contains a subnode named “tabs”, etc… 6 or 7 layers deep.

Each tab, each widget, each panel, is represented by a node in the repository. This is clever but somewhat tedious. It is like building a UI using only the UI hierarchy tree in the left panel of the IDE without the visual editor. I suspect that future versions will probably include a proper WYSIWYG UI editor for developing these dialogs but for now this manual system will have to do.
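To give a feel for the nesting, here is a dialog subtree written out as nested Python dicts. The "dialog", "items", and "tabs" names follow the description above; everything below "tabs" (the tab name, the widget, and its `xtype`, `fieldLabel`, and `name` properties) is an assumption for illustration and not copied from a real component.

```python
# Nested dicts standing in for repository nodes. Keys that map to dicts are
# child nodes; everything else is a property on the node.
dialog = {
    "dialog": {
        "items": {
            "tabs": {
                "items": {
                    "photos": {  # one tab (hypothetical)
                        "title": "Photos",
                        "items": {
                            "caption": {  # one widget (hypothetical)
                                "xtype": "textfield",
                                "fieldLabel": "Caption",
                                "name": "./caption",
                            },
                        },
                    },
                },
            },
        },
    },
}


def depth(node):
    """Count the levels of nodes from this node down to its deepest descendant."""
    children = [value for value in node.values() if isinstance(value, dict)]
    return 1 + max((depth(child) for child in children), default=0)


def widgets(node, path=""):
    """Yield (path, xtype) for every node that declares an xtype."""
    for name, value in node.items():
        if isinstance(value, dict):
            child_path = f"{path}/{name}"
            if "xtype" in value:
                yield child_path, value["xtype"]
            yield from widgets(value, child_path)


levels = depth(dialog["dialog"])
fields = list(widgets(dialog))
```

A single text field already sits seven levels down in this sketch, which is why building dialogs by hand in the tree feels tedious.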

Despite the tediousness of the process, in the scheme of things it is still quite efficient. In only a few minutes you can produce a multi-tab, multi-field UI with rich widgets that allows your users to add and edit a myriad of content types on your site.

TestDisk: a Nifty Utility for fixing drives with bad boot sectors

Just ran into an interesting problem with an external hard drive that was being used as a Time Machine backup for a laptop. Someone tried to connect this drive to their Windows machine and it evidently screwed up the boot bits, so not only would Windows not recognize it, Macs wouldn’t recognize the disk either.

Tried running it through Disk Utility but received a message saying “Disk cannot be repaired.”

So I loaded up TestDisk and took it for a spin. Here is a photo gallery outlining the steps that I took.

Left vs Right by way of a “Lion” analogy

Most political systems seek to work for the betterment of society. Most include a claim to a goal of protecting the weak.

As an analogy let’s consider how different political ideologies would handle the case of an endangered wild species like “a lion”. These wild animals are “the weak” in the sense that they are completely subject to the whim of man. They are unable to defend themselves in a world dominated by humans. Hence, they are weak, and as benevolent stewards of the world it is our noble responsibility to protect the individuals of this beautiful species. The question which the various ideologies differ on is “how”?

The communist would decide that the way to protect the lion is to provide for its every need. To guard against starvation, he will provide food for the lion served in a dish. To guard against the ravages of nature, he will provide shelter for the lion. No more will the lion want for food or fear the cold rain. His every need will be provided for. And in order to protect the lion from his inability to know what’s best for him, he will enclose the shelter in a cage. That way the lion will be protected completely from enemies, nature, and himself.

For simplicity’s sake we’ll just say that the communist solution for protecting the lion is to place him in a zoo.

On the other end of the political spectrum we have the laissez-faire capitalists. They are also concerned with protecting the lion, but they sneer at the communist solution, seeing that, while preserving the lion’s existence, it would strip the lion of everything it means to be a lion. Lions are hunters, and by God, they should be allowed to hunt for their food and provide their own shelter. In the capitalist’s mind, the best way to help the lion is to not help the lion. A lion who fends for himself and decides his own future will be a stronger, more successful lion – and will produce stronger and more successful offspring.

A slight complication comes into play when lions get too strong and numerous. A weak, lone lion, when faced with human development in his back yard, will have no choice but to retreat to other uninhabited areas. A pride of strong and successful lions, however, may pose problems to their benevolent capitalist dictators if they refuse to move and instead use their instinctive predatory skills, developed through the generations, to defend their homes. When it comes to this, the capitalist must work swiftly to ensure the safety and security of the society as a whole. It becomes an imperative to constrain the lions. Relocation to a different area works only for a while, as these persistent lions keep finding their way back to the contested territories.

The benevolent capitalist knows that the lion who stays near a human settlement is a danger to the people – and will ultimately end up being shot. So he enacts policies to protect both the lion and the people. Any lion found near a settlement will be tranquilized, tagged, and sent to a safe place where they can no longer trouble humans.

This policy works for a while, but as settlements grow, encounters with lions become more frequent – and even inevitable. In order to avert any future confrontations with lions, the only logical solution is to seek out all lions – even the ones residing far from the human settlements – and relocate them also into captivity. This preemptive doctrine makes sense to the prominent thinkers of society as it seems to address all of the security needs of society. But it creates a logistical problem since there aren’t enough cages built to house all of the lions. What’s more, it turns out that they don’t have the resources to feed all of the lions once captured.

In the face of a potential massacre, some of the leaders go to the people and request a small tax increase in order to pay for the care and handling of the lions. The request makes some headway until it is pointed out that taking care of all of the lions in zoos is exactly what the communists and socialists do with their lions. The mere thought of sharing any characteristics with a communist regime is too much for this right-wing society to handle so the proposal is rejected.

But the problem still remains: what to do with all of these lions? They don’t have the resources to cage them, and since this society fancies itself a defender of the weak, a massacre is out of the question. So they decide to relocate all of the lions to a desolate island in the north. Due to the climate and a previous nuclear disaster, this island is undesirable to the humans of society. This, coupled with the fact that lions can’t swim, would ensure that the lions pose no more threat to humans.

This solution has the virtue of following good, old-fashioned conservative principles. The future of the lions is left in their own hands. The strong will survive and the weak will abide by the blessed laws of natural selection and cease to survive of their own free will. Society is able to move on in peace and security.

However, when it is reported 10 years later that the population of lions has dwindled to endangered levels due to lack of food supply, inhospitable conditions, and disease caused by the contaminated terrain, society decides to act fast to try to preserve the species. So they send an expedition to the island of Elba to collect a sampling of lions and transport them to a safe place, known as a zoo. There their needs are met for the purpose of preserving their existence for future generations to enjoy.

So, for simplicity’s sake we’ll just say that the capitalist solution for protecting the lion is to place him in a zoo.