A Year in the Desert of AEM: An Introduction

Let me start with a brief introduction to both myself and the purpose of this series. When I started my new position as a burgeoning Systems Administrator, I was told I would be working with Adobe Experience Manager (AEM, then called CQ), a product I had never heard of. I also had to use Amazon Web Services (AWS), a cloud platform I had never worked with before. To say I felt enormous pressure to grow and perform quickly would be an understatement. The vast AEM environment felt like nothing short of a desert wasteland, and I was being called to navigate it. On top of this, there was quite a bit of standardization and automation that needed to be hashed out.

Thankfully I wasn’t completely alone; I had a new coworker who had been stranded in this wasteland in much the same way I was. There was also a team lead who was eternally buried behind the workload that my coworker and I had come to help tackle. I first started conceptualizing this series of articles after my first year working with AEM, thinking to myself, “What are some of the things I wish I had known from the start?”

That said, the purpose of this series of articles is to introduce the administrative side of AEM. I will also impart some of the information I have picked up along the way. I will be splitting the remaining articles up as follows: Author, Publish, Dispatcher. Each of these articles will cover different pieces of information regarding the AEM stack. But before we venture too far from civilization as you know it, we need to prepare for our journey by covering some basics.


You see, I realized as I started writing up information about the different environments that there is quite a bit of other introductory information on AEM in general that might be helpful to point out. For example, you may sometimes read or hear references to the Author environment; this can get a little confusing since we typically think of our environments in terms of “production”, “pre-production”, “development”, etc. “Author” is just one specific responsibility an instance can be assigned in AEM. So we will put a pin in the environments/responsibilities and come back to those later.

The Desert of AEM
The AEM landscape is nothing short of extensive. Like a desert, which is enormous in size and difficult to navigate, this robust content management system with integrations to the rest of the Adobe Marketing Cloud takes time and research to properly cross. AEM is built around a few open source projects with some proprietary code to hold it all together. We at Axis41 have found that AEM, due to its size and complexity, is many different things to different audiences. When looking at it from the SysOps point of view, we usually focus on two specific conceptual models: how it is delivered as a piece of software, and how it executes as a platform.

Let’s dive into the technology a little bit more. Like an onion, AEM has several layers in its Software Delivery Model. At its core are implementations of OSGi (Apache Felix) and the JCR (Apache Jackrabbit). These are then connected with Apache Sling, which adds some features of its own, primarily a RESTful framework, scripting engines, and an authentication layer. This is then wrapped up with Adobe’s CRX, which adds some proprietary features such as the Author/Publish model, replication support, and tools for management and development, just to name a few. The next layer is the actual Adobe Experience Manager, which brings with it additional management tools for Campaigns, Websites, Assets, Tags, and Workflows.

Moving on, we will take a high-level look at the execution model for AEM. The OS executes AEM as a Java application that runs in one of two modes, Author or Publish. Author is the main interface that content creators and developers interact with. Publish is the “work-camel” of the AEM suite; it takes all the dynamic content and generates static data ready to be delivered to browsers. Because AEM is a Java application, starting it launches a JVM with the flags you have defined. The JVM executes Felix (OSGi), which in turn executes Jackrabbit (JCR), Sling, and the WCM.
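To make the execution model a little more concrete, here is a minimal sketch of how a start command for an instance might be assembled. The function name and the jar path are hypothetical. The sketch assumes the run mode is passed through the `sling.run.modes` system property and the port through the quickstart `-p` option, as the stock start script does, so check both against the version you are running.

```python
def build_launch_command(jar_path, run_mode, port, heap_mb=2048):
    """Assemble the argument list used to start one AEM instance.

    Hypothetical helper for illustration only; jar_path is whatever
    your quickstart jar happens to be called.
    """
    if run_mode not in ("author", "publish"):
        raise ValueError("run_mode must be 'author' or 'publish'")
    return [
        "java",
        f"-Xmx{heap_mb}m",                 # JVM flag: maximum heap size
        f"-Dsling.run.modes={run_mode}",   # assumed: run mode as a system property
        "-jar", jar_path,                  # everything after the jar goes to AEM
        "-p", str(port),                   # assumed: quickstart port option
    ]
```

JVM flags come before `-jar`, and anything after the jar path is handed to the application rather than to the JVM. The resulting list can be passed to `subprocess.run` or joined into a shell command.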

Something I have noticed, especially with production environments, is that AEM tends to struggle if you allocate less than 2 GB of memory to the JVM. The JDK and Java version you use also matter. For AEM versions earlier than 6.0 SP2, you will not be able to use Java 8, and OpenJDK is unsupported.
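A quick sanity check on the heap setting can catch an undersized instance before it goes live. The sketch below is an illustration only: it parses the standard `-Xmx` JVM flag out of a list of flags and compares it to the 2 GB figure mentioned above. The function names are made up for this example, and the threshold is a rule of thumb from my experience, not an official limit.

```python
import re

MIN_HEAP_BYTES = 2 * 1024 ** 3  # the 2 GB rule of thumb from the text

_UNITS = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def max_heap_bytes(jvm_flags):
    """Return the -Xmx value in bytes, or None if no -Xmx flag is present."""
    result = None
    for flag in jvm_flags:
        match = re.fullmatch(r"-Xmx(\d+)([kKmMgG]?)", flag)
        if match:
            # If -Xmx is repeated, the last occurrence is the one kept here.
            result = int(match.group(1)) * _UNITS[match.group(2).lower()]
    return result


def heap_is_sufficient(jvm_flags):
    """True only when -Xmx is set and is at least MIN_HEAP_BYTES."""
    heap = max_heap_bytes(jvm_flags)
    return heap is not None and heap >= MIN_HEAP_BYTES
```

For example, `heap_is_sufficient(["-server", "-Xmx1024m"])` is false, while the same call with `-Xmx2g` is true. A missing `-Xmx` is treated as insufficient, since the JVM default depends on the host.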

Planning Your AEM Expedition
Before undertaking your desert excursion, you ought to be certain that everything is in order, like the amount of food and water that you have and the location of various checkpoints along your way. In the same way, you need to know ahead of time how AEM communicates with the other servers in the stack. It is important to note that the settings I describe here are the defaults and the recommended ones. They can be changed, but you really should have a very good reason before you do so, if for no other reason than to avoid confusion later on. The image below shows the various security zones and the ports AEM uses, which will need to be whitelisted.


In a standard setup, you would have your content creators accessing the Author over port 4502 from an internal zone (or possibly through an Author-facing Dispatcher). Then you would have a zone for the Publish, which would be open on port 4503 to communication from the Author and the Dispatcher. Lastly, you would have a zone for the Dispatcher that allows communication on ports 80 and 443 from the Author, the Publish, and either the outside world or, in a high-availability setup, the Load Balancer.
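The zones and ports described above can be written down as a small lookup table, which is handy when reviewing firewall or security group rules. This is a sketch only: the zone names (`internal`, `load_balancer`, `internet`) and the function are my own labels for illustration, and the table covers just the default ports mentioned in this article.

```python
# destination zone -> (allowed ports, allowed source zones)
ALLOWED_FLOWS = {
    "author": ({4502}, {"internal"}),
    "publish": ({4503}, {"author", "dispatcher"}),
    "dispatcher": ({80, 443}, {"author", "publish", "load_balancer", "internet"}),
}


def is_allowed(source, destination, port):
    """Return True if traffic from source to destination on port is expected."""
    ports, sources = ALLOWED_FLOWS.get(destination, (set(), set()))
    return port in ports and source in sources
```

With this table, `is_allowed("internal", "author", 4502)` is true, while `is_allowed("internet", "publish", 4503)` is false, because the outside world should never reach the Publish directly. In a high-availability setup you would drop `internet` from the Dispatcher entry and leave only the Load Balancer.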

Something else you may want to be aware of is that the Publish server has a passive role, meaning it does not request information from either the Author or the Dispatcher. The Author will either push changes to or pull updates from the Publish. The Dispatcher either requests data from or sends data to the Publish. The only outside communication the Publish initiates is a simple request letting the Dispatcher know its cached content is out of date. So the Publish is really just a go-between pulling double duty: first as a rendering host, and second as a content buffer that allows content creators to make changes without them being viewed by the general public.

The AEM Caravan
Now that your travels have been mapped out, it’s important to know what your days will look like. Similarly, it’s imperative to understand how normal AEM operations work from day to day. Although this next part might sound like it’s straight from Abbott and Costello’s “Who’s on First” skit, it will give a good overview to use as a reference later on.

A content author signs into the Author instance and makes updates to a site page. The page is then activated, which can trigger a workflow for the content to be reviewed by another party. Once the new content is approved, it is pushed out to the Publish instances in the stack. These Publish instances, after ingesting the changes, then send an invalidation request to the Dispatchers letting them know the page has been updated. Lastly, an end user sends a request through the Load Balancer to view the page that has been changed. The Dispatcher recognizes that the requested page was invalidated and requests a new version from the Publish. The Dispatcher caches the updated page and serves the request back to the end user. The image below may help visualize this process.
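The flow above can also be played out in a few lines of code. What follows is a toy model, not how AEM or the Dispatcher is actually implemented: the class and method names are invented for this illustration, the cache is a plain dictionary, and the review workflow and Load Balancer are left out. It only shows the order of events: content arrives on the Publish, the Publish invalidates the Dispatcher, and the next request refills the cache.

```python
class Publish:
    """Toy Publish instance: stores pages and notifies Dispatchers."""

    def __init__(self):
        self.pages = {}
        self.dispatchers = []
        self.render_count = 0

    def receive(self, path, html):
        # Content pushed from the Author after activation.
        self.pages[path] = html
        # The only outbound call the Publish makes: an invalidation request.
        for dispatcher in self.dispatchers:
            dispatcher.invalidate(path)

    def render(self, path):
        self.render_count += 1
        return self.pages[path]


class Dispatcher:
    """Toy Dispatcher: serves from cache, refilling from the Publish on a miss."""

    def __init__(self, publish):
        self.publish = publish
        self.cache = {}
        publish.dispatchers.append(self)

    def invalidate(self, path):
        self.cache.pop(path, None)

    def serve(self, path):
        if path not in self.cache:
            self.cache[path] = self.publish.render(path)
        return self.cache[path]
```

If you create a `Publish`, attach a `Dispatcher`, and call `receive` followed by two `serve` calls for the same path, the Publish renders the page only once. A second `receive` for that path clears the cached copy, so the next `serve` goes back to the Publish for the new version.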


With this information out of the way, you should be somewhat versed in the basics, which will help with further discussions of AEM. While none of this is extremely complex, it will better prepare you for the journey we will be taking together. You’re now armed with what you need to begin the AEM journey as we venture into the great expanse and take a look at the various roles that exist within AEM. Our first stop is the Author.

If you have questions or comments feel free to email [email protected].