Dispatcher


Performance is a major consideration when developing websites. For AEM projects, Adobe provides the Dispatcher, a caching and/or load balancing tool. The Dispatcher caches cacheable site content on the web server, so repeated requests for the same URL reach the AEM rendering engine as rarely as possible. Placing the Dispatcher in front of AEM also helps protect your application server from attack. In short, the Dispatcher helps you build an environment that is both fast and dynamic.

Dispatcher role in AEM or its anatomy

The Dispatcher contains mechanisms to generate, and update, static HTML based on the content of the dynamic site. You can specify in detail which documents are stored as static files and which are always generated dynamically.

A good practice is to think about the Dispatcher and caching right from the start of your project. Make it an integral part of your application and content architecture.

Configuring the Cache – dispatcher.any

The dispatcher.any file consists of properties and their values, which can themselves hold several nested values. Every property name starts with a forward slash "/". Multi-valued properties enclose their values in braces "{}". The /name property gives the Dispatcher instance a name. The /farms property holds one or more farms, each of which is a set of properties and values.

By default the Dispatcher configuration is stored in dispatcher.any, though you can change the name and location of this file during installation.
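
As a rough illustration of the syntax described above, a minimal dispatcher.any skeleton might look like the following. The instance name "internet-server" and the farm name /website are placeholders chosen for this sketch, not values the Dispatcher requires.

```
# Name of this Dispatcher instance (placeholder value)
/name "internet-server"

# /farms is a multi-valued property, so its values sit inside braces
/farms
{
    # One farm; the name /website is an arbitrary, hypothetical choice
    /website
    {
        # farm properties such as /clientheaders, /virtualhosts,
        # /renders, /filter and /cache go here
    }
}
```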

What to Cache – the Rules Section

It is in the /rules section of the dispatcher.any file that you specify which documents are cached. If you do not have dynamic pages (beyond those already excluded by the Dispatcher’s own rules), you can let the Dispatcher cache everything.

By default the following requests are not cached by the Dispatcher:

  • requests that do not return HTTP status code 200
  • requests with suffixes
  • requests with request parameters (i.e. a "?" in the URL)
  • responses that opt out programmatically by sending an HTTP header: response.setHeader("Dispatcher", "no-cache");
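
If some pages are dynamic, a /rules section along these lines (placed inside /cache) would keep them out of the cache while still caching everything else. This is a sketch only: the path /content/mysite/search is hypothetical, and it assumes that when several rules match a request, the last matching rule decides.

```
/rules
{
    # cache everything by default
    /0000 { /glob "*" /type "allow" }

    # but never cache the (hypothetical) search result pages
    /0001 { /glob "/content/mysite/search*" /type "deny" }
}
```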

Adobe recommends splitting the Dispatcher configuration across multiple farms rather than keeping it in a single farm. You can then divide the logic for different sets of URLs or websites into separate farm files and include them in the parent /farms property.

The /farms property can contain one farm (if you want to handle all websites/URLs in the same manner) or multiple farms (when you want different handling for different sets of websites/URLs). Inside /farms, you can define a farm inline, include a farm defined in another file, or do both.

/farms
{
    # Each farm has a name of your choosing; this one is called techroomweb.
    # This is the first farm, defined in dispatcher.any itself.
    /techroomweb
    {
    }

    # This is the second farm, defined in the techroomweb.any file.
    $include "techroomweb.any"
}

Note: the Dispatcher evaluates the farms from bottom to top when it looks for the farm that matches a request.

In most configurations /clientheaders is the first property in a farm. Every HTTP request carries a set of request headers, and your application often needs them to determine crucial attributes of the incoming request. The /clientheaders property lists the headers that the Dispatcher passes on to the AEM instance.

/clientheaders
     {
      "referer"
      "user-agent"
      "authorization"
      "cq-action"
      "cq-handle"
      "handle"
       ….
       ….
       ….
       ….
     }

The /virtualhosts property defines a list of all hostname/URI combinations that Dispatcher accepts for this farm.

/virtualhosts
{

# "*" makes this farm handle all requests.
"*"
}

You may have two farms to handle separate kinds of requests.
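
For instance, two farms could be told apart by their /virtualhosts values, roughly as sketched below. The farm names and host names are hypothetical, and the remaining farm properties are left out for brevity.

```
/farms
{
    # farm for the corporate site (hypothetical host name)
    /corporate
    {
        /virtualhosts { "www.example.com" }
        # /renders, /filter and /cache for this site go here
    }

    # farm for the shop (hypothetical host name)
    /shop
    {
        /virtualhosts { "shop.example.com" }
        # /renders, /filter and /cache for this site go here
    }
}
```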

The /renders property defines where the Dispatcher sends the requests that it cannot serve from the cache. In almost all cases this is the host name or IP address of an AEM publish instance. A farm can have multiple renders, in which case the Dispatcher distributes the load across all of the AEM instances listed in the /renders property.

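/timeout is the connection timeout in milliseconds, that is, how long the Dispatcher waits when connecting to the AEM instance. How long a response may take is controlled separately by the /receiveTimeout property.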

/renders
{
    /render01
    {
        /hostname "localhost"
        /port "4503"
        /timeout "10000"
    }
}
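
To spread the load over several publish instances, the /renders section might list more than one render, as in this sketch. Both host names are hypothetical placeholders.

```
/renders
{
    /render01
    {
        /hostname "publish1.example.com"   # hypothetical host
        /port "4503"
        /timeout "10000"
    }
    /render02
    {
        /hostname "publish2.example.com"   # hypothetical host
        /port "4503"
        /timeout "10000"
    }
}
```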

Denying Access – The Filter Section

The Dispatcher is also commonly used to restrict external access to resources, and you need to be aware of this when coding your application. Using filters, you specify which requests the Dispatcher module accepts. All other requests are handed back to the web server, where they are offered to the other modules running there.

/filter
{
    # the glob pattern is matched against the first request line

    # deny everything and allow specific entries
    /0001 { /type "deny" /glob "*" }

    # open consoles
    # /0011 { /type "allow" /glob "* /admin/*" }   # allow servlet engine admin
    # /0012 { /type "allow" /glob "* /crx/*" }     # allow content repository
    # /0013 { /type "allow" /glob "* /system/*" }  # allow OSGi console

    # allow non-public content directories
    # /0021 { /type "allow" /glob "* /apps/*" }    # allow apps access
    # /0022 { /type "allow" /glob "* /bin/*" }
    /0023 { /type "allow" /glob "* /content*" }    # disable this rule to allow mapped content only
    # /0024 { /type "allow" /glob "* /libs/*" }
    # /0025 { /type "allow" /glob "* /home/*" }
    # /0026 { /type "allow" /glob "* /tmp/*" }
    # /0027 { /type "allow" /glob "* /var/*" }
}
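
Newer Dispatcher releases also let a filter rule match individual parts of the request line, such as /method and /url, instead of a single /glob, and Adobe's documentation discourages globs in filters. The sketch below assumes your Dispatcher version supports these properties; the rule numbers and paths are illustrative only.

```
/filter
{
    # deny everything first
    /0001 { /type "deny" /url "*" }

    # allow GET requests below /content
    /0002 { /type "allow" /method "GET" /url "/content*" }

    # block JSON renditions below /content again
    /0003 { /type "deny" /url "/content*.json" }
}
```

Because the deny rule for JSON comes last, it takes precedence over the broader allow rule for any request both of them match.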

The /cache property and its sub-properties define how the Dispatcher caches documents and pages.

/cache
{
    # Cache configuration
    /rules
    {
        /0000
        {
            /glob "*"
            /type "allow"
        }
    }

    /invalidate
    {
        /0000
        {
            /glob "*"
            /type "deny"
        }
        /0001
        {
            /glob "*.html"
            /type "allow"
        }
    }
}

The main sub-properties of /cache are:

/cache
{
    /docroot            # The directory where cached files are stored. It must be the same path as the document root of the web server.
    /statfileslevel     # The folder depth, counted from the docroot, down to which files named ".stat" are created.
    /serveStaleOnError
    /allowAuthorized    # Setting the value to "1" lets the Dispatcher cache documents that were requested with authentication.
    /rules              # Selects the documents that should be cached.
    /invalidate
    /invalidateHandler
    /allowedClients
    /ignoreUrlParams
}
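
Filled in with values, a /cache section could look roughly like this. It is a sketch under assumptions: the docroot path, the client address and the "utm_*" parameter pattern are hypothetical, and /invalidateHandler is left out.

```
/cache
{
    # assumed to equal the web server's document root (hypothetical path)
    /docroot "/opt/dispatcher/cache"
    /statfileslevel "2"

    # keep serving invalidated pages if the render returns an error
    /serveStaleOnError "1"

    # do not cache responses to authenticated requests
    /allowAuthorized "0"

    /rules
    {
        /0000 { /glob "*" /type "allow" }
    }

    /invalidate
    {
        /0000 { /glob "*" /type "deny" }
        /0001 { /glob "*.html" /type "allow" }
    }

    # only this host may send flush requests
    /allowedClients
    {
        /0000 { /glob "*" /type "deny" }
        /0001 { /glob "127.0.0.1" /type "allow" }
    }

    # ignore campaign parameters so such URLs can still be cached
    /ignoreUrlParams
    {
        /0000 { /glob "*" /type "deny" }
        /0001 { /glob "utm_*" /type "allow" }
    }
}
```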

The /invalidate section defines which cached documents are automatically invalidated after a content update, that is, whenever content is activated.

/invalidate
{
    /0000
    {
        /glob "*"
        /type "deny"
    }
    /0001
    {
        # Consider all HTML files stale after an activation.
        /glob "*.html"
        /type "allow"
    }
    /0002
    {
        /glob "*/content/fadfish/products*"
        /type "deny"
    }
}

Cache Invalidation (Expiration)

Cache invalidation must be triggered from the author or publish instance (usually as a result of an activation). The Dispatcher Flush Agent tells the Dispatcher to invalidate the cache. The Dispatcher then touches the .stat file (it does not remove the cached content files), creating a timestamp against which subsequent document requests are checked. A cached file that is older than the .stat file is treated as stale and is fetched again from AEM.

Cache invalidation is hierarchical! Typically a level in the cache is chosen and all documents below that level are invalidated.
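
The level is set with /statfileslevel. The fragment below sketches how a value of "2" would be expected to lay out the .stat files; the docroot path and the site folder names are hypothetical.

```
/cache
{
    /docroot "/opt/dispatcher/cache"   # hypothetical path

    # .stat files are kept in the docroot (level 0) and in every
    # folder down to level 2, for example:
    #   /opt/dispatcher/cache/.stat
    #   /opt/dispatcher/cache/content/.stat
    #   /opt/dispatcher/cache/content/site-a/.stat
    #   /opt/dispatcher/cache/content/site-b/.stat
    /statfileslevel "2"
}
```

With this setting, activating a page under /content/site-a should touch the .stat files on the path to it (docroot, content, site-a) but leave the one in site-b alone, so the cached pages of site-b stay valid.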

Additional Dispatcher Performance Tips

In general, only 10% of total document request time is spent on server, the rest is transfer and client-side time. To improve performance:

  • reduce number of total requests
  • enable gzipping on web server
  • use the HTML Library Manager to zip, concatenate and minify JS and CSS

Maintaining performance is a delicate part of developing websites, so always keep an eagle eye on the Dispatcher.

Thank you

@techroomweb.
