Cache Design in Drupal

for Developers

Jimmy Huang at
PHPConf 2013, Taipei, Taiwan

Why Drupal?

  • Drupal is slow without Cache

  • Drupal's flexibility heavily depends on cache design

  • I love Drupal - in love with it for over 8 years

  • Many developers have verified these designs

Without Cache  → Cached

        350ms → 170ms

Drupal 7 clean install

Without Cache  → Cached

5,977ms  →  627 ms

  • Over 10 custom fields
  • Custom Categories, Tag
  • Maps, Location info
  • Author Statistics
  • Related Articles

Best open source choice of my life

- Jimmy Huang (2006)

Drupalcamp Taipei (2012)

Founder of Drupal "Dries" on the screen

Drupalcamp Hackathon 2013

for open street map tw / ubuntu taiwan

Become "Real" Drupaler


Multi-Language Sites


One-to-many sub site

Forum / Community Website

EC/Shopping Cart

Stackoverflow like Website

News Portal / Videos Portal

Enterprise Information Portal

Crowdfunding (Kickstarter-like)

Food traceability system

CRM / Event Register (With CiviCRM)

Congress data info

Mobile Content Backend


Most Important: Developers

2001 ~ 2013 commit log visualization of Drupal

If WordPress is what Web designers choose, 
Drupal is what Web Developers choose.
- Andrew Oliver    


Views makes me lazy

Content types make me bored

Flexibility is perfect, but...

Slow... without tuning / caching

Drupal here!!

Everything  Cached
in modern web app

Cache design highlight in Drupal

  • Consider Static Resources
    • CSS, JavaScript, Images, CDN support
    • Static HTML (third-party)

  • Different bins for different usages or modules
    • Core: Page cache, Config cache, object cache, routing ...
    • Custom Module: Rendered content cache ...

  • Swappable Cache engine
    • Database (core)
    • Memcache, APC, File based, AWS (third-party)

    Static Resources

    304 Not Modified

    when visiting a cached page

    1. Tell the browser to take control of the cache
    2. Calculate the cache lifetime for the browser
      • check cookie / session time
      • check "Max Age" settings in Drupal
      • tell the browser the actual "Max-Age"

  • Cache-Control

     Cache-Control: public, max-age=21600
    // If the client sent a session cookie, a cached copy will only be served
    // to that one particular client due to Vary: Cookie. Thus, do not set
    // max-age > 0, allowing the page to be cached by external proxies, when a
    // session cookie is present unless the Vary header has been replaced or
    // unset in hook_boot().
    $max_age = !isset($_COOKIE[session_name()]) || isset($hook_boot_headers['vary']) ? variable_get('page_cache_maximum_age', 0) : 0;
    $default_headers['Cache-Control'] = 'public, max-age=' . $max_age;

    ETag, Last-Modified, Expires

    • Always send an Expires header dated far in the past
      • Let ETag and Last-Modified take control of expiry

    • Assign the cache created time to Last-Modified
      • that is the correct meaning of Last-Modified

    • Assign the cache created time to the ETag
      • Cheap unique ID for this page
      • In sync with Last-Modified
    $default_headers['Last-Modified'] = gmdate(DATE_RFC1123, $cache->created);

    $etag = '"' . $cache->created . '-' . intval($return_compressed) . '"';

    // HTTP/1.0 proxies do not support the Vary header, so prevent any caching
    // by sending an Expires date in the past. HTTP/1.1 clients ignore the
    // Expires header if a Cache-Control: max-age= directive is specified (see RFC
    // 2616, section 14.9.3).
    $default_headers['Expires'] = 'Sun, 19 Nov 1978 05:00:00 GMT';
    see also: drupal_serve_page_from_cache
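
The header rules above can be tried outside Drupal. This is a minimal sketch in plain PHP, not the core implementation: `sketch_cache_headers()` and `sketch_is_not_modified()` are hypothetical names, and the 304 check is a simplified assumption about how the ETag and Last-Modified values are compared.

```php
<?php
// Hypothetical helper: build the response headers for a cached page.
// $created is the Unix timestamp at which the cache entry was written.
function sketch_cache_headers($created, $compressed, $max_age, $has_session) {
  // With a session cookie the copy belongs to one client, so max-age is 0.
  $effective_max_age = $has_session ? 0 : $max_age;
  return array(
    'Cache-Control' => 'public, max-age=' . $effective_max_age,
    'Last-Modified' => gmdate('D, d M Y H:i:s', $created) . ' GMT',
    'ETag' => '"' . $created . '-' . intval($compressed) . '"',
    // Expires in the past; HTTP/1.1 clients prefer Cache-Control: max-age.
    'Expires' => 'Sun, 19 Nov 1978 05:00:00 GMT',
  );
}

// Hypothetical helper: TRUE when the browser copy is still valid (send 304).
function sketch_is_not_modified(array $headers, $if_none_match, $if_modified_since) {
  return $if_none_match === $headers['ETag']
    && $if_modified_since !== NULL
    && strtotime($if_modified_since) === strtotime($headers['Last-Modified']);
}

$headers = sketch_cache_headers(1380443978, FALSE, 21600, FALSE);
$with_session = sketch_cache_headers(1380443978, FALSE, 21600, TRUE);
$send_304 = sketch_is_not_modified($headers, '"1380443978-0"', $headers['Last-Modified']);
```

Because both validators come from the cache created time, they change together whenever the page is regenerated.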

    CSS / JS aggregation

    cross modules

    JS / CSS in modules


    • Different modules have their own CSS / JS
    • Different modules live in different directories
    • Some modules use jQuery plugins, some do not
    • Themes have their own CSS and JavaScript
    • IE has a limit of about 30 CSS files per page


    • Put CSS / JS into a static array
    • Output when all the modules / themes are done
    • Aggregate files to reduce front-end connections

    After Aggregation

    13 files → 4 files
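
The "static array, then aggregate" idea can be sketched in plain PHP. This is an assumption-laden sketch, not drupal_add_css itself: `sketch_add_css()` and `sketch_aggregate()` are hypothetical names, and naming the aggregate after an md5 of its contents is just one possible choice.

```php
<?php
// Hypothetical registry: modules and themes add files during the request.
function sketch_add_css($file = NULL) {
  static $files = array();
  if ($file !== NULL) {
    $files[$file] = $file; // keyed by path, so duplicates collapse
  }
  return $files;
}

// Hypothetical aggregator: concatenate the files into one cached file
// named after a hash of the contents, and return its path.
function sketch_aggregate(array $files, $dir) {
  $data = '';
  foreach ($files as $file) {
    $data .= file_get_contents($file) . "\n";
  }
  $target = $dir . '/css_' . md5($data) . '.css';
  if (!file_exists($target)) {
    file_put_contents($target, $data);
  }
  return $target;
}

// Two stand-in module stylesheets in a temporary directory.
$dir = sys_get_temp_dir() . '/sketch_agg_' . uniqid();
mkdir($dir);
file_put_contents($dir . '/a.css', 'a { color: red; }');
file_put_contents($dir . '/b.css', 'b { color: blue; }');

sketch_add_css($dir . '/a.css');
sketch_add_css($dir . '/b.css');
sketch_add_css($dir . '/a.css'); // added twice, stored once
$aggregate = sketch_aggregate(sketch_add_css(), $dir);
```

The template then only needs to print one tag for `$aggregate` instead of one tag per module file.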

    Add Resource / Build Cache

    drupal_add_js (50 calls)
    drupal_add_css (42 calls)


    Template prints just a single variable

     <html xmlns="" xml:lang="<?php print $language->language; ?>" version="XHTML+RDFa 1.0" dir="<?php print $language->dir; ?>"<?php print $rdf_namespaces; ?>>
    <head profile="<?php print $grddl_profile; ?>">
      <?php print $head; ?>
      <title><?php print $head_title; ?></title>
      <?php print $styles; ?>
      <?php print $scripts; ?>

    Image Cache

    Custom Image Size and Style

    Cache resized image based on layout

    "Cache" Image by style

    1. Check whether the resized image exists

    2. If not, generate the image based on the style

    3. Cache the generated image

    4. Next time, serve the static image directly
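
The four steps above can be sketched as follows. This is a minimal sketch in plain PHP: `sketch_styled_image()` is a hypothetical name, and the resize itself is replaced by a stand-in callback so the example needs no image toolkit.

```php
<?php
// Hypothetical helper: return the path of a styled image, generating it
// only when the derivative file is missing.
function sketch_styled_image($source, $style, $cache_dir, callable $generate) {
  $target = $cache_dir . '/' . $style . '/' . basename($source);
  if (!file_exists($target)) {                 // 1. check
    if (!is_dir(dirname($target))) {
      mkdir(dirname($target), 0777, TRUE);
    }
    // 2. generate by style, 3. cache the result as a file
    file_put_contents($target, $generate($source, $style));
  }
  return $target;                              // 4. a static file from now on
}

$calls = 0;
// Stand-in for a real resize; a real site would call an image toolkit here.
$fake_resize = function ($source, $style) use (&$calls) {
  $calls++;
  return 'resized:' . $style . ':' . $source;
};

$cache_dir = sys_get_temp_dir() . '/sketch_styles_' . uniqid();
$first = sketch_styled_image('photo.jpg', 'thumbnail', $cache_dir, $fake_resize);
$second = sketch_styled_image('photo.jpg', 'thumbnail', $cache_dir, $fake_resize);
```

The second call never reaches the resize callback, which is the same file test the web server can do before PHP is involved at all.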

    Image cache: file test first, then pass to PHP

    Apache: mod_rewrite config

    # Pass all requests not referring directly to files
    # in the filesystem to
    # index.php. Clean URLs are handled in
    # drupal_environment_initialize().
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule ^ index.php [L]

    Nginx: try_files config

    location ~ ^/sites/.*/files/styles/ {
      try_files $uri @rewrite;
    }
    location @rewrite {
      rewrite ^ /index.php;
    }

    Save requests to PHP

    • 1,500ms → 600ms
    • Use keep-alive to serve multiple images over one HTTP connection

    Static HTML Cache

    (third-party module)

    Poor man's high-performance cache

    • If the file exists, serve the cached HTML
      • Same as image cache
    • Anonymous visitors only
      • Drupal delivers the whole-page cache to anonymous visitors only
    • Check the cookie in the web server to detect anonymous visitors
      • Apache:
        RewriteCond %{HTTP_COOKIE} SESS
      • Nginx:
        map $http_cookie $no_cache {
          default 0;
          ~SESS 1; # PHP session cookie
        }

    See also:  Boost module

    Bootstrap cache
    settings, translations, modules, autoload ...

    Bootstrap in Drupal

    Drupal bootstraps on every visit

           When visiting a URL "":
    • Page cache exists? Return the cache
      • Load global settings (variables)
      • Load sessions, user object
      • Load translations
      • Routing / permission check for this URL? Return 403, 404
        • Load system modules
        • Prepare autoload
          →  Execute page result

    Page Cache

    Save printed HTML to cache

    1. After rendering all the elements (drupal_deliver_html_page)
    2. Before the end, call ob_get_clean to capture all the HTML
    3. On the next visit, deliver the whole page


    • delivered to anonymous users only
    • supports any type of cache bin (Varnish, Memcached)
    • stores a gzipped version to save both CPU and bandwidth
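
The capture step can be sketched with output buffering. This is a minimal sketch, not Drupal's page cache: `SketchPageCache` is a hypothetical class, the cache bin is a plain array, and gzip storage is left out.

```php
<?php
class SketchPageCache {
  public $bin = array();   // stand-in for a cache bin
  public $renders = 0;     // counts how often the page is really built

  // Stand-in page callback: prints HTML like a normal page build.
  protected function render($path) {
    $this->renders++;
    print '<html><body>Page for ' . htmlspecialchars($path) . '</body></html>';
  }

  public function deliver($path, $is_anonymous) {
    if ($is_anonymous && isset($this->bin[$path])) {
      return $this->bin[$path];          // next visit: whole page from cache
    }
    ob_start();                          // capture everything that is printed
    $this->render($path);
    $html = ob_get_clean();              // grab the full HTML before the end
    if ($is_anonymous) {
      $this->bin[$path] = $html;         // only anonymous pages are stored
    }
    return $html;
  }
}

$cache = new SketchPageCache();
$first = $cache->deliver('node/1', TRUE);
$second = $cache->deliver('node/1', TRUE);    // served from the bin
$logged_in = $cache->deliver('node/1', FALSE); // always rebuilt
```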

    Performance comparison

    Cache type                  Average over 100 pages
    none                        414 ms per page
    Database (Drupal default)   53 ms per page
    Memcached                   32 ms per page
    Boost                       0.264 ms per page

    Tested on Linode 1024, Nginx + PHP 5.3


    Every module saves config easily

    Save config - variable_set

    $custom_var = array(
      'test1' => 1,
      'test2' => 2,
    );
    variable_set('mymodulename_custom_var', $custom_var);

    Retrieve config - variable_get

    $var = variable_get('mymodulename_custom_var', array());

    500+ serialized blob records in DB

    All the settings are saved to a database table.

    Cache all config in a single record

    • Clear the cache when new config is saved
    • Retrieve the cache on every bootstrap

    With / without variable cache

    saves 30-50ms on a large site

    see also: _drupal_bootstrap_variables
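
The "one cached record" idea can be sketched in plain PHP. This is a minimal sketch, not variable_get / variable_set: `SketchVariableStore` is a hypothetical class, and both the table and the cache are in-memory arrays.

```php
<?php
class SketchVariableStore {
  protected $rows = array();   // stand-in for the table: name => serialized blob
  protected $cache = NULL;     // stand-in for the single cached record
  public $row_reads = 0;       // number of simulated per-row reads

  public function set($name, $value) {
    $this->rows[$name] = serialize($value);
    $this->cache = NULL;       // clear the cache when config is saved
  }

  // Called once per bootstrap: one cached record instead of many rows.
  public function loadAll() {
    if ($this->cache === NULL) {
      $all = array();
      foreach ($this->rows as $name => $blob) {
        $this->row_reads++;
        $all[$name] = unserialize($blob);
      }
      $this->cache = $all;
    }
    return $this->cache;
  }

  public function get($name, $default = NULL) {
    $all = $this->loadAll();
    return isset($all[$name]) ? $all[$name] : $default;
  }
}

$store = new SketchVariableStore();
$store->set('mymodulename_custom_var', array('test1' => 1, 'test2' => 2));
$var = $store->get('mymodulename_custom_var', array());
$again = $store->get('mymodulename_custom_var', array()); // no new row reads
```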


    5000+ strings saved in database

    • Every module can use t() to translate strings
    • 1000+ calls of t() per page
    • Cache strings in one record to save DB overhead
    • Use a static array to avoid loading duplicate strings
    • Clear the cache when a string is translated




    Drupal is a module-based system

    • 100~200 modules for a modern website
    • 500+ calls per request - modules are looked up frequently
    • Scanning whole directories - I/O overhead

    Module List Cache

    • Cache refreshes when a module is enabled / disabled
    • One cached record saves a whole directory scan

    see also: system_list


    • Register classes in the database when a module is enabled
    • Autoload the file for a class when it exists
    • Save the database class mapping to cache
    • Autoload from the cached list without DB overhead

    see also: _registry_check_code, registry_update

    Overhead of working together

    • Modules work together through hooks while staying independent
    • Every hook call loops over all modules with function_exists
    • 200+ calls of module_implements with 100 modules enabled
      → 200*100 loops to check function_exists

    Module "profile" modifies a form element made by module "user"
    function profile_form_alter(&$form, &$form_state, $form_id) {
      if ($form_id == 'user_register_form' || $form_id == 'user_profile_form') {
        // modify form element here...
      }
    }

    Cache on Module hooks

    • Run all the loops the first time
    • Cache the result as an array indexed by hook after page load
    • Next time, just loop over the cached hooks
    • When invoking a specific hook, only loop over array[hook]
      • cached  : 1000+ function_exists calls (3ms)
      • no cache: 20000+ function_exists calls (70ms)

    see also: module_implements, module_implements_write_cache
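
The hook-indexed cache can be sketched in plain PHP. This is a minimal sketch, not module_implements: `SketchHooks` and the `sketch_*` functions are hypothetical, and the cache lives in memory for one request instead of being written to a cache bin.

```php
<?php
// Two stand-in modules; only "sketch_profile" implements form_alter.
function sketch_profile_form_alter() { return 'profile'; }
function sketch_profile_menu() { return 'profile'; }
function sketch_user_menu() { return 'user'; }

class SketchHooks {
  protected $modules;
  protected $cache = array();        // hook name => implementing modules
  public $function_exists_calls = 0;

  public function __construct(array $modules) {
    $this->modules = $modules;
  }

  public function implementations($hook) {
    if (!isset($this->cache[$hook])) {
      $this->cache[$hook] = array();
      foreach ($this->modules as $module) {   // full loop only on a cache miss
        $this->function_exists_calls++;
        if (function_exists($module . '_' . $hook)) {
          $this->cache[$hook][] = $module;
        }
      }
    }
    return $this->cache[$hook];               // later calls: array[hook] only
  }
}

$hooks = new SketchHooks(array('sketch_user', 'sketch_profile'));
$first = $hooks->implementations('form_alter');
$second = $hooks->implementations('form_alter'); // no function_exists calls
$menu = $hooks->implementations('menu');
```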

    Content Cache


    Add article fields with a few clicks in Drupal

    but ... every field is stored in its own table

    When visiting an article, we need to...

    • Join (1 + number of fields) tables - DB overhead
    • Parse text into a specific format - PHP overhead
      • parse wiki syntax, bbcode, remove insecure HTML ..
    • Prepare the article object for use

    Prepare the article (node) object

    • Loaded 10+ times on a node page
      • check permissions
      • look up language
      • check revisions
      • ....
    • Includes all the info of an article
      • this info is rendered to HTML

    see also: node_load

    Static array for loaded objects

    speeds up multiple loads per page
    without cache: 48ms → with static cache: 7ms

        // Try to load entities from the static cache, if entity type supports
        // static caching.
        if ($this->cache && !$revision_id) {
          $entities += $this->cacheGet($ids, $conditions);
          // If any entities were loaded
          // remove them from the ids still to load.
          if ($passed_ids) {
            $ids = array_keys(array_diff_key($passed_ids, $entities));
          }
        }
    source: DrupalDefaultEntityController::load 
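
The same static-cache pattern can be tried outside Drupal. This is a minimal sketch, not the entity controller: `SketchEntityLoader` is a hypothetical class, and the storage load is a stand-in that only counts how often it is hit.

```php
<?php
class SketchEntityLoader {
  protected $static_cache = array();  // id => object, lives for one request
  public $storage_loads = 0;

  // Stand-in for the expensive load across field tables.
  protected function loadFromStorage(array $ids) {
    $entities = array();
    foreach ($ids as $id) {
      $this->storage_loads++;
      $entities[$id] = (object) array('nid' => $id, 'title' => 'Article ' . $id);
    }
    return $entities;
  }

  public function load(array $ids) {
    $passed_ids = array_flip($ids);
    // Entities already loaded in this request come from the static cache.
    $entities = array_intersect_key($this->static_cache, $passed_ids);
    // Only the ids that are still missing go to storage.
    $missing = array_keys(array_diff_key($passed_ids, $entities));
    if ($missing) {
      $loaded = $this->loadFromStorage($missing);
      $this->static_cache += $loaded;
      $entities += $loaded;
    }
    return $entities;
  }
}

$loader = new SketchEntityLoader();
$loader->load(array(1, 2));
$again = $loader->load(array(2, 3)); // 2 comes from the static cache
```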

    Field cache: refreshed on every insert / update

    • Serialized array is put into the cache bin
    • Cached structure can be used by other modules
     a:5:{s:4:"body";a:1:{s:3:"und";a:1:{i:0;a:5:{s:5:"value";s:834:"Lorem ipsum dolor sit amet, consectetur adipiscing elit. Etiam ac pellentesque tellus. Sed ullamcorper, tellus euismod luctus .... faucibus.";s:7:"summary";s:0:"";s:6:"format";s:9:"full_html";s:10:"safe_value";s:846:"<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Etiam ac pellentesque tellus. Sed ullamcorper, tellus euismod luc ... bus.</p>\n";s:12:"safe_summary";s:0:"";}}}s:10:"field_tags";a:1:{s:3:"und";a:3:{i:0;a:1:{s:3:"tid";s:1:"1";}i:1;a:1:{s:3:"tid";s:1:"2";}i:2;a:1:{s:3:"tid";s:1:"3";}}}s:11:"field_image";a:1:{s:3:"und";a:1:{i:0;a:13:{s:3:"fid";s:2:"14";s:3:"alt";s:0:"";s:5:"title";s:0:"";s:5:"width";s:3:"960";s:6:"height";s:3:"720";s:3:"uid";s:1:"1";s:8:"filename";s:36:"25919_4928590944976_1048974114_n.jpg";s:3:"uri";s:57:"public://field/image/25919_4928590944976_1048974114_n.jpg";s:8:"filemime";s:10:"image/jpeg";s:8:"filesize";s:5:"80014";s:6:"status";s:1:"1";s:9:"timestamp";s:10:"1380443978";s:11:"rdf_mapping";a:0:{}}}}s:14:"field_category";a:0:{}s:12:"field_rating";a:0:{}}


    Features of the native form builder

    • Security
      • XSS attack prevention - per-submission form ID
    • Centralized form generation process
    • Can be changed by any other module
    • Other modules can add new elements through inheritance
      • date picker element inherits from textfield
      • image field inherits from file field
      • the date picker can be used by any other module
       $form['my_other_field_need_date_picker'] = array(
        '#type' => 'date_popup',
        '#title' => t('My Date'),
      );

    Form elements types

    checkbox, checkboxes, date, fieldset, file, machine_name, managed_file, password, password_confirm, radio, radios, select, tableselect, text_format, textarea, textfield, vertical_tabs, weight

    But .... very expensive when

    • hooks run across modules
    • forms are altered across modules
    • gathering all element types
      • to render the form HTML
      • to fill in default values

    Why Drupal caches forms

    • Cache for multi-step submission
    • Cache for validating submitted values
    • Cache when errors appear (no need to regenerate)
    • Generate the whole form in the first step
      • no regeneration in the next step
      • save the submitted values in the state cache
      • check submitted values to detect invalid input
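
The build-once, reuse-per-step idea can be sketched in plain PHP. This is a minimal sketch under loud assumptions: `SketchFormCache` is a hypothetical class, the build ID comes from uniqid(), and validation only checks a `#required` flag; Drupal's real form cache stores far more state.

```php
<?php
class SketchFormCache {
  protected $bin = array();   // stand-in for the form cache bin
  public $builds = 0;

  // Stand-in for the expensive cross-module form build.
  protected function build() {
    $this->builds++;
    return array('name' => array('#type' => 'textfield', '#required' => TRUE));
  }

  // First step: build once, cache the form with an empty state.
  public function start() {
    $build_id = uniqid('form-', TRUE);
    $this->bin[$build_id] = array('form' => $this->build(), 'values' => array());
    return $build_id;
  }

  // Later steps: reuse the cached form, validate, remember the values.
  public function submit($build_id, array $values) {
    if (!isset($this->bin[$build_id])) {
      return array('unknown form');
    }
    $errors = array();
    foreach ($this->bin[$build_id]['form'] as $name => $element) {
      if (!empty($element['#required']) && empty($values[$name])) {
        $errors[] = $name . ' is required';
      }
    }
    $this->bin[$build_id]['values'] = $values + $this->bin[$build_id]['values'];
    return $errors;
  }
}

$forms = new SketchFormCache();
$id = $forms->start();
$errors = $forms->submit($id, array('name' => ''));   // invalid input detected
$ok = $forms->submit($id, array('name' => 'Jimmy'));  // same cached form reused
```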


    (Drupal Menu)

    Navigation is complex

    • Permission based
      • Some links only for logged-in users
      • Some links for administrators, special routing
    • Navigation can be placed in many places
      •  Header, footer, developer, account, article
    • Parent-child trails, e.g. Gallery > Jimmy > Photo

    Navigation cache

    • Cache the calculated parent-child relationships
    • Cache the navigation tree indexed by permission
    • Menu / routing cache saves 50-80ms per page
      • Cache by user / by permissions
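
Caching a tree "indexed by permission" can be sketched as a cache ID built from the permission set. This is a minimal sketch: `SketchMenuCache`, its link list and the md5-based cache ID are all assumptions for illustration, not Drupal's menu system.

```php
<?php
class SketchMenuCache {
  protected $bin = array();
  public $builds = 0;
  // Stand-in links, each guarded by one permission.
  protected $links = array(
    array('title' => 'Home', 'permission' => 'access content'),
    array('title' => 'My account', 'permission' => 'edit own account'),
    array('title' => 'Administration', 'permission' => 'administer site'),
  );

  public function tree(array $permissions) {
    sort($permissions);                        // same set, same cache ID
    $cid = 'menu:' . md5(implode(',', $permissions));
    if (!isset($this->bin[$cid])) {
      $this->builds++;                         // expensive build on a miss only
      $visible = array();
      foreach ($this->links as $link) {
        if (in_array($link['permission'], $permissions, TRUE)) {
          $visible[] = $link['title'];
        }
      }
      $this->bin[$cid] = $visible;
    }
    return $this->bin[$cid];
  }
}

$menu = new SketchMenuCache();
$a = $menu->tree(array('access content', 'edit own account'));
$b = $menu->tree(array('edit own account', 'access content')); // other user, same permissions
$anon = $menu->tree(array('access content'));
```

Users who share a permission set share one cached tree, so the number of cache entries follows the number of permission sets rather than the number of users.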

    Cache Design


    Design the Cache

    Generate Component

    Panels handles blocks

    • Third-party module
    • Design the layout yourself
    • Add Drupal blocks to the layout
    • Set the cache for each block

    Cache settings for each block

    Cache Method

    • Time-based cache (simple cache)
    • Page-based cache (cache by URL)
    • Rule-based cache (cache by condition)
    • Custom cache programming
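
The first two methods differ mainly in how the cache ID is built. This is a minimal sketch, not the Panels API: `SketchBlockCache`, `sketch_time_cid()` and `sketch_page_cid()` are hypothetical names, and the current time is passed in so the example runs without waiting.

```php
<?php
class SketchBlockCache {
  protected $bin = array();

  // Returns the cached data, or NULL when missing or expired.
  public function get($cid, $now) {
    if (isset($this->bin[$cid]) && $this->bin[$cid]['expire'] > $now) {
      return $this->bin[$cid]['data'];
    }
    return NULL;
  }

  public function set($cid, $data, $lifetime, $now) {
    $this->bin[$cid] = array('data' => $data, 'expire' => $now + $lifetime);
  }
}

// Time-based: one copy of the block shared by every page.
function sketch_time_cid($block) {
  return 'block:' . $block;
}

// Page-based: one copy of the block per URL.
function sketch_page_cid($block, $url) {
  return 'block:' . $block . ':' . md5($url);
}

$cache = new SketchBlockCache();
$cache->set(sketch_page_cid('related', '/node/1'), 'Related to node 1', 300, 1000);
$hit = $cache->get(sketch_page_cid('related', '/node/1'), 1200);
$other_page = $cache->get(sketch_page_cid('related', '/node/2'), 1200);
$expired = $cache->get(sketch_page_cid('related', '/node/1'), 1300);
```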

    Cache Ripper, Spider, Engine

