Cache in Drupal

Quick notes for Drupalers


Jimmy Huang
2013-07-06 @ DrupalCamp Taipei !!


Jimmy Huang


Drupal

Founder, since 2006

NETivism

Founder, since 2009

jimmy at netivism.com.tw





Why share this?





Users can't see the latest post

The client says the page won't update



Blank (white) pages appear at random



My CSS effects are gone



Form can't be submitted.

A form that used to work can no longer be submitted



Everyone else sees a broken page, but it looks fine on my computer.



Why doesn't my counter increase?

Why isn't my article's view count going up?




Only a word comes to mind...







Wall of Cache?



Logstalgia


Attack on Cache



Attack on Users...








Peopo.org

  • over 100,000 nodes (not many)
  • over 800,000 node-category relations
  • over 300 queries per page (with no cache)
  • over 140 enabled Drupal modules
  • heavy queries (joins on large tables)

31 ms to render a page

for anonymous users

0 ms for 304 Not Modified

for anonymous users


376 ms to render a page

for logged-in users



Cache by 
Browser








Cache control in the HTTP header

Cache-Control: public, max-age=21600 
  • Only for anonymous visitors
  • Tells the browser to take control of caching
  • Tells the browser this page expires after 21600 s (6 hours)
  • Drupal already does this by default:
  // If the client sent a session cookie, a cached copy will only be served
  // to that one particular client due to Vary: Cookie. Thus, do not set
  // max-age > 0, allowing the page to be cached by external proxies, when a
  // session cookie is present unless the Vary header has been replaced or
  // unset in hook_boot().
  $max_age = !isset($_COOKIE[session_name()]) || isset($hook_boot_headers['vary']) ? variable_get('page_cache_maximum_age', 0) : 0;
  $default_headers['Cache-Control'] = 'public, max-age=' . $max_age;

read: drupal_serve_page_from_cache
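
The decision in the snippet above can be restated as a small standalone sketch. It is written in Python only so it runs outside Drupal; the function and parameter names are illustrative assumptions, not Drupal APIs. It shows that a session cookie forces max-age to 0 unless the Vary header was replaced.

```python
def cache_control_header(cookies, session_name, max_age_setting, vary_replaced=False):
    """Return a Cache-Control value following the rule in the Drupal snippet.

    cookies:         dict of request cookies
    session_name:    name of the session cookie (hypothetical value in the demo)
    max_age_setting: configured maximum age in seconds (page_cache_maximum_age)
    vary_replaced:   True if the Vary header was replaced or unset earlier
    """
    has_session = session_name in cookies
    # Same grouping as the PHP ternary: (no session OR vary replaced) ? setting : 0
    max_age = max_age_setting if (not has_session or vary_replaced) else 0
    return f"public, max-age={max_age}"


if __name__ == "__main__":
    print(cache_control_header({}, "SESSabc", 21600))
    print(cache_control_header({"SESSabc": "xyz"}, "SESSabc", 21600))
```

In this sketch an anonymous visitor without a session cookie gets the configured lifetime, while any visitor carrying a session cookie gets max-age=0.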

ETag / Expires / Last-Modified
in HTTP header

  • Only for anonymous visitors
  • ETag:
    • Adds a unique ID to the delivered page
  • Last-Modified:
    • Used to detect whether the page was modified since the last visit
  • Expires:
    • Governed by the "max-age" of the Cache-Control header (max-age takes precedence over Expires)

      // Entity tag should change if the output changes.
      $etag = '"' . $cache->created . '-' . intval($return_compressed) . '"';
      header('Etag: ' . $etag);
    read: drupal_serve_page_from_cache
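
How these validators lead to a 304 response can be sketched as follows. This is a simplified Python illustration, not Drupal's code: the function names are assumptions, and the ETag format simply mirrors the snippet above (cache creation time plus a compression flag).

```python
from email.utils import formatdate


def make_etag(created, compressed):
    """Build an entity tag from the cache creation time and compression flag."""
    return f'"{created}-{int(compressed)}"'


def respond(request_headers, created, compressed=False):
    """Return (status, headers) for a cached page.

    A 304 is sent only when the client echoes back both validators
    it received on an earlier visit.
    """
    etag = make_etag(created, compressed)
    last_modified = formatdate(created, usegmt=True)
    if (request_headers.get("If-None-Match") == etag
            and request_headers.get("If-Modified-Since") == last_modified):
        return 304, {"Etag": etag}
    return 200, {"Etag": etag, "Last-Modified": last_modified}


if __name__ == "__main__":
    status, headers = respond({}, 1371350813)
    print(status, headers)
    revisit = {"If-None-Match": headers["Etag"],
               "If-Modified-Since": headers["Last-Modified"]}
    print(respond(revisit, 1371350813)[0])
```

The first visit returns the full page with both validators; the revisit sends them back and receives 304 with no body.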

    Static file cache in browser

      • Do not write CSS / JavaScript inline in the page
      • Add CSS with drupal_add_css on ALL pages
      • Consider the same for drupal_add_js


    Cache Page
    Without Hitting PHP









    Nginx

    • Combination of proxy / static HTTP file server
      • Dynamic code is hit only when it is really needed
      • Extremely low memory consumption
      • Built-in cache mechanism
      • Built-in flood control mechanism

    • Asynchronous, event-driven request handling
      • Handles tons of requests in one process
      • More effective in multi-processor environments

    microcache in Nginx

    • Flood protection / DoS protection
    • Short lifetime (1s - 5s)
    • Can be used for logged-in users (use carefully)
    • Repeated hits never reach PHP
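
The microcache idea can be illustrated with a minimal in-memory sketch. This is Python, not Nginx configuration, and the class and parameter names are assumptions made for the example. It shows why a lifetime of only one second is enough to keep a burst of identical requests away from the backend.

```python
import time


class MicroCache:
    """Tiny TTL cache: repeated hits within ttl_seconds skip the backend."""

    def __init__(self, ttl_seconds=1.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock          # injectable clock, handy for testing
        self.store = {}             # key -> (stored_at, body)

    def fetch(self, key, render):
        now = self.clock()
        entry = self.store.get(key)
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1]         # fresh enough: backend is not called
        body = render()             # expired or missing: hit the backend once
        self.store[key] = (now, body)
        return body


if __name__ == "__main__":
    backend_calls = []

    def render_page():
        backend_calls.append(1)
        return "<html>hot news</html>"

    cache = MicroCache(ttl_seconds=1.0)
    for _ in range(1000):
        cache.fetch("/news", render_page)
    print("backend calls:", len(backend_calls))
```

For logged-in users the key would also need something user-specific (for example a session identifier), which is why the slide says to use it carefully.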

    Boost.module of Drupal

    • Only for anonymous visitors
    • Generates a static HTML page on the first hit to PHP
    • Uses a rewrite rule (in Nginx or Apache) to redirect to the HTML file
    • Expired by:
      • Cron
      • Crawler (Drupal 6 only)
      • Node update (Drupal 6 only)

    <!-- Page cached by Boost @ 2013-06-16 02:46:53, expires @ 2013-06-16 05:46:53 -->

    Serve static html from nginx

    10x faster than Apache prefork

    nginx: 100,000 requests in 60s
    apache: 10,000 requests in 60s


    Cache by 
    PHP and Drupal











    What does Drupal do on a fresh hit?

    Warm up PHP on every request
    1. Load system-wide variables
    2. Load the menu to check whether the user has permission to access the page
    3. Load localized data to prepare translation strings
    4. Load modules to prepare for work
    5. Render forms when needed
    6. Gather content for the blocks on the page
    7. Deliver the page rendered by the template / theme system
    8. Finally we get a page! So slow...

    Run SQL queries whenever they are needed

    Opcode Cache for PHP

    Before: 104 MB per page
    Memory used at: devel_boot()=5.43 MB, devel_shutdown()=104.15 MB
    After: 49 MB per page
    Memory used at: devel_boot()=2.34 MB, devel_shutdown()=49.58 MB

    Opcode cache shown on the php-apc cache status page

    Drupal Cache Stack

    Caches almost everything

    1. Page (only for anonymous users)
    2. Variables
    3. Session cache (third-party module)
    4. Localized strings
    5. Module info/registry
    6. Template engine registry
    7. Filtered content
    8. Forms
    9. Aggregated content (Block/Views/Panels)
    10. Aggregated resources (CSS/JS/Image)
     


    Core - Page Cache / Block Cache / CSS and JS cache


    Views Cache (by time period or ...)


    Panels Cache  (per page or ...)

    Block Cache (set by module code)



    more on Block Caching

    Recommended Modules for cache

    Cache Strategy

    You should pick a suitable strategy for each block

    • cache in Time Period
    • cache per User
    • cache per Page (URL)
    • cache per User per Page
    • cache per Role per Page

    • others
      • cache per content type
      • cache per node
      • cache per role
      • ...
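
One way to picture these strategies is as different ways of building the cache key. The sketch below is a Python illustration with hypothetical names (it is not the Drupal block cache API): the more dimensions the key includes, the more copies are stored and the lower the hit rate, so each block should use the coarsest granularity that is still correct.

```python
def cache_key(block_id, granularity, user_id=None, roles=(), url=None):
    """Build a cache key for a block.

    granularity is a set drawn from {"role", "user", "page"};
    an empty set means one shared copy for everybody.
    """
    parts = [block_id]
    if "role" in granularity:
        # Sort so the same role combination always yields the same key
        parts.append("r." + ",".join(sorted(roles)))
    if "user" in granularity:
        parts.append(f"u.{user_id}")
    if "page" in granularity:
        parts.append(f"p.{url}")
    return ":".join(parts)


if __name__ == "__main__":
    print(cache_key("hot_news", set()))
    print(cache_key("my_posts", {"user"}, user_id=7))
    print(cache_key("menu", {"role", "page"}, roles=["editor", "admin"], url="/node/1"))
```

A time-period strategy is orthogonal to the key: it only decides how long an entry stored under that key stays valid.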

    Cache Clear Strategy


    • Expire after a time period
    • Expire when a new version exists
      • clear the node cache when a node is updated
      • clear the views / panels cache when a node is updated
      • clear the user profile page when a user is updated
      • clear the term page when a term is updated
    • Expire via a specific URL parameter







    more on this later


    Query Cache by
    MySQL









    Drupal is a Query Monster

    • Join, Join, Join, Join ...
    • Easily eats 300 queries per page
      • Saves config / variables in DB
      • Saves sessions in DB
      • Saves cache in DB
      • Saves localized strings in DB
      • Saves search index in DB
    • Un-optimized queries from Views
    • Everything is combined with node

    Hot news page

    Query Cache Performance

    Before query cache: 2301.12ms

    After query cache: 2.49ms

    SELECT
      node.title AS node_title,
      node.nid AS nid,
      users_node.name AS users_node_name,
      users_node.uid AS users_node_uid,
      flag_counts_node.count AS flag_counts_node_count,
      node_counter.totalcount AS node_counter_totalcount,
      node_comment_statistics.comment_count AS node_comment_statistics_comment_count,
      node.created AS node_created,
      node.uid AS node_uid,
      'node' AS field_data_body_node_entity_type,
      'node' AS field_data_field_video_id_node_entity_type,
      'node' AS field_data_field_img_status_node_entity_type
    FROM
    {node} node
    LEFT JOIN {flag_counts} flag_counts_node ON node.nid = flag_counts_node.content_id AND flag_counts_node.fid = '1'
    INNER JOIN {users} users_node ON node.uid = users_node.uid
    INNER JOIN {field_data_field_del_status} field_data_field_del_status_value_0 ON node.nid = field_data_field_del_status_value_0.entity_id AND field_data_field_del_status_value_0.field_del_status_value = '0'
    INNER JOIN {node_counter} node_counter ON node.nid = node_counter.nid
    INNER JOIN {node_comment_statistics} node_comment_statistics ON node.nid = node_comment_statistics.nid
    WHERE (( (node.type IN ('post')) AND ( (field_data_field_del_status_value_0.field_del_status_value = '0') ) AND (node.status = '1') ))
    ORDER BY node_created DESC
    LIMIT 30 OFFSET 0


    "Watch" your views

    Calculate Hit Rate


    Query Cache Hits    = 2800000
    Query Cache Inserts = 444400
    Hits/(Hits+Inserts) = 86.30%

    A high hit rate means memory savings.
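
The hit-rate arithmetic above can be reproduced with a few lines of Python. The function name is an assumption for this example; the two counters are taken to correspond to MySQL's Qcache_hits and Qcache_inserts status variables.

```python
def query_cache_hit_rate(hits, inserts):
    """Fraction of cacheable queries answered from the query cache."""
    return hits / (hits + inserts)


if __name__ == "__main__":
    rate = query_cache_hit_rate(2_800_000, 444_400)
    print(f"{rate:.2%}")  # prints 86.30%
```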


    Sweet
    Combination









    Server / Service Combination

    Anonymous User


    Logged-in User


    • Panels + Panel page cache
      • Cache the whole panel based on URL
      • Hack the module to support per role per page
      • Fix some personalized strings via ajax
      • Do not cache forms

    • Use Views cache carefully
      • the site can't run without it
      • caching with contextual / exposed filters is a headache

    • Use Cache Actions
      • finally, we can add an "update - then - clear cache" rule
      • you need debugging skills to find the right "cache id"


    Debug










    Basic Questions


    1. Is the cache enabled?
    2. Is what I'm seeing cached or not?
    3. When does the cache expire or refresh?
    4. Is the page correct when a different user sees it?
      (user name or profile misplaced)
    5. Is the page correct when a different role sees it?

      (permission reasons)

    6. Do forms work correctly?
    7. You can't cache forms (security reasons)

    Tools and Hints


    1. Page cache can be verified by HTTP headers
    2. PHP-APC has a status script
    3. Memcached has scripts, too
    4. Use the devel module to see page generation time
    5. Profiling: XHProf
    6. On a large site, never "Clear All Cache"
      (this will save your life)
    7. No cached pages when an anonymous user has a session
      (added in Drupal 7)
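
For hint 1, the headers worth checking can be summarized with a small helper. This is a Python sketch with a hypothetical function name; it only inspects a dictionary of response headers, however they were obtained (browser dev tools, curl, and so on). It assumes the site sends the X-Drupal-Cache header that Drupal 7 core uses to report HIT or MISS for its page cache.

```python
import re


def describe_page_cache(headers):
    """Summarize the cache-related response headers of one page."""
    h = {name.lower(): value for name, value in headers.items()}
    match = re.search(r"max-age=(\d+)", h.get("cache-control", ""))
    return {
        # HIT / MISS from Drupal's page cache, or "absent" if not sent
        "drupal_cache": h.get("x-drupal-cache", "absent"),
        # Seconds the browser or a proxy may reuse the page; 0 if unspecified
        "max_age": int(match.group(1)) if match else 0,
        # Both validators are needed for a 304 Not Modified round trip
        "has_validators": "etag" in h and "last-modified" in h,
    }


if __name__ == "__main__":
    sample = {
        "X-Drupal-Cache": "HIT",
        "Cache-Control": "public, max-age=21600",
        "Etag": '"1371350813-1"',
        "Last-Modified": "Sun, 16 Jun 2013 02:46:53 GMT",
    }
    print(describe_page_cache(sample))
```

If "drupal_cache" is absent or max_age is 0 for an anonymous request, a stray session cookie (hint 7) is a likely suspect.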







    May the cache be with you.

    Thank you!