Cache Design in Drupal
for Developers
Jimmy Huang at
PHPConf 2013, Taipei, Taiwan
Why Drupal?
- Drupal is slow without Cache
- Drupal's flexibility heavily depends on cache design
- I love Drupal: I have been in love with it for over 8 years
- Many developers have verified these designs
Without Cache → Cached
350ms → 170ms
Drupal 7 clean install
Without Cache → Cached
5,977ms → 627 ms
- Over 10 custom fields
- Custom categories, tags
- Maps, Location info
- Author Statistics
- Related Articles
Best open source choice of my life
- Jimmy Huang
Drupaltaiwan.org (2006)
Drupalcamp Taipei (2012)
Founder of Drupal "Dries" on the screen
Drupalcamp Hackathon 2013
for OpenStreetMap TW / Ubuntu Taiwan
Become "Real" Drupaler
Multi-Language Sites
One-to-many sub site
Forum / Community Website
EC/Shopping Cart
Stackoverflow like Website
News Portal / Videos Portal
Enterprise Information Portal
Crowdfunding (Kickstarter-like)
Food traceability system
CRM / Event Registration (with CiviCRM)
Congress data info
Mobile Content Backend
Most Important: Developers
2001 ~ 2013 commit log visualization of Drupal
If WordPress is what Web designers choose,
Drupal is what Web Developers choose.
- Andrew Oliver
Views makes me lazy
Content types make me bored
Flexibility is perfect, but...
Slow... without tuning / caching
Drupal here!!
Everything
Cached
in modern web app
Cache design highlight in Drupal
- Consider static resources
- CSS, JavaScript, images, CDN support
- Static HTML (third-party)
- Different bins for different usages or modules
- Core: page cache, config cache, object cache, routing ...
- Custom module: rendered content cache ...
- Swappable cache engine
- Database (core)
- Memcache, APC, file-based, AWS (third-party)
Static Resources
304 Not Modified
when visiting a cached page
- Let the browser take control of the cache
- Calculate the cache lifetime for the browser
- Check cookie / session time
- Check the "Max Age" setting in Drupal
Cache-Control
Header
Cache-Control: public, max-age=21600
Code
// If the client sent a session cookie, a cached copy will only be served
// to that one particular client due to Vary: Cookie. Thus, do not set
// max-age > 0, allowing the page to be cached by external proxies, when a
// session cookie is present unless the Vary header has been replaced or
// unset in hook_boot().
$max_age = !isset($_COOKIE[session_name()]) || isset($hook_boot_headers['vary']) ? variable_get('page_cache_maximum_age', 0) : 0;
$default_headers['Cache-Control'] = 'public, max-age=' . $max_age;
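A minimal sketch of the same decision outside Drupal, written in Python rather than Drupal's PHP so it runs standalone. The function name `cache_control_header` and its parameters are hypothetical; only the rule itself (no shared caching when a session cookie is present, unless the Vary header was replaced) comes from the snippet above.

```python
def cache_control_header(cookies, session_name, max_age_setting, vary_overridden=False):
    """Return the Cache-Control value for a cached page response."""
    has_session = session_name in cookies
    # A session cookie means the copy belongs to one client (Vary: Cookie),
    # so proxies must not keep it unless a module replaced the Vary header.
    max_age = max_age_setting if (not has_session or vary_overridden) else 0
    return "public, max-age=%d" % max_age


# Anonymous visitor: the configured lifetime is sent to browsers and proxies.
print(cache_control_header({}, "SESSabc", 21600))
# Visitor with a session cookie: the lifetime drops to 0.
print(cache_control_header({"SESSabc": "x"}, "SESSabc", 21600))
```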
Etag, Last-Modified, Expires
- Always send an Expires header dated in the past
- Let ETag and Last-Modified control expiry instead
- Assign the cache creation time to Last-Modified
- That is the correct meaning of Last-Modified
- Assign the cache creation time to the ETag
- A cheap unique ID for this page
- Stays in sync with Last-Modified
Last-Modified
$default_headers['Last-Modified'] = gmdate(DATE_RFC1123, $cache->created);
ETag
$etag = '"' . $cache->created . '-' . intval($return_compressed) . '"';
Expires
// HTTP/1.0 proxies does not support the Vary header, so prevent any caching
// by sending an Expires date in the past. HTTP/1.1 clients ignores the
// Expires header if a Cache-Control: max-age= directive is specified (see RFC
// 2616, section 14.9.3).
$default_headers['Expires'] = 'Sun, 19 Nov 1978 05:00:00 GMT';
see also: drupal_serve_page_from_cache
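How the two validators work together can be sketched as follows. This is an illustrative Python version, not Drupal's code: `build_validators` and `is_not_modified` are hypothetical names, and the sketch assumes a 304 is sent only when both validators from the browser match the cached copy.

```python
from email.utils import formatdate


def build_validators(created, compressed):
    """Derive Last-Modified and ETag from the cache entry's creation time."""
    last_modified = formatdate(created, usegmt=True)  # HTTP date, always GMT
    etag = '"%d-%d"' % (created, int(compressed))      # cheap id, synced with the date
    return last_modified, etag


def is_not_modified(request_headers, last_modified, etag):
    """True when the browser's copy is still the cached copy (answer 304)."""
    return (request_headers.get("If-Modified-Since") == last_modified
            and request_headers.get("If-None-Match") == etag)


last_modified, etag = build_validators(1380443978, True)
revisit = {"If-Modified-Since": last_modified, "If-None-Match": etag}
print(is_not_modified(revisit, last_modified, etag))  # repeat visit
print(is_not_modified({}, last_modified, etag))       # first visit
```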
CSS / JS aggregation
cross modules
JS / CSS in modules
Situation
- Different modules have their own CSS / JS
- Different modules live in different directories
- Some modules use jQuery plugins, some do not
- Themes have their own CSS and JavaScript
- IE has a limit of about 30 CSS files
Solution
- Put CSS / JS into a static array
- Output once all the modules/themes are done
- Aggregate files to reduce front-end connections
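The collect-then-aggregate idea can be sketched like this. It is a simplified Python illustration, not Drupal's implementation: the `AssetRegistry` class, its method names and the hash-based file name are all assumptions made for the example.

```python
import hashlib
import os
import tempfile


class AssetRegistry:
    """Collects CSS/JS paths during a request, then aggregates them once."""

    def __init__(self, cache_dir):
        self.cache_dir = cache_dir
        self.files = []  # the "static array" that modules and themes append to

    def add(self, path):
        if path not in self.files:  # ignore duplicate registrations
            self.files.append(path)

    def build_cache(self, extension):
        """Concatenate all registered files into one content-addressed file."""
        chunks = []
        for path in self.files:
            with open(path, encoding="utf-8") as handle:
                chunks.append(handle.read())
        data = "\n".join(chunks)
        digest = hashlib.sha256(data.encode("utf-8")).hexdigest()[:16]
        target = os.path.join(self.cache_dir, "%s_%s.%s" % (extension, digest, extension))
        if not os.path.exists(target):  # reuse the aggregate if it was already built
            with open(target, "w", encoding="utf-8") as handle:
                handle.write(data)
        return target


# Two "modules" register their stylesheets; the page links one aggregate.
workdir = tempfile.mkdtemp()
registry = AssetRegistry(workdir)
for name, css in (("module_a.css", "a{}"), ("module_b.css", "b{}")):
    path = os.path.join(workdir, name)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(css)
    registry.add(path)
aggregate = registry.build_cache("css")
```

Naming the aggregate after a hash of its content means an unchanged set of files maps to the same file on every request.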
After Aggregation
13 files → 4 files
Add Resource / Build Cache
drupal_add_js (50 calls)
drupal_add_css (42 calls)
drupal_build_js_cache
drupal_build_css_cache
Template: just a single variable
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="<?php print $language->language; ?>" version="XHTML+RDFa 1.0" dir="<?php print $language->dir; ?>"<?php print $rdf_namespaces; ?>>
<head profile="<?php print $grddl_profile; ?>">
<?php print $head; ?>
<title><?php print $head_title; ?></title>
<?php print $styles; ?>
<?php print $scripts; ?>
</head>
<body
Image Cache
Custom Image Size and Style
Cache resized images based on layout
"Cache" images by style
- Check whether the resized image exists
- If not, generate the image based on the style
- Cache the generated image
- Next time, serve the static image directly
Image cache: test for the file, then pass to PHP
Apache: mod_rewrite config
# Pass all requests not referring directly to files in the filesystem to
# index.php. Clean URLs are handled in drupal_environment_initialize().
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^ index.php [L]
Nginx: try_files config
location ~ ^/sites/.*/files/styles/ {
  try_files $uri @rewrite;
}
location @rewrite {
  rewrite ^ /index.php;
}
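The rules above hand a request to PHP only when the file is missing. What happens on that first request can be sketched as follows, in Python for illustration. `styled_image_path` and `fake_resize` are hypothetical names, and the directory layout only loosely follows the `files/styles/` path in the Nginx rule.

```python
import os
import tempfile


def styled_image_path(files_dir, style, name, generate):
    """Return the cached derivative's path, generating it on first request."""
    target = os.path.join(files_dir, "styles", style, name)
    if not os.path.exists(target):  # same file test the web server performs
        os.makedirs(os.path.dirname(target), exist_ok=True)
        with open(target, "wb") as handle:
            handle.write(generate(style, name))  # the expensive resize step
    return target


calls = []


def fake_resize(style, name):
    """Stand-in for a real image toolkit; records how often it runs."""
    calls.append((style, name))
    return b"resized-bytes"


files_dir = tempfile.mkdtemp()
first = styled_image_path(files_dir, "thumbnail", "photo.jpg", fake_resize)
second = styled_image_path(files_dir, "thumbnail", "photo.jpg", fake_resize)
```

The second call finds the file on disk, so the resize step runs once per style and image.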
Save PHP requests
- 1,500ms → 600ms
- Use keep-alive to serve multiple images over one HTTP connection
Static HTML Cache
(third-party module) Poor man's high-performance cache
- If the file exists, serve the cached HTML
- Same as the image cache
- Anonymous visitors only
- Drupal delivers the whole-page cache to anonymous visitors only
- Check the cookie in the web server to detect anonymous visitors
- Apache:
RewriteCond %{HTTP_COOKIE} SESS
- Nginx:
map $http_cookie $no_cache {
  default 0;
  ~SESS 1; # PHP session cookie
}
See also: Boost module
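The cookie test in the Apache and Nginx rules above amounts to one pattern match. A small Python sketch, assuming (as the rules do) that any cookie containing `SESS` marks a logged-in visitor; `bypass_static_cache` is a hypothetical name.

```python
import re

SESSION_COOKIE = re.compile(r"SESS")  # same pattern as the server rules above


def bypass_static_cache(cookie_header):
    """True when the request carries a session cookie and must reach PHP."""
    return bool(cookie_header) and SESSION_COOKIE.search(cookie_header) is not None


print(bypass_static_cache("SESSabc123=xyz; has_js=1"))  # logged-in visitor
print(bypass_static_cache("has_js=1"))                  # anonymous visitor
```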
Bootstrap cache
settings, translations, modules, autoload ...
Bootstrap in Drupal
Drupal bootstraps on every visit
When visiting the URL "http://example.com/node/123":
- Page cache exists? Return the cache
- Load global settings (variables)
- Load sessions, user object
- Load translations
- Check routing and permissions for this URL; return 403 or 404 if needed
- Load system modules
- Prepare autoload
→ Execute page result
Page Cache
Save printed HTML to cache
- After rendering all the elements (drupal_deliver_html_page)
- Before the end, call ob_get_clean to capture all the HTML
- On the next visit, deliver the whole page
Features
- Delivered to anonymous users only
- Supports any type of cache bin (Varnish, Memcached)
- Stores a gzipped version to save both CPU and bandwidth
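A rough Python sketch of these features, not Drupal's code. The `PageCache` class is hypothetical, and a plain dict stands in for the cache bin; the point is that the page is compressed once on write and decompressed only for clients that cannot accept gzip.

```python
import gzip


class PageCache:
    """Whole-page cache for anonymous visitors, stored gzipped."""

    def __init__(self):
        self.bin = {}  # stands in for a database, Memcached or other bin

    def set(self, url, html, is_anonymous):
        if is_anonymous:  # pages built for logged-in users are never stored
            self.bin[url] = gzip.compress(html.encode("utf-8"))

    def get(self, url, is_anonymous, accepts_gzip):
        if not is_anonymous:
            return None
        compressed = self.bin.get(url)
        if compressed is None:
            return None
        # Compress once on write; decompress only for clients without gzip.
        return compressed if accepts_gzip else gzip.decompress(compressed)


cache = PageCache()
cache.set("/node/123", "<html>hello</html>", is_anonymous=True)
body = cache.get("/node/123", is_anonymous=True, accepts_gzip=False)
```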
Performance comparison
Cache Type | 100-Page Average |
---|---|
none | 414ms per page |
Database (Drupal default) | 53ms per page |
Memcached | 32ms per page |
Boost | 0.264ms per page |
Tested on Linode 1024, Nginx + PHP 5.3
Settings
Every module can save config easily
Save config - variable_set
$custom_var = array( 'test1' => 1, 'test2' => 2, );
variable_set('mymodulename_custom_var', $custom_var);
Retrieve config - variable_get
$var = variable_get('mymodulename_custom_var', array());
500+ serialized blob records in the DB
All the settings are saved to a database table.
Cache all config in a single record
- Clear the cache when new config is saved
- Retrieve the cache on every bootstrap
With / without variable cache
Saves 30-50ms on a large site
see also: _drupal_bootstrap_variables
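The single-record idea can be sketched as below, in Python for illustration. `VariableStore` is a hypothetical class: one dict stands in for the settings table, another for the cache bin, and `pickle` stands in for PHP's serialize.

```python
import pickle


class VariableStore:
    """Settings table fronted by one cached record holding every variable."""

    def __init__(self):
        self.rows = {}      # stands in for the table: name -> serialized blob
        self.cache = {}     # stands in for a cache bin
        self.db_reads = 0   # counts rows read from the "database"

    def _load_all(self):
        blob = self.cache.get("variables")
        if blob is None:  # cache miss: read every row once, store one record
            self.db_reads += len(self.rows)
            variables = {name: pickle.loads(value) for name, value in self.rows.items()}
            blob = pickle.dumps(variables)
            self.cache["variables"] = blob
        return pickle.loads(blob)

    def get(self, name, default=None):
        return self._load_all().get(name, default)

    def set(self, name, value):
        self.rows[name] = pickle.dumps(value)
        self.cache.pop("variables", None)  # clear the cache when config is saved


store = VariableStore()
store.set("mymodulename_custom_var", {"test1": 1, "test2": 2})
value = store.get("mymodulename_custom_var", {})
value_again = store.get("mymodulename_custom_var", {})
```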
Translations
5000+ strings saved in the database
- Every module can use t() to translate strings
- 1000+ calls of t() per page
- Cache strings in one record to save DB overhead
- Use a static array to avoid loading duplicate strings
- Clear the cache when a string is translated
Modules
&
Autoload
Drupal is a module-based system
- 100~200 modules for a modern website
- 500+ calls per request: modules are looked up frequently
- Scanning the whole directory adds I/O overhead
Module List Cache
- Cache is refreshed when a module is enabled / disabled
- One record saves a whole directory scan
see also: system_list
Autoload
- Register classes in the database when a module is enabled
- Autoload the class's file when it exists
- Save the database class mapping into the cache
- Autoload from the cached list without DB overhead
Overhead of working together
- Modules work together through hooks while staying independent
- Every hook call loops over all modules with function_exists
- 200+ calls of module_implements with 100 modules enabled
→ loop 200*100 times to check function_exists
function profile_form_alter(&$form, &$form_state, $form_id) {
if (($form_id == 'user_register_form' || $form_id == 'user_profile_form')) {
// modify form element here...
}
}
Cache on module hooks
- Run all the loops the first time
- Cache an array indexed by hook after page load
- Next time, just loop over the cached hooks
- When invoking a specific hook, only loop over array[hook]
- Cached: 1000+ function_exists calls (3ms)
- No cache: 20000+ function_exists calls (70ms)
see also: module_implements, module_implements_write_cache
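Why the hook-indexed array saves so many checks can be shown with a small Python sketch. `HookRegistry` is a hypothetical class, and a dict lookup stands in for PHP's `function_exists`; the counter makes the saving visible.

```python
class HookRegistry:
    """Remembers which modules implement each hook after the first scan."""

    def __init__(self, modules, functions):
        self.modules = modules      # enabled module names
        self.functions = functions  # name -> callable, stands in for PHP's function table
        self.cache = {}             # hook -> list of implementing modules
        self.exists_checks = 0      # counts simulated function_exists calls

    def _function_exists(self, name):
        self.exists_checks += 1
        return name in self.functions

    def implements(self, hook):
        if hook not in self.cache:  # full scan over all modules, first time only
            self.cache[hook] = [
                module for module in self.modules
                if self._function_exists("%s_%s" % (module, hook))
            ]
        return self.cache[hook]

    def invoke_all(self, hook, *args):
        return [
            self.functions["%s_%s" % (module, hook)](*args)
            for module in self.implements(hook)
        ]


registry = HookRegistry(
    ["profile", "user", "node"],
    {"profile_form_alter": lambda form_id: "profile altered " + form_id},
)
first_result = registry.invoke_all("form_alter", "user_register_form")
second_result = registry.invoke_all("form_alter", "user_profile_form")
```

Two invocations cost three existence checks in total here, instead of three per invocation.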
Content Cache
Article
Add article fields with a few clicks in Drupal
but ... every field is stored in its own table
When visiting an article, we need to...
- Join (1 + number of fields) tables - DB overhead
- Parse text into a specific format - PHP overhead
- Parse wiki syntax, BBCode, remove insecure HTML ...
- Prepare the article object for use
Prepare the article (node) object
- Loaded 10+ times on a node page
- Check permissions
- Look up language
- Check revisions
- ....
- Includes all the info of an article
- This info is rendered to HTML
see also: node_load
Static array for loaded objects
to speed up multiple loads per page
Without cache → with static cache: 48ms → 7ms
// Try to load entities from the static cache, if entity type supports
// static caching.
if ($this->cache && !$revision_id) {
  $entities += $this->cacheGet($ids, $conditions);
  // If any entities were loaded, remove them from the ids still to load.
  if ($passed_ids) {
    $ids = array_keys(array_diff_key($passed_ids, $entities));
  }
}
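The same pattern in a standalone Python sketch: serve what the static cache already holds, then load only the remaining ids. `EntityLoader` is a hypothetical class, and a dict stands in for the database.

```python
class EntityLoader:
    """Keeps loaded entities in memory for the rest of the request."""

    def __init__(self, storage):
        self.storage = storage   # stands in for the database
        self.static_cache = {}   # id -> entity, lives for one request
        self.storage_loads = 0   # counts entities fetched from storage

    def load_multiple(self, ids):
        # Serve what the static cache already holds.
        entities = {i: self.static_cache[i] for i in ids if i in self.static_cache}
        # Only the ids still missing go to storage.
        remaining = [i for i in ids if i not in entities]
        for entity_id in remaining:
            if entity_id in self.storage:
                self.storage_loads += 1
                entity = self.storage[entity_id]
                self.static_cache[entity_id] = entity
                entities[entity_id] = entity
        return entities


loader = EntityLoader({1: {"title": "A"}, 2: {"title": "B"}})
first = loader.load_multiple([1, 2])
second = loader.load_multiple([1, 3])  # 1 comes from memory, 3 does not exist
```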
Field cache: updated on every insert / update
- Serialized array is put into a cache bin
- The cached structure can be used by other modules
a:5:{s:4:"body";a:1:{s:3:"und";a:1:{i:0;a:5:{s:5:"value";s:834:"Lorem ipsum dolor sit amet, consectetur adipiscing elit. Etiam ac pellentesque tellus. Sed ullamcorper, tellus euismod luctus .... faucibus.";s:7:"summary";s:0:"";s:6:"format";s:9:"full_html";s:10:"safe_value";s:846:"<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Etiam ac pellentesque tellus. Sed ullamcorper, tellus euismod luc ... bus.</p>\n";s:12:"safe_summary";s:0:"";}}}s:10:"field_tags";a:1:{s:3:"und";a:3:{i:0;a:1:{s:3:"tid";s:1:"1";}i:1;a:1:{s:3:"tid";s:1:"2";}i:2;a:1:{s:3:"tid";s:1:"3";}}}s:11:"field_image";a:1:{s:3:"und";a:1:{i:0;a:13:{s:3:"fid";s:2:"14";s:3:"alt";s:0:"";s:5:"title";s:0:"";s:5:"width";s:3:"960";s:6:"height";s:3:"720";s:3:"uid";s:1:"1";s:8:"filename";s:36:"25919_4928590944976_1048974114_n.jpg";s:3:"uri";s:57:"public://field/image/25919_4928590944976_1048974114_n.jpg";s:8:"filemime";s:10:"image/jpeg";s:8:"filesize";s:5:"80014";s:6:"status";s:1:"1";s:9:"timestamp";s:10:"1380443978";s:11:"rdf_mapping";a:0:{}}}}s:14:"field_category";a:0:{}s:12:"field_rating";a:0:{}}
Form
Features of the native form builder
- Security
- XSS attack prevention - per-submission form ID
- Centralized form generation process
- Can be changed by any other module
- Other modules can add new element types through inheritance
- Date picker inherits from textfield
- Image field inherits from file field
- The date picker can be used by any other module
$form['my_other_field_need_date_picker'] = array(
'#type' => 'date_popup',
'#title' => t('My Date'),
....
);
Form elements types
checkbox, checkboxes, date, fieldset, file, machine_name, managed_file, password, password_confirm, radio, radios, select, tableselect, text_format, textarea, textfield, vertical_tabs, weight
But .... very expensive when
- calling hooks across modules
- altering the form across modules
- gathering all element types
- rendering the form HTML
- filling in default values
Why Drupal caches forms
- Cache for multi-step submission
- Cache for validating submitted values
- Cache when errors appear (so the form does not need regenerating)
- Generate the whole form in the first step
- Do not regenerate it in the next step
- Save the submitted values in the state cache
- Check submitted values to detect invalid input
read: drupal_build_form
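A simplified Python sketch of the build-once, validate-against-cache flow. `FormCache` and its build id scheme are hypothetical, and validation is reduced to rejecting fields the cached form never defined.

```python
import uuid


class FormCache:
    """Stores a built form and its state under a per-build id."""

    def __init__(self):
        self.bin = {}     # build id -> {"form": ..., "state": ...}
        self.builds = 0   # counts how often the expensive build runs

    def build(self, builder):
        """Run the expensive build once and remember the result."""
        self.builds += 1
        build_id = "form-" + uuid.uuid4().hex
        self.bin[build_id] = {"form": builder(), "state": {}}
        return build_id

    def submit(self, build_id, values):
        """Validate against the cached form; return the unexpected field names."""
        entry = self.bin.get(build_id)
        if entry is None:
            raise KeyError("unknown or expired form build id")
        unexpected = sorted(set(values) - set(entry["form"]))
        if unexpected:
            return unexpected            # invalid input: nothing is saved
        entry["state"].update(values)    # keep values for the next step
        return []


forms = FormCache()
build_id = forms.build(lambda: {
    "name": {"#type": "textfield"},
    "date": {"#type": "date_popup"},
})
step_one = forms.submit(build_id, {"name": "Jimmy"})
step_two = forms.submit(build_id, {"date": "2013-10-05", "evil": "x"})
```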
Navigation
(Drupal Menu)
Navigation is complex
- Permission-based
- Some links are only for logged-in users
- Some links are for administrators, with special routing
- Navigation can be placed in many places
- Header, footer, developer, account, article
- Parent-child trails, e.g. Gallery > Jimmy > Photo
vs
Navigation cache
- Cache the calculated parent-child relationships
- Cache the navigation tree indexed by permission
- Menu/routing cache saves 50-80ms per page
- Cache by user / by permissions
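Caching by permissions rather than by user lets everyone with the same permission set share one tree. A Python sketch under that assumption; `MenuTreeCache`, the link tuples and the key format are hypothetical.

```python
import hashlib


class MenuTreeCache:
    """Caches the visible link list once per distinct permission set."""

    def __init__(self, links):
        self.links = links  # list of (title, required permission or None)
        self.bin = {}
        self.builds = 0     # counts how often a tree is actually built

    def _key(self, permissions):
        joined = ",".join(sorted(permissions))  # order-independent key
        return "menu:" + hashlib.sha256(joined.encode("utf-8")).hexdigest()

    def tree_for(self, permissions):
        key = self._key(permissions)
        if key not in self.bin:
            self.builds += 1
            self.bin[key] = [
                title for title, needed in self.links
                if needed is None or needed in permissions
            ]
        return self.bin[key]


menu = MenuTreeCache([
    ("Home", None),
    ("My account", "access account"),
    ("Admin", "administer site"),
])
anonymous_tree = menu.tree_for(set())
member_tree = menu.tree_for({"access account"})
other_member_tree = menu.tree_for({"access account"})  # shares the cached tree
```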
Cache Design
Design the Cache
Generate Component
Panels handling blocks
- Third-party module
- Design the layout yourself
- Add Drupal blocks to the layout
- Set the cache for each block
Cache settings for each block
Cache Method
- Time based cache (simple cache)
- Page based cache (cache by url)
- Rules based cache (cache by condition)
- Custom cache programming
- read Panels Hash Cache implementation
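The first two methods, time-based and page-based, can be combined in a short sketch. This is an illustrative Python version, not the Panels implementation: `BlockCache` is a hypothetical class, and the injectable `clock` argument exists only to make expiry easy to demonstrate.

```python
import time


class BlockCache:
    """Caches rendered blocks per URL for a fixed lifetime."""

    def __init__(self, lifetime, clock=time.time):
        self.lifetime = lifetime  # seconds a rendered block stays valid
        self.clock = clock        # injectable so expiry can be simulated
        self.bin = {}             # (block id, url) -> (expires at, html)
        self.builds = 0

    def render(self, block_id, url, builder):
        key = (block_id, url)     # page-based: each URL gets its own copy
        now = self.clock()
        entry = self.bin.get(key)
        if entry is not None and entry[0] > now:  # time-based: still fresh
            return entry[1]
        self.builds += 1
        html = builder()
        self.bin[key] = (now + self.lifetime, html)
        return html


now = [100.0]
blocks = BlockCache(60, clock=lambda: now[0])
fresh = blocks.render("latest", "/news", lambda: "<ul>v1</ul>")
cached = blocks.render("latest", "/news", lambda: "<ul>v2</ul>")
now[0] = 161.0  # move past the 60-second lifetime
rebuilt = blocks.render("latest", "/news", lambda: "<ul>v3</ul>")
```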
Cache Ripper, Spider, Engine
- Shell script to warm specific page
- module: cache warmer
- Expire cache from cron, event trigger or shell
- module: expire, purge, cache actions
- Cache Engine (storage)
- External service bridge
Question
Cache in Drupal - for Developer
By Jimmy Huang