Cache in Drupal
Quick notes for Drupalers
Jimmy Huang
2013-07-06 @ DrupalCamp Taipei !!
Jimmy Huang
Drupal
Founder, since 2006
NETivism
Founder, since 2009
jimmy at netivism.com.tw
Why share this?
Users can't see the latest post (the client says the page won't update)
Blank pages appear at random
My CSS effects are gone
A form that used to work can't be submitted anymore
Everyone else sees a broken page, but it looks fine on my computer
Why doesn't my post's view counter increase?
Only one word comes to mind...
Wall of Cache?
Attack on Cache
Attack on Users...
Peopo.org
- over 100,000 nodes (not many)
- over 800,000 node-category relations
- over 300 queries per page (without cache)
- over 140 enabled Drupal modules
- heavy queries (joins on large tables)
31 ms to render a page for an anonymous user
0 ms for a 304 Not Modified response to an anonymous user
376 ms to render a page for a logged-in user
Cache by
Browser
Cache-Control in the HTTP header
Cache-Control: public, max-age=21600
- Only for anonymous visitors
- Tells the browser to take control of caching
- Tells the browser this page expires after 21600 s (6 hours)
- Drupal core already does this by default:
// If the client sent a session cookie, a cached copy will only be served
// to that one particular client due to Vary: Cookie. Thus, do not set
// max-age > 0, allowing the page to be cached by external proxies, when a
// session cookie is present unless the Vary header has been replaced or
// unset in hook_boot().
$max_age = !isset($_COOKIE[session_name()]) || isset($hook_boot_headers['vary']) ? variable_get('page_cache_maximum_age', 0) : 0;
$default_headers['Cache-Control'] = 'public, max-age=' . $max_age;
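The decision made by the core snippet above can be tried outside Drupal with a minimal sketch. The function name `sketch_cache_control()` and its parameters are hypothetical stand-ins for `$_COOKIE`, `session_name()`, `$hook_boot_headers` and `variable_get('page_cache_maximum_age', 0)`; it is not part of Drupal.

```php
<?php
/**
 * Hypothetical helper that mirrors the max-age decision of the core snippet.
 *
 * $cookies            stand-in for $_COOKIE
 * $session_name       stand-in for session_name()
 * $vary_replaced      TRUE if hook_boot() replaced or unset the Vary header
 * $configured_max_age stand-in for the page_cache_maximum_age variable
 */
function sketch_cache_control(array $cookies, $session_name, $vary_replaced, $configured_max_age) {
  // A session cookie means the response may be user-specific, so external
  // caches must not keep it unless Vary was deliberately changed.
  $has_session = isset($cookies[$session_name]);
  $max_age = (!$has_session || $vary_replaced) ? (int) $configured_max_age : 0;
  return 'public, max-age=' . $max_age;
}

// Anonymous visitor without a session cookie: cacheable for 6 hours.
echo sketch_cache_control(array(), 'SESSabc', FALSE, 21600), "\n";
// Visitor with a session cookie: max-age falls back to 0.
echo sketch_cache_control(array('SESSabc' => 'x'), 'SESSabc', FALSE, 21600), "\n";
```

The cookie name `SESSabc` is made up for the example; real Drupal session names are generated per site.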
Etag / Expires / Last-Modified
in HTTP header
- Only for anonymous visitors
- Etag:
- Adds a unique ID to the delivered page
- Last-Modified:
- Used to detect whether the page was modified since the last visit
- Expires:
- Set from the "max-age" of the Cache-Control header
// Entity tag should change if the output changes.
$etag = '"' . $cache->created . '-' . intval($return_compressed) . '"';
header('Etag: ' . $etag);
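How these headers lead to the "0 ms for 304" case can be sketched as follows. This is a simplified illustration, not Drupal's own code: `sketch_is_not_modified()` is a hypothetical helper, and the timestamp is an arbitrary example value. The idea is that the browser sends back the Etag and Last-Modified values it received, and the server answers 304 only if both still match the cached copy.

```php
<?php
/**
 * Hypothetical helper: decide whether a 304 Not Modified can be sent.
 *
 * $server  stand-in for $_SERVER
 * $etag    the Etag of the cached page
 * $created Unix timestamp at which the cached page was created
 */
function sketch_is_not_modified(array $server, $etag, $created) {
  // Both validators must be present in the request.
  if (!isset($server['HTTP_IF_NONE_MATCH'], $server['HTTP_IF_MODIFIED_SINCE'])) {
    return FALSE;
  }
  $if_none_match = $server['HTTP_IF_NONE_MATCH'];
  $if_modified_since = strtotime($server['HTTP_IF_MODIFIED_SINCE']);
  return $if_none_match === $etag && $if_modified_since === (int) $created;
}

$created = 1371350813;
// Same Etag format as in the snippet above.
$etag = '"' . $created . '-' . intval(TRUE) . '"';
$request = array(
  'HTTP_IF_NONE_MATCH' => $etag,
  'HTTP_IF_MODIFIED_SINCE' => gmdate('D, d M Y H:i:s', $created) . ' GMT',
);

if (sketch_is_not_modified($request, $etag, $created)) {
  echo "304 Not Modified\n";
}
```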
Static file cache in browser
- Do not write inline CSS / JavaScript in the page
- Use drupal_add_css on all pages
- Consider drupal_add_js too
Nginx
- Combination of proxy and static HTTP file server
- We don't hit the dynamic backend unless we need to
- Extremely low memory consumption
- Built-in cache mechanism
- Built-in flood control mechanism
- Asynchronous, event-driven request handling
- Handles tons of requests in 1 process
- More effective in multi-processor environments
microcache in Nginx
- Flood protection / DoS protection
- Short lifetime (1 s - 5 s)
- Can be used for logged-in users (use carefully)
- Repeated hits within the lifetime never reach PHP
Boost.module of Drupal
- Only for anonymous visitors
- Generates a static HTML page on the first hit to PHP
- Uses a rewrite rule (in Nginx or Apache) to serve the HTML file
- Expired by
- Cron
- Crawler (Drupal 6 only)
- Node update (Drupal 6 only)
<!-- Page cached by Boost @ 2013-06-16 02:46:53, expires @ 2013-06-16 05:46:53 -->
Serve static html from nginx
10x faster than Apache prefork
nginx: 100,000 requests in 60s
apache: 10,000 requests in 60s
What does Drupal do on a fresh hit?
Warms up PHP on every request
- Loads system-wide variables
- Loads the menu to check whether the user has permission to access the page
- Loads localized data to prepare translation strings
- Loads modules to prepare for work
- Renders forms when needed
- Gathers the content of the page's blocks
- Delivers the page rendered by the template / theme system
- Finally we get a page! So slow...
Runs SQL queries for every one of these needs
OP Code Cache for PHP
Before: 104MB per page
Memory used at: devel_boot()=5.43 MB, devel_shutdown()=104.15 MB
After: 49MB per page
Memory used at: devel_boot()=2.34 MB, devel_shutdown()=49.58 MB
OPcode cache in php-apc cache status
Drupal Cache Stack
caches almost everything
- Page (only for anonymous users)
- Variables
- Session cache (third-party module)
- Localized strings
- Module info/registry
- Template engine registry
- Filtered content
- Forms
- Aggregated content (Block/Views/Panels)
- Aggregated resources (CSS/JS/Image)
Views Cache (by time period or ...)
Panels Cache (per page or ...)
Block Cache (set by module code)
more on Block Caching
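"Set by module code" refers to the `cache` key a module returns from `hook_block_info()` in Drupal 7. A minimal sketch, assuming a hypothetical module named `mymodule` and a hypothetical block `hot_news`; the guard at the top only exists so the snippet also runs outside Drupal, and inside Drupal you would normally wrap the `info` label in `t()`.

```php
<?php
// Outside Drupal these constants do not exist, so define them with the
// values Drupal 7 core uses; inside Drupal this guard is skipped.
if (!defined('DRUPAL_CACHE_PER_ROLE')) {
  define('DRUPAL_CACHE_PER_ROLE', 0x0001);
  define('DRUPAL_CACHE_PER_PAGE', 0x0004);
}

/**
 * Implements hook_block_info() for the hypothetical module "mymodule".
 */
function mymodule_block_info() {
  $blocks = array();
  $blocks['hot_news'] = array(
    'info' => 'Hot news',
    // One cached copy per role combination and per page.
    'cache' => DRUPAL_CACHE_PER_ROLE | DRUPAL_CACHE_PER_PAGE,
  );
  return $blocks;
}

print_r(mymodule_block_info());
```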
Recommended Modules for cache
- Memcache API and Integration or Memcache Storage
- save cache into memcached (instead of hitting the database)
- Views Content Cache
- cache views based on content saves
- more effective because it does not depend on a time period
- Panel Page Cache or Panel Hash Cache
- cache panels based on URL / user / content type
- Cache Actions
- set cache clear rules in a few clicks (with Rules)
- Cache Warmer
- Crawl the specific pages you need
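For the first module in the list, moving cache bins into memcached is done in `settings.php`. The fragment below is a sketch following the Memcache API and Integration module's Drupal 7 setup; the module path and the server address are assumptions that depend on your installation, so check the README of the version you install.

```php
<?php
// Sketch of a Drupal 7 settings.php fragment (path and server are assumptions).
$conf['cache_backends'][] = 'sites/all/modules/memcache/memcache.inc';
// Send all cache bins to memcached by default.
$conf['cache_default_class'] = 'MemCacheDrupal';
// Keep the form cache in the database: it must not be evicted.
$conf['cache_class_cache_form'] = 'DrupalDatabaseCache';
// Assumed single local memcached instance.
$conf['memcache_servers'] = array('127.0.0.1:11211' => 'default');
```

Keeping `cache_form` in the database matches the advice later in these slides that forms should not be cached like other content.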
Cache Strategy
Choose a fitting strategy for each block
- cache in Time Period
- cache per User
- cache per Page (URL)
- cache per User per Page
- cache per Role per Page
- others
- cache per content type
- cache per node
- cache per role
- ...
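Each strategy above boils down to which pieces of context go into the cache ID. A minimal sketch of that idea: the `SKETCH_*` constants reuse the flag values of Drupal 7's `DRUPAL_CACHE_PER_*` constants, but the constant names, `sketch_cache_id()` and the ID format are hypothetical, not Drupal's own implementation.

```php
<?php
// Hypothetical names; the values match Drupal 7's DRUPAL_CACHE_PER_* flags.
const SKETCH_CACHE_PER_ROLE = 0x0001;
const SKETCH_CACHE_PER_USER = 0x0002;
const SKETCH_CACHE_PER_PAGE = 0x0004;

/**
 * Build a cache ID that varies by the requested granularity.
 *
 * $context holds 'uid', 'roles' (array of role IDs) and 'url'.
 */
function sketch_cache_id($base, $granularity, array $context) {
  $parts = array($base);
  if ($granularity & SKETCH_CACHE_PER_ROLE) {
    // Sort so the same set of roles always yields the same ID.
    $roles = $context['roles'];
    sort($roles);
    $parts[] = 'r.' . implode(',', $roles);
  }
  elseif ($granularity & SKETCH_CACHE_PER_USER) {
    $parts[] = 'u.' . $context['uid'];
  }
  if ($granularity & SKETCH_CACHE_PER_PAGE) {
    $parts[] = $context['url'];
  }
  return implode(':', $parts);
}

$context = array('uid' => 7, 'roles' => array(5, 2), 'url' => 'node/1');
// Cache per role per page.
echo sketch_cache_id('hot_news', SKETCH_CACHE_PER_ROLE | SKETCH_CACHE_PER_PAGE, $context), "\n";
// Cache per user per page.
echo sketch_cache_id('hot_news', SKETCH_CACHE_PER_USER | SKETCH_CACHE_PER_PAGE, $context), "\n";
```

The finer the granularity, the more copies are stored and the lower the hit rate, which is why one strategy rarely fits every block.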
Cache Clear Strategy
- Expire after a time period
- Expire when a new version exists
- clear node cache when a node is updated
- clear views / panels cache when a node is updated
- clear user profile page when a user is updated
- clear term page when a term is updated
- Expire on a specific URL parameter
more on this later
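The difference between time-based and event-based expiry can be shown with a small in-memory sketch. `SketchCache` and its cache IDs are hypothetical; in Drupal 7 the prefix clear corresponds roughly to calling `cache_clear_all()` with its wildcard argument set to TRUE.

```php
<?php
/**
 * Hypothetical in-memory cache illustrating the two expiry styles.
 */
class SketchCache {
  private $items = array();

  // Store data together with the Unix timestamp at which it expires.
  public function set($cid, $data, $expire) {
    $this->items[$cid] = array('data' => $data, 'expire' => $expire);
  }

  // Return the cached data, or FALSE when missing or past its lifetime.
  public function get($cid, $now) {
    if (!isset($this->items[$cid]) || $this->items[$cid]['expire'] <= $now) {
      return FALSE;
    }
    return $this->items[$cid]['data'];
  }

  // Drop every entry whose ID starts with $prefix (e.g. after a node update).
  public function clearPrefix($prefix) {
    foreach (array_keys($this->items) as $cid) {
      if (strpos($cid, $prefix) === 0) {
        unset($this->items[$cid]);
      }
    }
  }
}

$cache = new SketchCache();
$cache->set('node:12:full', '<h1>Old title</h1>', 1000 + 3600);
$cache->set('views:hot_news', '<ul>...</ul>', 1000 + 60);

// Time-based expiry: the view is stale once its 60 s lifetime has passed.
var_dump($cache->get('views:hot_news', 1000 + 61));
// Event-based expiry: node 12 was updated, so clear its entries at once.
$cache->clearPrefix('node:12:');
var_dump($cache->get('node:12:full', 1001));
```

The timestamps are passed in explicitly only to keep the example deterministic.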
Drupal is a Query Monster
- Join, Join, Join, Join .......
- Easily eats 300 queries per page
- Saves config / variables in DB
- Saves sessions in DB
- Saves cache in DB
- Saves localized strings in DB
- Saves search index in DB
- Unoptimized queries from Views
- Everything is combined with node
Hot news page
Query Cache Performance
Before query cache: 2301.12ms
After query cache: 2.49ms
SELECT node.title AS node_title, node.nid AS nid, users_node.name AS users_node_name, users_node.uid AS users_node_uid, flag_counts_node.count AS flag_counts_node_count, node_counter.totalcount AS node_counter_totalcount, node_comment_statistics.comment_count AS node_comment_statistics_comment_count, node.created AS node_created, node.uid AS node_uid, 'node' AS field_data_body_node_entity_type, 'node' AS field_data_field_video_id_node_entity_type, 'node' AS field_data_field_img_status_node_entity_type
FROM
{node} node
LEFT JOIN {flag_counts} flag_counts_node ON node.nid = flag_counts_node.content_id AND flag_counts_node.fid = '1'
INNER JOIN {users} users_node ON node.uid = users_node.uid
INNER JOIN {field_data_field_del_status} field_data_field_del_status_value_0 ON node.nid = field_data_field_del_status_value_0.entity_id AND field_data_field_del_status_value_0.field_del_status_value = '0'
INNER JOIN {node_counter} node_counter ON node.nid = node_counter.nid
INNER JOIN {node_comment_statistics} node_comment_statistics ON node.nid = node_comment_statistics.nid
WHERE (( (node.type IN ('post')) AND( (field_data_field_del_status_value_0.field_del_status_value = '0') )AND (node.status = '1') ))
ORDER BY node_created DESC
LIMIT 30 OFFSET 0
"Watch" your views
Calculate Hit Rate
Query Cache Hits = 2800000
Query Cache Inserts = 444400
Hits/(Hits+Inserts) = 86.30%
A high hit rate means the cache memory is well spent.
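The hit rate formula above is easy to script. A minimal sketch with a hypothetical helper, `sketch_hit_rate()`, fed with the two numbers from this slide; on a live server they would presumably come from MySQL's `Qcache_hits` and `Qcache_inserts` status counters, which is an assumption about where the slide's numbers were read.

```php
<?php
/**
 * Hypothetical helper: query cache hit rate in percent.
 */
function sketch_hit_rate($hits, $inserts) {
  $total = $hits + $inserts;
  // Avoid division by zero on a freshly started server.
  return $total > 0 ? 100 * $hits / $total : 0.0;
}

// Numbers from the slide.
printf("%.2f%%\n", sketch_hit_rate(2800000, 444400)); // prints 86.30%
```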
Sweet Combination
Server / Service Combination
- PHP-APC
- Almost 100% hit rate
- Watch out - some versions ignore .module files
- Do not use in a multiple-virtualhost environment
- MySQL Query Cache
- 32MB of memory gives a huge performance boost
- Remember to calculate the hit rate
- Memcached
- client-server architecture
- usable even behind a load balancer
- Sessions should be saved here, but this is still a problem in D7
Anonymous User
- microcache of nginx
- 1-3 s cache time is enough
- ajax is needed for Drupal node view counting
- Best practice at perusio/drupal-with-nginx
- Page Cache to memcached
- Drupal core mechanism, but the cache is saved in memory
- Trouble-free and very stable, but still hits PHP
- Handles session, status message, HTTP header correctly
- recommended: set a maximum cache lifetime
- Cache Warmer of Drupal
- simple drush command to crawl specific pages in cron
Logged in User
- Panels + Panel page cache
- Cache the whole panel based on URL
- Hack the module to support per role per page
- Fix some personalized strings with ajax
- do not cache forms
- Use views cache carefully
- can't run without this
- headache with Contextual / Exposed filter cache
- use Cache Actions
- finally, we can add "update - then - clear cache" rules
- you need debugging skill to find the "cache id"
Debug
Basic Question
- Is the cache enabled?
- Is what I'm seeing cached or not?
- When does the cache expire or refresh?
- Is the page correct when a different user sees it?
(misplaced user name or profile)
- Is the page correct when a different role sees it?
(permission reasons)
- Do the forms work correctly?
- you can't cache forms (security reasons)
Tools and Hints
- Page cache can be verified by the HTTP header
- PHP-APC has a watch script
- Memcached has a script, too
- Use the devel module to see page generation time
- Profiling: XHProf
- on a large site, never "Clear All Cache"
(this will save your life)
- no cached page when an anonymous user has a session
(added in Drupal 7)
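Verifying the page cache by HTTP header can be sketched like this. Drupal 7 core adds an `X-Drupal-Cache: HIT` or `MISS` header when its page cache is enabled; `sketch_page_cache_status()` is a hypothetical helper, and the sample header lines are made up rather than captured from a real site.

```php
<?php
/**
 * Hypothetical helper: read the page cache status from raw header lines.
 */
function sketch_page_cache_status(array $header_lines) {
  $name = 'X-Drupal-Cache:';
  foreach ($header_lines as $line) {
    // Header names are case-insensitive.
    if (stripos($line, $name) === 0) {
      return strtoupper(trim(substr($line, strlen($name))));
    }
  }
  // Header absent: page cache disabled, or another layer answered.
  return 'UNKNOWN';
}

$sample = array(
  'HTTP/1.1 200 OK',
  'Cache-Control: public, max-age=21600',
  'X-Drupal-Cache: HIT',
);
echo sketch_page_cache_status($sample), "\n";
```

Against a live site, the array returned by PHP's `get_headers($url)` can be passed in directly; request the page as an anonymous visitor, since logged-in pages are not served from the page cache.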
May the cache be with you.
Thank you!