Taming Dexerto.com

Dexerto.com

Traffic

Esports is popular.

 

Constant high traffic with peaks between 15:00 and 22:00

 

 

Initial Findings

  • Regular website outages
  • Mostly database related
  • CraftCMS codebase
  • No template caching
  • 300+ queries per page request
  • A plan to move to full Cloudflare caching had been discussed but stalled because of the advert implementation

Initial Findings (continued)

  • Hastily applied fix using nginx reverse proxy caching
  • Fixed 60s cache TTL, with no respect for cache headers
  • Admin only content leaking into cache
  • Desktop and mobile clients detected by user agent and cached separately

Step 1: Reduce DB Load

  • Add template caching
    • Target modules shared between pages
  • Replace simple actions with raw PHP files
    • No need to boot all plugins and do auth checks
  • Profile and remove any slow queries
    • Found a query which was performing a full table scan using 'LIKE' whenever a 404 was encountered. Avg execution time 13s 😬
{% cache globally using key "advert-1-" ~ currentSite.id ~ "-" ~ category.id ~
   "-" ~ (craft.app.request.isMobileBrowser(true) ? 'mobile' : '') for 120 seconds %}
    {# advert module markup goes here #}
{% endcache %}
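
The cache key above varies by site, category and device class, and each entry lives for 120 seconds. As a rough illustration of that idea only (this is not how Craft implements template caching), here is a minimal Python sketch. `advert_cache_key` and `TTLCache` are hypothetical names, and the injectable clock exists purely so the expiry can be exercised without waiting.

```python
import time


def advert_cache_key(site_id: int, category_id: int, is_mobile: bool) -> str:
    """Build a key shaped like the Twig one: advert-1-<site>-<category>-<device>."""
    device = "mobile" if is_mobile else ""
    return f"advert-1-{site_id}-{category_id}-{device}"


class TTLCache:
    """Tiny in-memory cache with a per-entry time to live (illustrative only)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._store = {}  # key -> (expires_at, value)

    def get_or_render(self, key, ttl_seconds, render):
        now = self._clock()
        hit = self._store.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]  # still fresh, skip the expensive render
        value = render()
        self._store[key] = (now + ttl_seconds, value)
        return value
```

Every extra dimension in the key multiplies the number of entries that must be rendered and stored, so the key should only vary on things that really change the output.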

Step 1: Results

  • DB hits per page reduced by 60%+
  • Overall DB load reduced by 50%
  • But after 3 days of calm the site went down again
    • Triggered by a database backup that took several minutes and created hundreds of database connections
    • Root cause: Craft's cache elements table had grown to 2GB
  • Fix: truncated the table and disabled Craft template caching

Step 2: Unify HTML

  • Rewrite front end ad code to work across both mobile and desktop devices.
    • Had to learn some of the intricacies of Google Publisher Tag in order to get this working
  • Remove admin functionality from front end templates
  • No longer necessary to cache mobile and desktop individually, remove this from nginx config
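
To see why dropping the device class from the cache key helps, here is a toy Python sketch that counts how many requests reach the origin under each keying scheme. The request list is made up for illustration and TTL expiry is ignored, so the numbers say nothing about real traffic.

```python
def origin_hits(requests, key_fn):
    """Count requests that miss the cache, i.e. the first request for each key."""
    seen = set()
    hits = 0
    for url, device in requests:
        key = key_fn(url, device)
        if key not in seen:
            seen.add(key)
            hits += 1
    return hits


# Hypothetical traffic sample: (url, device class)
requests = [
    ("/a", "mobile"),
    ("/a", "desktop"),
    ("/b", "mobile"),
    ("/a", "mobile"),
    ("/b", "desktop"),
]

per_device = origin_hits(requests, lambda url, device: (url, device))
unified = origin_hits(requests, lambda url, device: url)
```

With one HTML output for every device, each URL only needs to be rendered once per cache lifetime instead of once per device class.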

Step 2: Results

  • Consolidated HTML output across mobile and desktop and for admin/non-admin users
  • ~30% reduction in article requests hitting Craft by removing mobile specific caching
  • Cache no longer being polluted with admin-only content

Step 3: Dynamic Content

  • Remove regularly updated content from main page templates
  • Create ajax endpoints to load these bits of content async
  • Create JS to pull this content into appropriate pages immediately on page load
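
A minimal sketch of the split, written in Python rather than the site's Twig and JS. The `/fragments/trending` URL, the `data-fragment-url` attribute and both function names are assumptions for illustration. The TTLs reuse figures mentioned elsewhere in this deck (up to 1 hour for articles, 60s for dynamic content).

```python
from html import escape
from typing import Sequence

# Cache lifetimes in seconds per kind of response
ROUTE_TTLS = {"article": 3600, "trending": 60}


def render_article_shell(title: str, body_html: str) -> str:
    """Long-lived page: article content plus an empty slot for trending content."""
    return (
        f"<article><h1>{escape(title)}</h1>{body_html}</article>"
        '<aside data-fragment-url="/fragments/trending"></aside>'
    )


def render_trending_fragment(trending_titles: Sequence[str]) -> str:
    """Short-lived fragment served from its own URL and fetched by front-end JS."""
    items = "".join(f"<li>{escape(t)}</li>" for t in trending_titles)
    return f"<ul>{items}</ul>"
```

The article shell never touches the trending data, so it can stay cached while the fragment is refreshed on its own schedule.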

Step 3: Results

  • Article page content can now be cached for long periods of time whilst still allowing 'trending' content to update regularly
  • Cache headers can still be set for trending modules but with a much lower TTL
  • Trending content modules are shared across pages. By giving it its own URL it can be cached independently reducing overall DB hits

Step 4: Cloudflare

  • Set long TTL cache headers for articles
  • Set short TTL cache headers for dynamic content
  • Activate full page caching in Cloudflare
  • Create page rules in CF for special cases:
    • /admin
    • /actions
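{# Default edge TTL of 60 seconds unless the including template sets expiryTime #}
{% set expiryTime = expiryTime ?? 60 %}

{% if currentUser or preventAllPageCaching %}
    {# Never cache responses for logged-in users or pages that opt out #}
    {% header "Cache-Control: no-store, no-cache, must-revalidate" %}
{% else %}
    {% set expiry = now|date_modify('+' ~ expiryTime ~ ' seconds') %}
    {# Browsers cache for 60s; shared caches such as Cloudflare use s-maxage #}
    {% header "Cache-Control: public, max-age=60, s-maxage=" ~ expiryTime %}
    {% header "Pragma: cache" %}
    {% header "Expires: " ~ expiry|date('D, d M Y H:i:s', 'GMT') ~ " GMT" %}
{% endif %}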
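
The same decision can be written as a plain function, which makes it easy to check in isolation. This is a Python sketch of the logic in the template above, not code from the site. `cache_headers` is a hypothetical name and the `Pragma` header is left out for brevity.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime
from typing import Dict


def cache_headers(
    logged_in: bool,
    prevent_caching: bool,
    now: datetime,
    expiry_seconds: int = 60,
) -> Dict[str, str]:
    """Return response headers: no-store for private pages, public TTLs otherwise."""
    if logged_in or prevent_caching:
        return {"Cache-Control": "no-store, no-cache, must-revalidate"}
    expires_at = now.astimezone(timezone.utc) + timedelta(seconds=expiry_seconds)
    return {
        # max-age applies to browsers, s-maxage to shared caches such as a CDN
        "Cache-Control": f"public, max-age=60, s-maxage={expiry_seconds}",
        "Expires": format_datetime(expires_at, usegmt=True),
    }
```

Keeping `max-age` short while raising `s-maxage` lets the CDN hold a page for a long time, which is safe only if the CDN copy can be purged on demand.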

Step 4: Results

  • 90% of article requests are now served directly from Cloudflare's edge cache
  • Dynamic content mostly served from CF but still refreshed every 60s
  • Average cached TTFB ~70ms on a broadband connection
  • I'm not allowed to look at GA, but I expect bounce rate has improved significantly!

Step 5: Busting

  • Create a Craft plugin to bust the CF cache when an article is saved
  • Only bust the article's URL and any index pages on which the article appears (to make sure title and image changes are reflected throughout)
  • The rules for working out all the affected URLs were somewhat complex, so the existing plugin 'upper' couldn't be used
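// Imports at the top of the plugin class file (assuming Craft 3 namespaces)
use Craft;
use craft\services\Elements;
use yii\base\Event;

// Registered from the plugin's init() method
Event::on(
    Elements::class,
    Elements::EVENT_AFTER_SAVE_ELEMENT,
    function (Event $event) {
        $entry = $event->element;
        // Only entries trigger a purge; other element types are ignored
        if ($entry->refHandle() === 'entry') {
            try {
                self::$plugin->cloudflareService->invalidateForEntry($entry);
            } catch (\Exception $e) {
                // Surface purge failures to the editor as a control panel error
                Craft::$app->session->setError($e->getMessage());
            }
        }
    }
);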
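
What `invalidateForEntry` does is not shown in the deck. As a hedged sketch of the general shape, the Python below collects the URLs for one entry and builds payloads for Cloudflare's purge-by-URL endpoint (`POST /zones/{zone_id}/purge_cache` with a `files` list). The helper names and the URL rules are hypothetical. The batch size of 30 is an assumption, so check the limit for your plan. Nothing here sends a request.

```python
from typing import Dict, List, Tuple

CF_API_BASE = "https://api.cloudflare.com/client/v4"


def urls_for_entry(base_url: str, entry_path: str, index_paths: List[str]) -> List[str]:
    """Article URL first, then each index page it appears on, without duplicates."""
    urls: List[str] = []
    for path in [entry_path, *index_paths]:
        url = base_url.rstrip("/") + "/" + path.lstrip("/")
        if url not in urls:
            urls.append(url)
    return urls


def build_purge_requests(
    zone_id: str, urls: List[str], batch_size: int = 30
) -> List[Tuple[str, Dict[str, List[str]]]]:
    """Split the URLs into batches, one (endpoint, JSON body) pair per API call."""
    endpoint = f"{CF_API_BASE}/zones/{zone_id}/purge_cache"
    return [
        (endpoint, {"files": urls[i : i + batch_size]})
        for i in range(0, len(urls), batch_size)
    ]
```

Purging a short, explicit list of URLs keeps the rest of the edge cache warm, which is what prevents a save from turning into a DB spike.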

Step 5: Results

  • Cache invalidation via CF's API takes around 10 seconds so article updates are made live pretty much immediately
  • Cache busts are only applied to a small number of URLs so they don't result in big DB spikes

Step 5: Problem

  • Article pages are only cache busted when the article is saved
  • If content that appears on the page but isn't attached directly to the article is updated, the cache isn't purged, e.g. advert code which sits inline in articles
  • Cache busting all articles when ad code is updated would bring down the site
  • Currently OK to wait until caches naturally expire, max 1 hour

Next Steps

  • Most of the remaining DB load caused by tracking view counts on articles and tags
  • This is used to calculate trending articles and tags
  • Pulling this functionality out to another, non-db reliant service would be a good next step. Serverless functions would probably work well for this
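
One possible shape for such a service, sketched in Python: views are counted in memory in fixed time buckets, and "trending" is the top N over a sliding window. The class name, bucket size and window length are all assumptions. A real service would also need shared storage across instances, which this sketch leaves out.

```python
from collections import Counter, deque


class TrendingCounter:
    """Sliding-window view counter. Assumes timestamps never go backwards."""

    def __init__(self, bucket_seconds: int = 60, window_buckets: int = 60):
        self.bucket_seconds = bucket_seconds
        self.window_buckets = window_buckets
        self._buckets = deque()  # (bucket_index, Counter) in time order

    def _evict(self, current_index: int) -> None:
        # Drop buckets that have fallen out of the window
        while self._buckets and self._buckets[0][0] <= current_index - self.window_buckets:
            self._buckets.popleft()

    def record(self, item_id: str, now: float) -> None:
        index = int(now // self.bucket_seconds)
        if not self._buckets or self._buckets[-1][0] != index:
            self._buckets.append((index, Counter()))
        self._buckets[-1][1][item_id] += 1
        self._evict(index)

    def top(self, n: int, now: float) -> list:
        self._evict(int(now // self.bucket_seconds))
        totals = Counter()
        for _, counts in self._buckets:
            totals.update(counts)
        return [item for item, _ in totals.most_common(n)]
```

Recording a view then costs a dictionary increment rather than a database write, and old views age out without a cleanup job.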

 

  • Template code still needs cleaning up to reduce the number of database queries generated when a page cache expires
  • This will require a significant refactor and will need to carefully maintain existing business logic

Thanks!

Questions?

@mattgrayisok
