HHVM: The Fiddly Bits

Ian Littman

@iansltx

http://ian.im/hh15

whoami

  • Back-end dev for Access Empowerment
  • I build APIs...and occasionally stuff in front of them
  • I use HHVM in production for a non-AE project
  • iansltx on Twitter, GitHub, most other places

What we'll talk about

  • HHVM for PHP
    • The theory behind it all
    • Basic setup
    • Performance (benchmarks!)
    • Configuration gotchas
    • Compatibility gotchas
  • HHVM for Hack
    • Basic setup
    • Language features
    • What does it look like in practice?
    • Is it faster than PHP on HHVM?
  • Case studies

The Theory

  • Code doesn't change that much during runtime, even for a dynamic language like PHP
  • If we run the code a bit, we can figure out which parts are predictable in terms of types
  • If we can make (correct) assumptions about types, etc. within a block of code, we can omit type-checking code within that block
  • In some cases, we can go all the way to machine code with the above optimizations...for a nice performance boost

The Implementation

  1. Profile the code a bit in interpreter-only mode
    1. First 11 requests by default
    2. hhvm.eval.jit_warmup_requests = int
    3. Figure out which types are seen most
  2. JIT compile hot code at runtime
    1. HHBC -> HHIR
    2. HHIR -> VASM (arch independent)
    3. VASM -> ASM/machine code (x86-64)
  3. Bail out to interpreter mode if needed
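
The three steps above can be modeled with a toy sketch. This is Python, not HHVM internals: TieredFunction, add_generic and add_ints are hypothetical names made up for illustration, and "compiling" here just means switching to a function that assumes its argument types.

```python
from collections import Counter

WARMUP_CALLS = 11  # mirrors the default warm-up count quoted above


class TieredFunction:
    """Toy model: interpret first, specialize after warm-up, bail out on a guard miss."""

    def __init__(self, generic, specializations, warmup=WARMUP_CALLS):
        self.generic = generic                  # slow path, handles any types
        self.specializations = specializations  # hypothetical: arg types -> fast path
        self.warmup = warmup
        self.calls = 0
        self.seen = Counter()                   # profile of argument types seen
        self.hot_types = None
        self.fast = None
        self.bailouts = 0

    def __call__(self, *args):
        types = tuple(type(a) for a in args)
        if self.fast is None:
            # Step 1: profile while running the generic ("interpreted") path.
            self.calls += 1
            self.seen[types] += 1
            if self.calls >= self.warmup:
                # Step 2: "compile" for the most common types, if we know how.
                hot, _ = self.seen.most_common(1)[0]
                if hot in self.specializations:
                    self.hot_types, self.fast = hot, self.specializations[hot]
            return self.generic(*args)
        if types == self.hot_types:
            return self.fast(*args)             # guard passed: use the fast path
        # Step 3: guard failed, bail out to the generic path.
        self.bailouts += 1
        return self.generic(*args)


def add_generic(a, b):
    # Stand-in for interpreted code that checks types on every operation.
    if isinstance(a, str) or isinstance(b, str):
        return str(a) + str(b)
    return a + b


def add_ints(a, b):
    # Stand-in for JIT output that assumes both operands are ints.
    return a + b


add = TieredFunction(add_generic, {(int, int): add_ints})
for i in range(WARMUP_CALLS):
    add(i, 1)  # warm-up: only (int, int) is ever seen
```

After the loop, calls with two ints take the specialized path, while a call such as add("a", 1) fails the guard and falls back to add_generic.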

How did we get to HHBC?

  • Depends on whether we're in Repo Authoritative mode
  • Interpreter mode (PHP5/7-like)
    • On-request...
    • Lex, parse, build Abstract Syntax Tree
    • Optimize obvious cases
    • Build HipHop Assembly codes
    • Build HipHop Bytecode (HHBC)
  • RepoAuth mode (ahead-of-time analysis step)
    • Static analysis to eliminate dead code, optimize
    • All of the above, up to HHBC
    • Store HHBC in a SQLite database, read at runtime

How do I get RepoAuth?

  • hhvm --hphp -thhbc -o {dir to place hhvm.hhbc} --input-list {list of filenames to compile}
  • /etc/hhvm/server.ini
    • hhvm.repo.
      • authoritative = true
      • central.path = {path to hhbc file}
    • hhvm.server.
      • source_root = {path used in hhbc compilation}
  • ~20% speed improvement
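
The --input-list file has to come from somewhere. Below is a minimal sketch of one way to generate it, assuming the list is plain text with one path per line, relative to the source root. The helper names are hypothetical, and skipping directories named "tests" is my own assumption, not something the compiler requires.

```python
import os


def build_input_list(source_root, skip_dirs=("tests",)):
    """Return sorted .php paths under source_root, relative to it."""
    paths = []
    for dirpath, dirnames, filenames in os.walk(source_root):
        # Prune in place so os.walk does not descend into skipped directories.
        dirnames[:] = [d for d in dirnames if d not in skip_dirs]
        for name in filenames:
            if name.endswith(".php"):
                full = os.path.join(dirpath, name)
                paths.append(os.path.relpath(full, source_root))
    return sorted(paths)


def write_input_list(source_root, list_path):
    """Write one path per line to list_path and return the paths written."""
    paths = build_input_list(source_root)
    with open(list_path, "w") as fh:
        fh.write("\n".join(paths) + "\n")
    return paths
```

The resulting file can then be passed to the hhvm --hphp command above via --input-list.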

RepoAuth example

Unable to stat file vendor/aura/sql/tests/globals.dist.php
Unable to stat file vendor/aura/sql/tests/globals.php
Unable to stat file vendor/slim/slim/tests/Slim/Slim.php
Unable to stat file vendor/twilio/sdk/PEAR/PackageFileManager/File.php
Unable to stat file vendor/twilio/sdk/PEAR/PackageFileManager2.php
...
Unable to stat file vendor/twilio/sdk/tests/Twilio/RequestValidator.php
Unable to stat file vendor/twilio/sdk/tests/Twilio/Twiml.php
analyzeProgram...
analyzeProgram took 0'00" (77696 us) wall time
parsing inputs took 0'00" (170148 us) wall time
pre-optimizing...
pre-optimizing took 0'00" (269137 us) wall time
creating binary HHBC files...
creating binary HHBC files took 0'01" (1539664 us) wall time
all files saved in /var/run/hhvm ...
running hphp took 0'02" (2506087 us) wall time

Config Gotchas

  • Thread count defaults to 2 × core count
    • On smaller machines, this kills high-concurrency performance
    • hhvm.server.thread_count = int
  • Logging (hhvm.error_handling.)
    • call_user_handler_on_fatals = 0
      • Error code 2^24 if turned on...catch all the fatals!
    • throw_exception_on_bad_method_call = 1
      • SPL exception...different in some cases than PHP5
  • func_get_args() returns the current argument values, not the originals, if they were changed inside the function
    • PHP 7 does this too
  • No MySQL PDO over SSL
  • No easter_date()

Benchmarks!

Setup

  • Packages exist for Ubuntu
  • Uses FastCGI, just like php-fpm
  • Compiling takes an hour or two; see the GitHub wiki for instructions
    • CentOS 7
    • Amazon Linux
    • Short-circuit Intel TBB's test suite to go faster
  • Drop-in replacement for PHP5

Hacklang

  • The reason PHP 7 has a bunch of new language features
  • A superset of PHP, minus a few ugly bits
  • Currently syntactic sugar, plus async support, plus static analysis and type-checking
  • Minimal performance impact for sequentially executed code...for now

How do I write it?

  • Plugins
    • Sublime
    • GitHub Atom
    • Vim
    • Emacs
  • Support in PhpStorm coming soon

Language Features

  • Types...types, everywhere!
    • Scalar typehints (rather strict)
    • Return typehints (including scalars)
    • Nullable types, e.g. function bar(?string $bar)
    • Generics
    • Collection types
      • Vector
      • Map
      • Set
      • Pair

Language Features

  • Shapes (lightweight typed structures)
  • Type aliases
  • Annotations (not docblocks)
    • <<__annotation>> syntax
    • e.g. <<__Memoize>>
  • XHP - HTML as a first-class citizen
    • Escaping for "free"
    • Extensible syntax, e.g. for Bootstrap
    • Available as a PHP extension

Language Features

  • async/await
    • Concurrency without callback hell (cooperative, not true multithreading)
    • Non-blocking I/O
      • curl
      • files
      • memcached
      • mysql (as of 3.6!)
    • This is where the big performance improvement vs. PHP comes from
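
To show the shape of the idea, here is a rough analogy in Python's asyncio. This is not Hack syntax, and fetch is a hypothetical stand-in for a non-blocking call such as curl or memcached. Two awaited operations overlap on a single-threaded event loop, with no callbacks in sight.

```python
import asyncio


async def fetch(name, log):
    # Hypothetical stand-in for a non-blocking I/O call.
    log.append(f"start {name}")
    await asyncio.sleep(0)  # yield to the event loop, as real I/O would
    log.append(f"end {name}")
    return name.upper()


async def main():
    log = []
    # Both "requests" are in flight at once; results come back in call order.
    results = await asyncio.gather(fetch("a", log), fetch("b", log))
    return results, log


results, log = asyncio.run(main())
```

The log shows both operations starting before either finishes, which is the overlap that makes waiting on several slow backends cheaper than calling them one after another.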

Language Features

  • Type checker (hh_client)
    • Watches file system
    • Runs on save, not at compile time or runtime
    • Detailed information on type errors etc.
    • Doesn't catch everything...
    • ...but that may be because I mixed Hack and PHP

Typechecker output

/usr/share/nginx/html/public/index.php:75:26,37: Too few arguments (Typing[4104])
  /tmp/hh_server_root/hhi_1e44a56a/collections/Vector.hhi:50:19,29: Definition is here
/usr/share/nginx/html/public/index.php:80:33,36: Could not find method push in an object of type Vector (Typing[4053])
  /usr/share/nginx/html/public/index.php:75:26,37: This is why I think it is an object of type Vector
  /tmp/hh_server_root/hhi_1e44a56a/collections/Vector.hhi:45:13,18: Declaration of Vector is here
/usr/share/nginx/html/public/index.php:80:33,36: Could not find method push in an object of type Vector (Typing[4053])
  /usr/share/nginx/html/public/index.php:75:26,37: This is why I think it is an object of type Vector
  /tmp/hh_server_root/hhi_1e44a56a/collections/Vector.hhi:45:13,18: Declaration of Vector is here
/usr/share/nginx/html/public/index.php:92:34,38: Invalid argument (Typing[4110])
  /usr/share/nginx/html/public/index.php:102:34,36: This is an int
  /usr/share/nginx/html/public/index.php:91:33,38: It is incompatible with a string
/usr/share/nginx/html/public/index.php:92:34,38: Invalid argument (Typing[4110])
  /usr/share/nginx/html/public/index.php:102:34,36: This is an int

Yep, you can mix 'em

  • Can call Hack from PHP, and vice versa
  • Hackificator (the reason most of Facebook's code is Hack)
  • Can translate back and forth automatically*
    • * Not possible for code that uses async/await
  • ...so you can start with a single file and work outward
  • Here's what I did in an afternoon...raphple runs on it

That's it!

Questions?

@iansltx / http://ian.im/hh15