Analyzing and optimizing native code for portability to the Web using Emscripten, Native Client and ConConJS


Raúl G. Roa

 Agenda


  • Glossary
  • Motivation
  • Previous Work
  • Methodology
  • Artifact — ConConJS
  • Future Work
  • Conclusion




Glossary

Native Code




Native Code or Machine code is a set of instructions executed directly by a computer's central processing unit (CPU).

Machine Code



W65C816S Machine Code

Assembly Code




An assembly language is a low-level programming language for a computer, or other programmable device, in which there is a very strong (generally one-to-one) correspondence between the language and the architecture's machine code instructions.

Assembly Code


"Hello world!" program for DOS in MASM style assembly

Unmanaged Code




Unmanaged code refers to code that does not need a runtime to execute. Often written in a programming language such as C or C++, which is compiled directly into machine code.


Managed Code



Managed code refers to code written in programming languages such as C#, VB.NET, or Java and executed in a virtual environment (such as the .NET CLR or the JVM) that "simulates" a CPU in software.




Motivation





How can we make our large, native-code-based, graphics-accelerated applications portable to all modern devices at the same time?

Without:
  • Having to write platform specific code
  • Higher costs

Software Portability





Software portability is often cited as desirable, but rarely receives systematic attention.

"Most of its basis are anecdotes and case studies."

Why Portability


  • Platform agnostic code
    • Unified code base
  • Portable software is more robust
  • Portability expands your market
  • Portability provides freedom






"Write once, run everywhere"






Many programming languages and APIs claim to be "portable"




We will focus on C, C++ and OpenGL

C and C++ are everywhere


C and C++ Relevance



  • Nearly all professional game development middleware is written in C/C++, since a single code base can support many target platforms

  • Good for large code-bases (static type system, compiled language, mature compilers and debuggers)

OpenGL is also everywhere (almost)







But writing code in these technologies does not ensure portability.

Regardless...




All modern devices can be a single target platform...

The Web


  • Biggest open & standardized platform
  • Great for reaching people
  • Has evolved organically in the digital age
  • Beaten all odds and outlasted all adversity
  • Has scaled successfully beyond all reason

Why not make our content available for everyone?

Web's Standardized Stack

Document Object Model
  • HTML 5
  • CSS 3.0
  • JavaScript (ECMAScript [ECMA-262 specification and ISO/IEC 16262])

Rendering APIs
  • CanvasRenderingContext2D
  • WebGL






BUT...

Game developers don't do Web Stuff.



"HTML is not even a programming language."

"JavaScript is slow."

"There is a large number of tools built using established languages, such as C and C++, that need to be ported."






What then?



What if?



C/C++ => %$&# => Web






Almighty LLVM to the rescue.

Typical C/C++ Compilation Model

LLVM Compilation Model





LLVM Compilation Model



LLVM

The LLVM Project is a collection of modular and reusable compiler and tool-chain technologies.





Demo






Also, other things had to happen...

JavaScript Engines


Late 2008 / early 2009: the race for fast JavaScript starts

  • V8 (Google)
  • TraceMonkey (Mozilla)
  • Nitro (Apple)

asm.js


  • Is a JavaScript specification
  • It's just (a subset of) JavaScript
  • Allows the Web to become a compilation target



SOLVED!



C/C++ => LLVM => Web




Previous Work

Previous Work


How to run native code in the Web

  • Plugins: limited, browser-specific APIs
    • NPAPI
    • PPAPI
    • ActiveX

Plugins


Web browser         NPAPI   PPAPI   ActiveX
Internet Explorer                      X
Chrome/Chromium               X
Firefox               X
Safari                X
Opera                 X

Previous Work


How to run native code in the Web

  • Plugins — Limited browser specific API

  • Native Client — Sandbox for native code, only works in Chrome Web browsers

Native Client

Native Client Application
Typical Web Application

Native Client Tool-chain Workflow






Demo

Native Client

The Good

  • Performance is similar to native code. (5 - 15 percent slowdown)
  • C/C++ Standard Library support
  • Threading support
  • Dynamic Linking support (when using NaCl)

Limitations

  • Since I/O operations are sandboxed, code needs to be adapted
  • Only works on Chrome
  • Only static linking support (when using PNaCl)

Previous Work


How to run native code in the Web

  • Plugins — Limited browser specific API

  • Native Client — Sandbox for native code, only works in Chrome Web browsers

  • JavaScript cross-compilation (Emscripten) — Works in all Web Browsers, performance hit

Emscripten Tool-chain Workflow






Demo

Emscripten

The Good

  • Runs in all browsers with JavaScript support
  • C/C++ Standard Library support
  • Regular C/C++ I/O support (emulated)

Limitations

  • No threading support (out of the box)
  • Performs slower than native code (50 - 100 percent slowdown)
  • Only static linking
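
The limitations above come from the slides. A further adaptation that Emscripten ports commonly need, which the slides do not mention, is restructuring a blocking render loop, because code running in a browser has to hand control back to the event loop between frames. The following is a minimal sketch under that assumption: `App`, `step_frame` and `run_frames` are hypothetical names chosen for illustration, while `emscripten_set_main_loop` and `emscripten_cancel_main_loop` are the Emscripten calls for registering and stopping a per-frame callback. The native branch is an ordinary loop.

```cpp
#include <cassert>
#ifdef __EMSCRIPTEN__
#include <emscripten.h>
#endif

// Hypothetical application state; a real renderer would hold GL resources here.
struct App {
    int frames_rendered;
    int frames_wanted;
};

// The Emscripten callback takes no arguments, so the state lives in a global.
static App g_app = {0, 0};

// One iteration of the loop: update and draw a single frame.
void step_frame() {
    ++g_app.frames_rendered;
#ifdef __EMSCRIPTEN__
    // Stop the browser-driven loop once enough frames were produced.
    if (g_app.frames_rendered >= g_app.frames_wanted) {
        emscripten_cancel_main_loop();
    }
#endif
}

// Runs the requested number of frames and returns how many were rendered.
int run_frames(int frames_wanted) {
    assert(frames_wanted >= 0);
    if (frames_wanted == 0) return 0;
    g_app.frames_rendered = 0;
    g_app.frames_wanted = frames_wanted;
#ifdef __EMSCRIPTEN__
    // fps = 0 lets the browser pick the rate (requestAnimationFrame).
    // With simulate_infinite_loop = 1 this call does not return to the caller.
    emscripten_set_main_loop(step_frame, 0, 1);
#else
    // Native build: a plain blocking loop is fine.
    while (g_app.frames_rendered < g_app.frames_wanted) {
        step_frame();
    }
#endif
    return g_app.frames_rendered;
}
```

In a native build, `run_frames(3)` calls `step_frame` three times and returns 3. Keeping the per-frame work in one function is what lets the same code base serve both targets.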

Previous Work

  • JavaScript engines have gotten fast enough to run large compiled code bases
    • Compiled JavaScript can be faster than "regular" handwritten JavaScript

Cross-compiling to JavaScript is not new:





Methodology

Intention




Evaluate the challenges of porting large native code bases of graphics applications, written in C and C++, to run on the Web using Native Client and Emscripten.

How?

Analyze & diagnose existing native code bases
    • Summarize portability guidelines compliant with Web standards
    • Identify coding patterns that may affect the performance of cross-compiled code

Resulting in:
  • Systematic approach for creating portable applications
    • ConConJS 
      • Native Client / Emscripten build system




What is ConConJS?




ConConJS (IPA: /kənkəndʒeɪz/) is a diagnosis tool for multi-platform graphics applications.

It serves as a proxy for cross-compilers such as Native Client and Emscripten.

ConConJS Pipeline Overview

ConConJS Phases


The parser evaluates code against a given set of specification databases.

The data extractor retrieves relevant information about portability and performance.

The cross-compiler produces, when possible, a Web application from the given code base.
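
The slides do not show how these phases are implemented, so the following is only a sketch of the general idea of checking code against a specification database. Every name in it (`SpecDatabase`, `Diagnosis`, `diagnose`, `example_spec`) is hypothetical, and the "database" is reduced to a set of API names assumed to be available on the target. The example entries rely on one known fact: immediate-mode calls such as `glBegin` and `glEnd` are not part of OpenGL ES 2.0, on which WebGL is based.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical specification database: API names assumed available on the target.
using SpecDatabase = std::set<std::string>;

// Hypothetical result of the parse and extract steps.
struct Diagnosis {
    std::vector<std::string> unsupported;  // calls missing from the database
    // Cross-compilation is attempted only when nothing was flagged.
    bool can_cross_compile() const { return unsupported.empty(); }
};

// Checks every API call found in the code base against the database
// and collects the ones the target does not provide.
Diagnosis diagnose(const std::vector<std::string>& calls,
                   const SpecDatabase& spec) {
    // An empty database would flag every call, so treat it as a caller error.
    assert(!spec.empty());
    Diagnosis result;
    for (const std::string& name : calls) {
        if (spec.count(name) == 0) {
            result.unsupported.push_back(name);
        }
    }
    return result;
}

// Tiny illustrative database; a real one would cover a whole specification.
SpecDatabase example_spec() {
    return {"glBindBuffer", "glDrawArrays", "glUseProgram"};
}
```

With this sketch, `diagnose({"glBegin", "glDrawArrays", "glEnd"}, example_spec())` flags `glBegin` and `glEnd`, and `can_cross_compile()` returns false until those calls are replaced.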

The Parser





Demo

The Extractor


 





Demo

The cross-compiler






Demo




Performance

CPU Language Benchmarks

 

Benchmark              CPP (MFLOPS)   JS (MFLOPS)   Ratio (JS/CPP)
Convolution 1                   911            34             3.7%
Recursive Filter 1             1074           929            86.5%
Fourier Transform 1            1186            75             6.3%
Sinc Interpolation 1            391            28             7.2%

Higher ratio is better
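
The Ratio column is consistent with dividing the JS figure by the CPP figure. A short sketch of that arithmetic, with the function name `ratio_percent` chosen here for illustration:

```cpp
#include <cassert>
#include <cmath>

// JavaScript throughput as a percentage of native (CPP) throughput.
double ratio_percent(double cpp_mflops, double js_mflops) {
    assert(cpp_mflops > 0.0);
    return 100.0 * js_mflops / cpp_mflops;
}
```

For the Convolution 1 row, `ratio_percent(911, 34)` gives roughly 3.73, which matches the 3.7% on the slide after rounding to one decimal place.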

Graphics Benchmarks


Target                                                        Vertices/sec   Slowdown
Native Code (GCC tool chain)                                      25679768       1
Web Application (ConConJS tool chain / Emscripten back-end)       11010000       2.33

Native vs. JavaScript without ConConJS
Higher verts/sec is better, lower slowdown is better

Graphics Benchmarks


Target                                                        Vertices/sec   Slowdown
Native Code (GCC tool chain)                                      25680258       1
Web Application (ConConJS tool chain / Emscripten back-end)       12845890       1.99

Native vs. JavaScript after ConConJS
Higher verts/sec is better, lower slowdown is better
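
The slowdown figures in the two graphics tables appear to be the native rate divided by the Web rate. The sketch below reproduces that arithmetic and also derives the relative gain between the two Web runs; the function names are hypothetical, and the gain is computed here from the slide figures rather than reported by the author.

```cpp
#include <cassert>
#include <cmath>

// Slowdown factor: how many times slower the Web build is than the native one.
double slowdown(double native_rate, double web_rate) {
    assert(web_rate > 0.0);
    return native_rate / web_rate;
}

// Relative change from one throughput figure to another, in percent.
double gain_percent(double before, double after) {
    assert(before > 0.0);
    return 100.0 * (after - before) / before;
}
```

Using the slide figures, `slowdown(25679768, 11010000)` is about 2.33 and `slowdown(25680258, 12845890)` is about 2.00 (shown as 1.99 on the slide, which suggests truncation rather than rounding). `gain_percent(11010000, 12845890)` is about 16.7, meaning the Web build processed roughly 16.7% more vertices per second after ConConJS.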

Future Work


  • ConConJS should be able to refactor code (similar to what compiler optimizers attempt to do).

  • Add DirectX support.

Conclusion

  • The Web provides the means to distribute applications to multiple platforms at the same time, without friction for consumers.

  • Running native code on the Web is, if not the present, the future of software distribution.

Analyzing and optimizing native code for portability to the Web using Emscripten, Native Client and ConConJS

By Raúl G. Roa Gómez
