A Brief Introduction to Systems Programming,
with Scala Native
@richardwhaling
Nov 14 2019
About me:
- @RichardWhaling
- Lead Data Engineer at M1Finance
- Author of:
About this talk
- Scala Native has moved forward a lot in five months
- I'll give a quick status update on Scala Native and concurrency
- Context & background on Scala Native and systems programming more generally
- Dive deep into example programs that show how Scala Native exposes the fundamental techniques of systems programming
What's New in Scala Native
- Current release is 0.4.0-M2
- M3 coming soon with 2.12/2.13 support
- 0.4.0 to follow
- Released official LibUV bindings, scala-native-loop
- Aiming for Cats and ZIO support for 0.4
- Designed for compatibility with existing libraries
- Targeting FP frameworks lets us bypass Java IO
What Is Systems Programming?
"the domain of programs that demand a mental model of the computer as a machine"
- Operating Systems, IO, Compilers, VMs, Containers, Embedded, Real-time
- Traditionally done in C
- Traditionally taught in school and promptly forgotten
It doesn't have to be this way.
Systems programming can be elegant, fun, and done in a language you enjoy.
Why C?
- Inertia?
- OS and hardware vendors...
- Many recent arguments against this:
- Steve Klabnik: https://tinyurl.com/y7akng69
- David Chisnall: https://tinyurl.com/yxpapq3g
- C describes the behavior of an abstract machine
- Modern processors are very different from a PDP-11
My hot take:
from learning C, I acquired an intuitive understanding of how to solve problems in an abstract von Neumann machine
John von Neumann
(1903-1957)
- Incredibly accomplished mathematician and physicist
- Inventor of mergesort
- Described the architecture of modern computers in his
First Draft of a Report on the EDVAC
(1945)
EDVAC
Electronic Discrete Variable Automatic Computer
- Proposed in 1944, delivered in 1949, in operation from 1951
- Designed by John Mauchly and J. Presper Eckert
- 1,024 44-bit words of ultrasonic mercury delay-line memory
EDVAC was one of the first stored-program computer designs: data and code were stored together in the same addressable memory.
Earlier computers like ENIAC and Colossus were programmed with patch cables and switches, an approach that was Turing-complete in theory but impractical to program.
Von Neumann Architecture
Theoretical description of a realized Universal Turing Machine, i.e., a general-purpose computer
Unlike a Universal Turing Machine, von Neumann machines were practical to construct and program
1944-1951
In 7 years, the first computer scientists invented:
- electronic random-access memory
- conditional branches
- goto instructions
- subroutine invocation and return
- mergesort
- two's complement integers
- Monte Carlo methods
- computer music
- computer games
An explosion of applications and discoveries enabled by a comprehensible, practical model of a programmable general-purpose computer
1972: C
C presents an enduring abstract model
of a random-access stored-program computer, with:
- primitive data types: bytes, ints, floats
- zero-terminated variable-length byte strings
- arrays
- structs (i.e., product types)
- unions (i.e., sum types)
- pointers
- function pointers (i.e., functions as values)
Hot Take:
these are the fundamental techniques
of programming a Von Neumann machine
2017: Scala Native
Scala Native is a scalac compiler plugin that compiles Scala programs to binary executables ahead-of-time
Noteworthy for: its advanced optimizer, lightweight runtime, advanced GC, and C interop
Not a JVM: Graal compiles JVM bytecode to a native binary, which is a very different model
Because it understands Scala, Scala Native can provide an elegant DSL for low-level programming
with all the capabilities of C
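For readers who want to try the snippets that follow, a minimal sbt setup might look like the sketch below. The version numbers are assumptions taken from the 0.4.0-M2 release named earlier in this talk, which targeted Scala 2.11; adjust them to whichever release you actually use.

```scala
// project/plugins.sbt
// Assumed plugin version: the 0.4.0-M2 milestone mentioned in this talk.
addSbtPlugin("org.scala-native" % "sbt-scala-native" % "0.4.0-M2")

// build.sbt
// Assumed Scala version: the 2.11 line supported by that milestone.
enablePlugins(ScalaNativePlugin)
scalaVersion := "2.11.12"
```

With this in place, `sbt run` compiles, links, and runs the native binary, and `sbt nativeLink` only produces the executable.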
Systems Programming in Scala Native
We're going to illustrate the fundamental techniques:
- Primitive Data Types
- Pointers
- Arrays
- Strings
- Structs
- Unions and "type puns"
- Functions
Each with a short program of less than 20 lines of code
Systems Programming in Scala Native
Caveat:
Regular Scala works just fine in Scala Native.
All the features you'll see here belong to the scalanative.unsafe API
The slides that follow will contain extremely unidiomatic, imperative Scala
1. Primitive Data Types
import scala.scalanative.unsafe._ // provides sizeof
val i:Int = 6
println(s"Int i has value ${i} and size ${sizeof[Int]} bytes")
val b:Byte = 4
println(s"Byte b has value ${b} and size ${sizeof[Byte]} bytes")
val d:Double = 1.0
println(s"Double d has value ${d} and size ${sizeof[Double]} bytes")
- Certain data types are fundamental
- These types have concrete representations and fixed sizes
- Bool, Byte, Int, Long, Float, Double
- 1-8 bytes
- Strings are not a primitive in C
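To make "concrete representations and fixed sizes" tangible, here is a small sketch in plain Scala (no unsafe API), using java.nio.ByteBuffer as a stand-in for raw memory. The object name and helpers are hypothetical, and little-endian byte order is an assumption that matches x86-64 and most ARM configurations.

```scala
import java.nio.{ByteBuffer, ByteOrder}

object PrimitiveBytes {
  // Render the bytes backing a buffer as space-separated hex pairs.
  def hex(buf: ByteBuffer): String =
    buf.array().map(b => f"${b & 0xff}%02X").mkString(" ")

  // An Int always occupies 4 bytes (little-endian order assumed).
  def intBytes(i: Int): String =
    hex(ByteBuffer.allocate(4).order(ByteOrder.LITTLE_ENDIAN).putInt(i))

  // A Double always occupies 8 bytes, in IEEE 754 binary64 format.
  def doubleBytes(d: Double): String =
    hex(ByteBuffer.allocate(8).order(ByteOrder.LITTLE_ENDIAN).putDouble(d))

  def main(args: Array[String]): Unit = {
    println(s"Int 6      -> ${intBytes(6)}")      // 06 00 00 00
    println(s"Double 1.0 -> ${doubleBytes(1.0)}") // 00 00 00 00 00 00 F0 3F
  }
}
```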
2. Pointers
val jPtr:Ptr[Int] = stackalloc[Int]
println(s"jPtr has value ${jPtr} and size ${sizeof[Ptr[Int]]} bytes")
val j:Int = !jPtr
println(s"j has value ${j} and size ${sizeof[Int]}")
!jPtr = 5
println(s"jPtr has value ${jPtr} and size ${sizeof[Ptr[Int]]} bytes")
val j2:Int = !jPtr
println(s"j2 has value ${j2} and size ${sizeof[Int]}, j has value ${j}")
- A pointer denotes the address of a value in memory
- Generally a 64-bit unsigned integer under the hood
- Pointer values are created by explicit allocation
- Pointers are read and updated with the dereference operator "!"
- No addressOf/& operator
- Better safety - can't break the seal on GC managed objects
- Semantics related to reference and pointer types in Haskell/SML
4 Rules
- Every piece of data lives somewhere in memory
- Every piece of data has some fixed size
- Some objects are managed (but they still live somewhere)
- All manipulations of addresses and sizes are simple arithmetic
3. Arrays
val arraySize = 16 * sizeof[Int]
val allocation:Ptr[Byte] = stdlib.malloc(arraySize)
val intArray = allocation.asInstanceOf[Ptr[Int]]
// until excludes the upper bound, so this visits exactly the 16 allocated slots
for (i <- 0 until 16) {
  intArray(i) = i * 2
}
for (i <- 0 until 16) {
  val address = intArray + i
  val item = intArray(i)
  // indexing is the same as pointer addition followed by dereference
  val check = !address == item
  println(s"item $i at address $address has value $item, check: $check")
}
// malloc'd memory is not managed by the GC, so release it explicitly
stdlib.free(allocation)
- Arrays are really just pointers with arithmetic operators
- Access by index is equivalent to addition and dereference
- Address is incremented by offset times element size
- Seeks are constant time because layout is uniform
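The arithmetic behind indexing can be spelled out by hand. The sketch below is plain Scala with a java.nio.ByteBuffer standing in for a malloc'd block; the names `ManualIndexing`, `offsetOf`, `load`, and `store` are hypothetical and not part of any Scala Native API.

```scala
import java.nio.{ByteBuffer, ByteOrder}

object ManualIndexing {
  val IntSize = 4 // bytes per Int, the value sizeof[Int] reports

  // Offset of element `index` from the start of the block: index times element size.
  def offsetOf(index: Int): Int = index * IntSize

  // "Dereference": read the Int stored at the computed byte offset.
  def load(mem: ByteBuffer, index: Int): Int = mem.getInt(offsetOf(index))

  // Write an Int at the computed byte offset.
  def store(mem: ByteBuffer, index: Int, value: Int): Unit = {
    mem.putInt(offsetOf(index), value)
    ()
  }

  def main(args: Array[String]): Unit = {
    val count = 16
    val mem = ByteBuffer.allocate(count * IntSize).order(ByteOrder.LITTLE_ENDIAN)
    for (i <- 0 until count) store(mem, i, i * 2)
    for (i <- 0 until count)
      println(s"item $i at offset ${offsetOf(i)} has value ${load(mem, i)}")
  }
}
```

Because every element has the same size, finding element i is one multiplication and one addition, regardless of how long the array is.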
4. Strings
val hello:CString = c"hello, world"
val helloLen = string.strlen(hello)
val helloString:String = fromCString(hello)
println(s"the string ${helloString} at ${hello} is ${helloLen} bytes long")
println(s"the CString value 'hello' is ${sizeof[CString]} bytes long")
// inclusive range on purpose: the last iteration prints the terminating 0 byte
for (offset <- 0L to helloLen) {
  val chr:CChar = hello(offset)
  println(s"${chr.toChar} (${chr}) at ${hello + offset} is ${sizeof[CChar]} bytes long")
}
- How do we handle sequential data of unknown size?
- Two techniques: terminating with 0 byte or storing length
- 0-terminated strings were probably a mistake
- CChar is an alias for Byte
- CString is an alias for Ptr[CChar]
- Runtime helps with allocation to convert to Scala string
- Moving to a safe representation ASAP is a huge safety win
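The two techniques for sequences of unknown size can be compared side by side. This is a sketch in plain Scala over byte arrays, not the Scala Native string API; the object and function names are hypothetical, and the one-byte length prefix is an arbitrary choice made for brevity.

```scala
import java.nio.charset.StandardCharsets

object StringLayouts {
  // C style: the bytes of the string followed by a single 0 byte.
  def zeroTerminated(s: String): Array[Byte] =
    s.getBytes(StandardCharsets.US_ASCII) :+ 0.toByte

  // Alternative: one length byte up front, so at most 255 bytes of payload.
  def lengthPrefixed(s: String): Array[Byte] = {
    val bytes = s.getBytes(StandardCharsets.US_ASCII)
    require(bytes.length <= 255, "payload too long for a one-byte length")
    bytes.length.toByte +: bytes
  }

  // What strlen does: scan forward until the first 0 byte, which is O(n).
  def strlen(mem: Array[Byte]): Int = mem.indexOf(0.toByte)

  // Reading a stored length is a single load, which is O(1).
  def prefixedLength(mem: Array[Byte]): Int = mem(0) & 0xff

  def main(args: Array[String]): Unit = {
    val text = "hello, world"
    println(s"zero-terminated: ${zeroTerminated(text).length} bytes, strlen = ${strlen(zeroTerminated(text))}")
    println(s"length-prefixed: ${lengthPrefixed(text).length} bytes, length = ${prefixedLength(lengthPrefixed(text))}")
  }
}
```

Both layouts cost one extra byte here, but the zero-terminated form cannot contain a 0 byte in its payload and has to be scanned to learn its length.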
+--------+----+----+----+----+----+----+----+----+----+----+----+----+----+----+
| Offset | 0  | 1  | 2  | 3  | 4  | 5  | 6  | 7  | 8  | 9  | A  | B  | C  | D  |
+--------+----+----+----+----+----+----+----+----+----+----+----+----+----+----+
| Char   | H  | e  | l  | l  | o  | ,  |    | w  | o  | r  | l  | d  | !  | \0 |
| Hex    | 48 | 65 | 6C | 6C | 6F | 2C | 20 | 77 | 6F | 72 | 6C | 64 | 21 | 00 |
+--------+----+----+----+----+----+----+----+----+----+----+----+----+----+----+
Recap
- Ptr[T] indicates the address of zero or more items of T
- Abstracts over Option, Seq, String-like capabilities
- Best thought of as a mutable container:
- Represents a capability to change remote data
- Unsafe! Segfaults, undefined behavior, etc.
5. Structs
type LabeledPoint = CStruct3[CString,Int,Int]
val point:Ptr[LabeledPoint] = stackalloc[LabeledPoint]
point._1 = c"foo"
point._2 = 3
point._3 = 5
println(s"struct field ${point.at1} has value ${point._1}")
println(s"struct field ${point.at2} has value ${point._2}")
println(s"struct field ${point.at3} has value ${point._3}")
println(s"struct ${point} has size ${sizeof[LabeledPoint]}")
- A Struct is a product type, like a case class or tuple
- Tuple-like behavior by default
- Fields are stored contiguously, address offset is known a priori
- ._1 etc retrieves field. .at1 returns address of field
- Syntactic sugar dereferences pointers to structs for convenience
- Almost always manipulated via a pointer
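The claim that field offsets are known a priori can be checked with a few lines of arithmetic. The sketch below is plain Scala and assumes C-like layout rules on a 64-bit target, where each primitive field is aligned to its own size; `StructLayout`, `alignUp`, and `layout` are hypothetical names, and the sizes passed in main are assumptions for CStruct3[CString,Int,Int] (an 8-byte pointer and two 4-byte Ints).

```scala
object StructLayout {
  // Round `offset` up to the next multiple of `align`.
  def alignUp(offset: Int, align: Int): Int =
    (offset + align - 1) / align * align

  // Given field sizes in bytes, return the offset of each field and the total size.
  // Assumption: every field's alignment equals its size, as for the primitives used here.
  def layout(fieldSizes: Seq[Int]): (Seq[Int], Int) = {
    val (offsets, end) = fieldSizes.foldLeft((Vector.empty[Int], 0)) {
      case ((acc, cursor), size) =>
        val start = alignUp(cursor, size)
        (acc :+ start, start + size)
    }
    // The whole struct is padded to a multiple of its largest field alignment.
    val maxAlign = if (fieldSizes.isEmpty) 1 else fieldSizes.max
    (offsets, alignUp(end, maxAlign))
  }

  def main(args: Array[String]): Unit = {
    val (offsets, size) = layout(Seq(8, 4, 4))
    println(s"field offsets: ${offsets.mkString(", ")}; struct size: $size")
  }
}
```

Under these assumptions the three fields sit at offsets 0, 8, and 12, for a total of 16 bytes, which is the kind of fixed arithmetic `.at1`, `.at2`, and `.at3` rely on.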
+--------+----+----+----+----+----+----+----+----+
| Offset | 0  | 1  | 2  | 3  | 4  | 5  | 6  | 7  |
+--------+----+----+----+----+----+----+----+----+
| Value  | 5                 | 12                |
+--------+-------------------+-------------------+
| Hex    | 05 | 00 | 00 | 00 | 0C | 00 | 00 | 00 |
+--------+----+----+----+----+----+----+----+----+
6. Unions
type LabeledFoo = CStruct2[Int,Int]
type LabeledBar = CStruct2[Int,Long]
val FOO = 0
val BAR = 1
println(s"LabeledFoo size is ${sizeof[LabeledFoo]}")
println(s"LabeledBar size is ${sizeof[LabeledBar]}")
val array = stdlib.malloc(8 * sizeof[LabeledBar]).asInstanceOf[Ptr[LabeledBar]]
for (i <- 0 until 8) {
  array(i)._2 = 0
  if (i % 2 == 0) {
    val item = array(i).asInstanceOf[LabeledFoo]
    item._1 = FOO
    item._2 = Random.nextInt() % 16
  } else {
    val item = array(i).asInstanceOf[LabeledBar]
    item._1 = BAR
    item._2 = Random.nextLong % 64
  }
}
for (j <- 0 until 8) {
  val tag = array(j)._1
  if (tag == FOO) {
    val item = array(j).asInstanceOf[LabeledFoo]
    println(s"Foo: ${tag} at $j = ${item._2}")
  } else {
    val item = array(j).asInstanceOf[LabeledBar]
    println(s"Bar: ${tag} at $j = ${item._2}")
  }
}
- A value that can be one of several types can be modeled as a sum type or union
- Idiomatically in C we do this with unsafe casts
- If two structs have a common prefix of fields, those fields may be safely used interchangeably
- In the "tagged union" pattern we use a prefix field to hold type metadata
+--------+----+----+----+----+----+----+----+----+
| Offset | 0  | 1  | 2  | 3  | 4  | 5  | 6  | 7  |
+--------+----+----+----+----+----+----+----+----+
| Value  | 5                 | 12                |
+--------+-------------------+-------------------+
| Hex    | 05 | 00 | 00 | 00 | 0C | 00 | 00 | 00 |
+--------+----+----+----+----+----+----+----+----+
+--------+----+----+----+----+----+----+----+----+----+----+----+----+
| Offset | 0  | 1  | 2  | 3  | 4  | 5  | 6  | 7  | 8  | 9  | A  | B  |
+--------+----+----+----+----+----+----+----+----+----+----+----+----+
| Value  | 3                 | 29                                    |
+--------+-------------------+---------------------------------------+
| Hex    | 03 | 00 | 00 | 00 | 1D | 00 | 00 | 00 | 00 | 00 | 00 | 00 |
+--------+----+----+----+----+----+----+----+----+----+----+----+----+
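The tagged-union pattern maps directly onto a Scala sum type. The sketch below is plain Scala with a java.nio.ByteBuffer standing in for raw memory; the slot layout (a 4-byte tag at offset 0 and the payload at offset 8 of a 16-byte slot) is an assumption chosen for illustration, and `TaggedRecords`, `Item`, `write`, and `read` are hypothetical names.

```scala
import java.nio.ByteBuffer

object TaggedRecords {
  val FOO = 0       // tag: payload is a 4-byte Int
  val BAR = 1       // tag: payload is an 8-byte Long
  val SlotSize = 16 // assumed slot size: tag at offset 0, payload at offset 8

  // The safe, high-level view of the same data.
  sealed trait Item
  final case class Foo(value: Int) extends Item
  final case class Bar(value: Long) extends Item

  // Store the tag first, then the payload in the representation the tag implies.
  def write(mem: ByteBuffer, index: Int, item: Item): Unit = {
    val base = index * SlotSize
    item match {
      case Foo(v) => mem.putInt(base, FOO); mem.putInt(base + 8, v)
      case Bar(v) => mem.putInt(base, BAR); mem.putLong(base + 8, v)
    }
    ()
  }

  // Read the shared prefix field, then decide how to interpret the rest.
  def read(mem: ByteBuffer, index: Int): Item = {
    val base = index * SlotSize
    mem.getInt(base) match {
      case FOO => Foo(mem.getInt(base + 8))
      case BAR => Bar(mem.getLong(base + 8))
      case tag => sys.error(s"unknown tag $tag")
    }
  }

  def main(args: Array[String]): Unit = {
    val mem = ByteBuffer.allocate(2 * SlotSize)
    write(mem, 0, Foo(7))
    write(mem, 1, Bar(1L << 40))
    println(read(mem, 0))
    println(read(mem, 1))
  }
}
```

The only thing keeping a reader from misinterpreting the payload is the tag check, which is why the pattern counts as unsafe in C and why a sealed trait is the safer view once the data has been decoded.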
7. Function Pointers
type Comparator = CFuncPtr2[Ptr[Byte],Ptr[Byte],Int]
type Record = CStruct2[Int,Int]
val comp = new Comparator {
  def apply(aPtr:Ptr[Byte], bPtr:Ptr[Byte]):Int = {
    // qsort hands us untyped pointers; cast them back to the real element type
    val a = !(aPtr.asInstanceOf[Ptr[Record]])
    val b = !(bPtr.asInstanceOf[Ptr[Record]])
    a._2 - b._2
  }
}
val size = 8
val recordArray:Ptr[Record] = stdlib.malloc(size * sizeof[Record]).asInstanceOf[Ptr[Record]]
for (i <- 0 until size) {
  recordArray(i)._1 = i
  recordArray(i)._2 = Random.nextInt() % 256
}
stdlib.qsort(recordArray.asInstanceOf[Ptr[Byte]],size,sizeof[Record],comp)
for (i <- 0 until size) {
  val rec = recordArray(i)
  println(s"${i}: random value ${rec._2} from original position ${rec._1}")
}
- Functions are values in C, but not the same as Scala functions
- A function has a fixed address and no lexical scope
- Function call is basically just argument marshalling and GOTO
- Much faster than method dispatch
- You can fake lexical scope by storing context in extra arguments
- No polymorphism, but you can pass Ptr[Byte] and cast
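The idea of faking lexical scope with an extra argument can be shown without any C at all. The sketch below is plain Scala; `Context`, `aboveThreshold`, and `countMatching` are hypothetical names, with `countMatching` playing the role of a C library routine that forwards an opaque user-data pointer to its callback.

```scala
object ContextArgument {
  // Scala style: the returned function closes over `threshold`.
  def closureStyle(threshold: Int): Int => Boolean =
    value => value > threshold

  // C style: the function captures nothing, so the caller supplies the context.
  final case class Context(threshold: Int)
  def aboveThreshold(value: Int, ctx: Context): Boolean =
    value > ctx.threshold

  // A stand-in for a library routine that passes the context through to the callback.
  def countMatching(values: Array[Int],
                    callback: (Int, Context) => Boolean,
                    ctx: Context): Int = {
    var n = 0
    for (v <- values) if (callback(v, ctx)) n += 1
    n
  }

  def main(args: Array[String]): Unit = {
    val values = Array(1, 5, 10)
    println(values.count(closureStyle(4)))                      // closure version
    println(countMatching(values, aboveThreshold, Context(4))) // explicit-context version
  }
}
```

Both calls count the same elements; the second simply makes the captured state an explicit parameter, which is the only option when the callee is a bare function address.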
Mechanical Sympathy, Functional Affinity
- There is an affinity between systems and FP
- Functions as values, sum/product types
- Deep roots in Scala's heritage (cf. Standard ML)
- Prior to Scala most functional languages had powerful low-level facilities (even Haskell!)
- I find Scala Native's unsafe API easier, safer, and more productive than writing C
- Working with the system directly feels, to me, more elegant than going through an OOP layer
- Folks have suggested porting the scalanative.unsafe API to the JVM and Graal via sun.misc.Unsafe
- A genuine breakthrough in ergonomics
Thanks!
Questions?
By Richard Whaling