JVM

Java Conceptual Diagram

Overview of JVM architecture

JVM architecture

  • Operates on primitives and references
  • Fundamentally a 32-bit machine
    • long and double occupy two 32-bit units
    • byte and short are sign-extended to int
    • char is zero-extended to int
    • boolean is operated on as int, and boolean arrays as 8-bit byte values, with 0 representing false and 1 representing true
  • A garbage-collected heap for storing objects and arrays
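  • Code, constants, and other class data are stored in the "method area" (logically part of the heap, although an implementation may choose not to garbage-collect it)
  • The JVM is a stack machine rather than a register machine: instructions take their operands from an operand stack, and each frame has its own local variables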
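
The widening rules above can be observed from plain Java. This is a minimal sketch; the class and method names (WideningDemo, widenByte, widenChar) are hypothetical and only serve the illustration.

```java
class WideningDemo {
    // byte -> int is a sign extension: the top bit of the byte is copied upward.
    static int widenByte(byte b) {
        return b;
    }

    // char -> int is a zero extension: the upper 16 bits are filled with zeros.
    static int widenChar(char c) {
        return c;
    }

    public static void main(String[] args) {
        System.out.println(widenByte((byte) 0xFF));   // prints -1
        System.out.println(widenChar((char) 0xFFFF)); // prints 65535
    }
}
```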

Implementations of JVM

  • The JVM specification does not constrain how a JVM is implemented
    • Translating bytecode instructions into another VM's instructions at load time or during execution
    • Translating bytecode instructions into native CPU instructions (JIT)

JVM Memory Concepts

JVM data areas

Program Counter (PC) register

  • JVM can support many threads of execution
  • Each thread has its own pc (program counter) register
  • At any point, each thread is executing the code of a single method, namely the current method for that thread
  • If that method is not native, the pc register contains the address of the JVM instruction currently being executed
  • If the method currently being executed by the thread is native, the value of the JVM's pc register is undefined
  • The JVM's pc register is wide enough to hold a returnAddress or a native pointer on the specific platform (this is the only data area with no OutOfMemoryError condition)

JVM stack

  • Each thread has its own private JVM stack, created at the same time as the thread and sharing its lifecycle
  • A stack stores frames; it holds local variables and partial results, and plays a part in method invocation and return
  • Never manipulated directly except to push and pop frames, so frames may be heap allocated
  • Does not need to be contiguous
  • If the computation in a thread requires a larger stack than is permitted, the JVM throws a StackOverflowError
  • If a stack can be dynamically expanded and expansion is attempted but insufficient memory is available, or if insufficient memory is available to create the initial stack for a new thread, the JVM throws an OutOfMemoryError
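
As an illustration of the StackOverflowError case, the sketch below recurses until the thread's stack is exhausted and reports how deep it got. The class name StackDepthDemo is hypothetical, and the depth reached depends on the JVM, the frame size, and the -Xss setting, so no particular number should be expected.

```java
class StackDepthDemo {
    private static int depth;

    // Unbounded recursion: every call pushes one more frame onto the JVM stack.
    private static void recurse() {
        depth++;
        recurse();
    }

    // Returns the number of frames pushed before the stack limit was hit.
    static int measure() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // The stack unwinds to this frame, so execution can continue.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("frames before StackOverflowError: " + measure());
    }
}
```

Running it with different values, for example java -Xss256k StackDepthDemo and java -Xss2m StackDepthDemo, should show the depth growing with the stack size.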

Native method stack

  • Each thread has its own native method stack sharing its lifecycle (a JVM that does not support native methods need not provide one)
  • The native method stack may use the same implementation as the JVM stack, or the two may simply be merged (as in HotSpot)
  • Throws the same errors as the JVM stack (StackOverflowError and OutOfMemoryError)

Heap

  • Shared among all JVM threads.
  • The run-time data area from which memory for all class instances and arrays is allocated
  • Created on virtual machine start-up
  • Storage for objects is reclaimed by an automatic storage management system (known as a garbage collector); objects are never explicitly deallocated
  • The JVM specification does not mandate any particular storage management implementation; the heap may be of fixed or dynamic size, set via -Xms (initial) and -Xmx (maximum)
  • If a computation requires more heap than can be made available by the automatic storage management system, the JVM throws an OutOfMemoryError

Method area

  • Shared among all JVM threads
  • Created on virtual machine start-up
  • Analogous to the storage area for compiled code of a conventional language, or to the "text" segment in an operating system process
  • Stores per-class structures such as the run-time constant pool, field and method data, and the code for methods and constructors, including the special methods used in class and interface initialization and in instance initialization
  • If memory in the method area cannot be made available to satisfy an allocation request, throws an OutOfMemoryError

Run-Time constant pool

  • Each Run-time constant pool is allocated from the method area
  • A per-class or per-interface run-time representation of the constant_pool table in a class file
  • Constructed when the class or interface is created
  • Contains several kinds of constants, ranging from numeric literals known at compile-time to method and field references that must be resolved at run-time
  • New constants can also be added at run-time, for example through String#intern
  • When creating a class or interface, if the run-time constant pool requires more memory than can be made available in the method area of the JVM, throws an OutOfMemoryError
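
A minimal sketch of String#intern at work, assuming nothing beyond the standard library; the class name InternDemo is hypothetical. A string literal is already in the pool, so interning an equal string created at run-time returns the pooled instance.

```java
class InternDemo {
    static boolean sameAfterIntern() {
        String literal = "jvm-notes";        // taken from the constant pool
        String copy = new String(literal);   // a distinct object on the heap
        // Different objects before interning, same pooled object afterwards.
        return copy != literal && copy.intern() == literal;
    }

    public static void main(String[] args) {
        System.out.println(sameAfterIntern()); // prints true
    }
}
```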

Direct memory

  • Not actually part of the JVM run-time data areas
  • The NIO classes added support for direct ByteBuffers, which are backed by native memory rather than the Java heap. This makes them significantly faster in some scenarios because they avoid copying data between the Java heap and the native heap
  • I/O buffers can also be backed by the Java heap. A non-direct ByteBuffer holds its data in a byte[] array on the Java heap
  • The application still uses an object on the Java heap to orchestrate I/O operations, but the buffer that holds the data lives in native memory; the Java heap object only contains a reference to the native buffer
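
The two kinds of buffer can be told apart through the ByteBuffer API itself. The sketch below is illustrative only; the class name BufferDemo and the 1024-byte size are arbitrary choices.

```java
import java.nio.ByteBuffer;

class BufferDemo {
    // Backed by a byte[] on the Java heap.
    static ByteBuffer heapBuffer() {
        return ByteBuffer.allocate(1024);
    }

    // Backed by native memory outside the Java heap.
    static ByteBuffer directBuffer() {
        return ByteBuffer.allocateDirect(1024);
    }

    public static void main(String[] args) {
        System.out.println(heapBuffer().isDirect());   // prints false
        System.out.println(heapBuffer().hasArray());   // prints true
        System.out.println(directBuffer().isDirect()); // prints true
    }
}
```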

Example: HotSpot

  • Object creation
    • Look up the symbolic reference that follows the new instruction
    • Load the class if it has not been loaded yet
    • Allocate memory for the object
    • Initialize the allocated memory to zero
    • Set up the object header
    • Call <init>

Example: HotSpot

  • Object memory
    • Header: mark word, type pointer
    • Instance data: all fields, both inherited and declared by the class itself
    • Padding: keeps the object size aligned to 8 bytes

Example: HotSpot

  • Locating an object for access

Useful JVM arguments

  • Heap size
    • -Xms20m -Xmx20m
  • Heap dump
    • -XX:+HeapDumpOnOutOfMemoryError
  • JVM stack and Native method stack
    • -Xss (for JVM stack)
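    • -Xoss (for native method stack, ignored by HotSpot)
  • Method area and run-time constant pool
    • -XX:PermSize -XX:MaxPermSize (HotSpot before JDK 8; replaced by -XX:MetaspaceSize and -XX:MaxMetaspaceSize since JDK 8)
  • Direct Memory
    • -XX:MaxDirectMemorySize (by default equals -Xmx)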
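
To check that heap options took effect, the running JVM can be asked for its limits through java.lang.Runtime. This is a small sketch; the class name HeapSettingsDemo is hypothetical, and the reported values usually differ slightly from the raw option values.

```java
class HeapSettingsDemo {
    // Upper bound the heap may grow to (roughly corresponds to -Xmx).
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    // Heap currently reserved by the JVM (starts near -Xms).
    static long currentHeapBytes() {
        return Runtime.getRuntime().totalMemory();
    }

    public static void main(String[] args) {
        System.out.println("max heap (MB):     " + maxHeapBytes() / (1024 * 1024));
        System.out.println("current heap (MB): " + currentHeapBytes() / (1024 * 1024));
    }
}
```

For example, java -Xms20m -Xmx20m HeapSettingsDemo should report both values close to 20 MB.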

Garbage Collectors

Major Topics of GC

  • Who should be collected?
  • When should GC happen?
  • How is GC done?

Who?

  • Java Heap
  • Identify dead objects
    • Reference counting: cyclic references are a problem (used by CPython)
    • Reachability analysis (used by Java, C#)
      • GC Roots can be objects referenced from the JVM stack, from static or constant fields in the method area, and from the native method stack

Reference Types

  • Strong: never collected while the reference exists
  • Soft: collected only when the JVM is about to throw OutOfMemoryError
  • Weak: collected at the next GC
  • Phantom: only used to receive a notification when the object is collected
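
The three non-strong reference types live in java.lang.ref. The sketch below only shows the parts that are deterministic: while a strong reference is still held, soft and weak references return the referent, and a phantom reference never does. The class name ReferenceDemo is hypothetical; when each reference is actually cleared depends on the collector.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

class ReferenceDemo {
    // Returns { soft sees target, weak sees target, phantom returns null }.
    static boolean[] check() {
        Object target = new Object(); // strong reference keeps the object alive
        ReferenceQueue<Object> queue = new ReferenceQueue<>();

        SoftReference<Object> soft = new SoftReference<>(target);
        WeakReference<Object> weak = new WeakReference<>(target);
        // A phantom reference must be registered with a queue; the queue is
        // where the notification arrives once the object has been collected.
        PhantomReference<Object> phantom = new PhantomReference<>(target, queue);

        return new boolean[] {
            soft.get() == target,
            weak.get() == target,
            phantom.get() == null // get() on a phantom reference is always null
        };
    }

    public static void main(String[] args) {
        boolean[] result = check();
        System.out.println(result[0] + " " + result[1] + " " + result[2]);
    }
}
```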

Finalizing object

  • An object is collected in two passes
    • Pass 1: if the object overrides finalize() and it has not run yet, the object is put in the F-Queue; the method will be run by an internal low-priority Finalizer thread
    • Pass 2: after the finalizer has run, decide whether the object is still unreachable and must be collected

GC in the method area

  • What is collected in the method area
    • Abandoned constants
    • Useless classes
      • All instances have been collected
      • Its ClassLoader has been collected
      • Its java.lang.Class object is no longer referenced anywhere
  • JVM arguments
    • Turn off class gc
      • -Xnoclassgc
    • Show class loading and unloading
      • -verbose:class
      • -XX:+TraceClassLoading -XX:+TraceClassUnloading

When? (GC algorithms)

  • Mark-Sweep
    • Mark the objects that need to be collected
    • Sweep the marked objects
    • Drawbacks
      • Low efficiency
      • Causes heavy memory fragmentation
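
A toy version of mark-sweep over an explicit object graph may make the two phases concrete. This is a sketch under simplifying assumptions, not how HotSpot does it: it marks the reachable objects and sweeps everything left unmarked, the "heap" is just a list, and all names (MarkSweepDemo, Node, mark, sweep) are hypothetical. It also shows why reachability analysis copes with cycles: c and d reference each other but are still collected.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

class MarkSweepDemo {
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        boolean marked;

        Node(String name) {
            this.name = name;
        }
    }

    // Mark phase: flag every node reachable from the roots.
    static void mark(List<Node> roots) {
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node n = pending.pop();
            if (n.marked) {
                continue;
            }
            n.marked = true;
            pending.addAll(n.refs);
        }
    }

    // Sweep phase: remove unmarked nodes and reset marks on the survivors.
    static List<String> sweep(List<Node> heap) {
        List<String> collected = new ArrayList<>();
        for (Iterator<Node> it = heap.iterator(); it.hasNext(); ) {
            Node n = it.next();
            if (n.marked) {
                n.marked = false;
            } else {
                collected.add(n.name);
                it.remove();
            }
        }
        return collected;
    }

    static List<String> demo() {
        Node a = new Node("a"), b = new Node("b"), c = new Node("c"), d = new Node("d");
        a.refs.add(b); // a -> b, reachable from the root
        c.refs.add(d); // c <-> d form an unreachable cycle
        d.refs.add(c);
        List<Node> heap = new ArrayList<>(Arrays.asList(a, b, c, d));
        mark(Arrays.asList(a));
        return sweep(heap);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [c, d]
    }
}
```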

When? (GC algorithms)

  • Copying
    • Avoids memory fragmentation
    • Drawback
      • Splitting memory into two equal halves is wasteful in most cases
      • HotSpot defaults to Eden : Survivor = 8 : 1
      • Copying is costly when many objects survive

When? (GC algorithms)

  • Mark-Compact
    • Avoids memory fragmentation
    • Moves all survivors to one end
    • Sweeps the space beyond the survivors

When? (GC algorithms)

  • Generational Collection
    • Use Copying for the young generation
      • Few survivors
    • Use Mark-Sweep or Mark-Compact for the old generation
      • Many survivors

Example: HotSpot

  • Enumerate GC roots
    • Requires a stop-the-world pause
    • Uses OopMap to find where references are stored
  • Safepoint / Safe Region
    • GC can start only when threads are at a safepoint
    • Safepoints are placed at method calls, loop jumps, and exception jumps
    • Preemptive suspension
      • GC suspends all threads first, then resumes any thread that is not at a safepoint until it reaches one
    • Voluntary suspension
      • GC sets a flag; each thread polls the flag at safepoints and suspends itself

Example: HotSpot Garbage Collector

  • Available Collectors
    • Serial Collector
    • Parallel Collector
    • The Mostly Concurrent Collectors

Example: Serial Collector

  • Earliest collector for the young generation, using copying
  • Single-threaded and requires stop-the-world
  • Currently the default young generation collector in client mode
  • Pros: simple and efficient, a good choice for desktop apps
  • Cons: long pauses for application threads

Example: ParNew Collector

  • Multi-threaded version of the Serial collector, using copying
  • Shares its parameters with the Serial collector, including -XX:SurvivorRatio, -XX:PretenureSizeThreshold, -XX:HandlePromotionFailure, etc.
  • Can be enabled by -XX:+UseConcMarkSweepGC or -XX:+UseParNewGC
  • By default creates as many GC threads as there are CPU cores; configurable via -XX:ParallelGCThreads
  • Pros: takes advantage of multiple CPUs, good for server apps
  • Cons: not as efficient as Serial in client mode

Example: Parallel Scavenge Collector

  • Multi-threaded collector for the young generation, using copying
  • Aims at controllable throughput = user cpu time / (user cpu time + gc cpu time)
    • -XX:MaxGCPauseMillis
    • -XX:GCTimeRatio (> 0 and < 100)
    • -XX:+UseAdaptiveSizePolicy enables GC Ergonomics, which automatically tunes -Xmn, -XX:SurvivorRatio, and -XX:PretenureSizeThreshold
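
-XX:GCTimeRatio=N asks the collector to keep GC time to at most 1 / (1 + N) of total time, which is the throughput formula above rearranged. The helper below just does that arithmetic; the class and method names are hypothetical.

```java
class GcTimeRatioDemo {
    // Fraction of total time the collector is allowed to spend in GC.
    static double maxGcFraction(int gcTimeRatio) {
        return 1.0 / (1 + gcTimeRatio);
    }

    public static void main(String[] args) {
        System.out.println(maxGcFraction(99)); // 1/100: at most 1% of time in GC
        System.out.println(maxGcFraction(19)); // 1/20: at most 5% of time in GC
    }
}
```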

Example: Serial Old Collector

  • Serial collector for the tenured generation, using mark-compact
  • Currently the default tenured generation collector in client mode

Example: Parallel Old Collector

  • Tenured generation counterpart of Parallel Scavenge: multi-threaded, using mark-compact
  • Replaces Serial Old as the partner of Parallel Scavenge since JDK 1.6

Example: Concurrent Mark Sweep (CMS)

  • Tenured generation collector aiming at the shortest pause time, available since JDK 1.5
  • Initial mark: mark objects directly referenced by GC roots (stop-the-world)
  • Concurrent mark: trace from GC roots while the application runs
  • Remark: fix up marks for objects that changed during concurrent mark (stop-the-world)
  • Concurrent sweep

Example: Concurrent Mark Sweep (CMS)

  • Pros: low-pause GC in multi-CPU environments
  • Cons:
    • Low throughput when few CPUs are available; default GC thread count is (CPU count + 3) / 4
    • May cause "Concurrent Mode Failure" if objects created during the concurrent phases cannot be accommodated; the JVM then falls back to a Serial Old full GC
    • -XX:CMSInitiatingOccupancyFraction adjusts the occupancy threshold that starts CMS, which defaults to 92% in JDK 1.6
    • Mark-sweep may cause so much fragmentation that a full GC is eventually required
    • -XX:+UseCMSCompactAtFullCollection, on by default
    • -XX:CMSFullGCsBeforeCompaction=<n>, run mark-compact after n full GCs without compaction; default is 0

Example: G1 collector (Garbage first)

  • Expected replacement for CMS in server mode since JDK 1.7
  • Core improvements over CMS
    • Uses mark-compact and copying
    • Predictable low-pause model: N / M
      • N: max GC time
      • M: total time
    • While keeping the generational concept, G1 splits the Java heap into multiple equal-sized regions, tracks the value of each region, and collects the most valuable regions first based on the configured N / M value
    • A remembered set is kept for each region, to avoid a full heap scan during marking

Example: G1 collector (Garbage first)

  • Steps
    • Initial Marking: same as CMS, and also updates TAMS (top at mark start) to guide memory allocation by user threads that keep running
    • Concurrent Marking: same as CMS
    • Final Marking: same as CMS remark, but only merges the remembered set logs into the remembered sets to avoid a full heap scan
    • Live Data Counting and Evacuation: sort all regions by collection value and cost, then collect the most valuable regions first

Example: G1 collector (Garbage first)

  • Pros
    • Much better Soft Real-Time Goal (SRTG) results than CMS: statistically lower pause times
  • Cons
    • Slightly lower throughput (3%~10%) than CMS

Example: Z collector

GC Log Analysis

  • Turn on GC log
    • -XX:+PrintGCDetails
    • -XX:+PrintGCDateStamps: show the date and time instead of the elapsed time since JVM start
    • -Xloggc:<file-path>

GC Parameters

Memory Allocation and Collecting Strategy

Related Topics

  • Java object allocation
    • may be allocated on the stack after JIT compilation
    • mainly in Eden
    • may be allocated in a TLAB if thread-local allocation buffers are enabled
    • may be allocated directly in the tenured generation
  • Depends on the garbage collectors and memory options in use

Eden First

  • By default objects are allocated in Eden in most cases
  • If Eden does not have enough space, a minor GC runs
  • Minor GC: GC in the young generation
  • Major / Full GC: GC in the tenured generation (typically about 10x slower than a minor GC)

Tenured Generation for Large Objects

  • Large objects that need contiguous memory (like a long string or array) are allocated directly in the tenured generation
  • Pros:
    • Avoids repeated copying between Eden and Survivor in the young generation
  • Cons:
    • Too many large objects may cause frequent full GCs
  • -XX:PretenureSizeThreshold sets the threshold, default is 0
    • Only works with Serial and ParNew

Long-Lived Objects Are Promoted to Tenured

  • An object is promoted once it has survived 15 or more minor GCs (the default threshold)
  • -XX:MaxTenuringThreshold sets the threshold
  • Dynamic age determination: if the objects of one age together occupy more than half of the Survivor space, objects of that age or older are promoted

Promotion handling

  • Process
    • Before a minor GC, the available contiguous space in the tenured generation (ACSTG) is compared with the total size of objects in the young generation; if it is larger, the minor GC is safe
    • Otherwise check HandlePromotionFailure (always true after JDK 6u24) to see whether failure is allowed; if so, compare ACSTG with the average size of objects promoted to the tenured generation in previous collections, and try a minor GC if it is larger, otherwise do a full GC
    • If promotion fails during the minor GC, do a full GC
  • Pros
    • Avoids full GC as far as possible
  • Cons
    • A full GC may still run when handling a promotion failure (HPF)

Available Collectors of recent JVMs

  • Serial Collector (since beginning)
    • For client mode, java -XX:+UseSerialGC -jar Application.jar
  • Parallel Collector (default for jdk 8)
    • For server mode, java -XX:+UseParallelGC -jar Application.jar
  • CMS Collector (since jdk 5)
    • For server mode, java -XX:+UseConcMarkSweepGC -jar Application.jar
  • G1 Collector (since jdk 7u4, default for jdk 9, 10, 11)
    • For server mode, java -XX:+UseG1GC -jar Application.jar
    • String deduplication (G1 only): -XX:+UseStringDeduplication (since jdk 8u20)
  • Z Collector (experimental in jdk 11)
    • For Linux/x64 in jdk 11
    • For low latency and/or very large heaps (multi-terabyte)
    • -XX:+UnlockExperimentalVMOptions -XX:+UseZGC
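
To see which collectors a given set of flags actually selected, the running JVM can be queried through the standard java.lang.management API. This is a small sketch with a hypothetical class name; the bean names it prints vary between collectors and JDK versions, so none are listed here.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class CollectorNamesDemo {
    // Names of the collectors active in this JVM, one per managed generation or phase.
    static List<String> names() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(bean.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(names());
    }
}
```

Running it once with -XX:+UseSerialGC and once with -XX:+UseG1GC should print different names.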

Class (bytecode) File

Structure of Class File

  • A class or interface does not have to be defined in a file on disk; it can also be produced directly by a class loader, so a "Class file" is really any byte stream in the Class file format
  • A binary stream of 8-bit bytes; multi-byte values are stored in big-endian order
  • Basic data types
    • Unsigned number: represents numbers, index references, characters, etc.
    • Table: represents hierarchically structured data, named with an "_info" suffix
  • A whole Class file can itself be regarded as one table

Data type used by each section

Magic Number and Version

  • Magic Number (4 bytes) is used to identify a valid Class file
    • 0xCAFEBABE
  • Minor Version (2 bytes)
  • Major Version (2 bytes)
  • A JVM rejects Class files whose version is newer than the versions it supports
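
The header layout can be checked by reading the first eight bytes of any Class file. The sketch below reads java/lang/Object.class through the system class loader, on the assumption that the runtime exposes it as a resource; the class and method names are hypothetical. DataInputStream reads big-endian values, which matches the Class file byte order.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class ClassHeaderDemo {
    // Returns { magic, minor version, major version } of the named class resource.
    static int[] readHeader(String resource) {
        try (InputStream in = ClassLoader.getSystemResourceAsStream(resource)) {
            if (in == null) {
                throw new IOException("resource not found: " + resource);
            }
            DataInputStream data = new DataInputStream(in);
            int magic = data.readInt();           // u4 magic
            int minor = data.readUnsignedShort(); // u2 minor_version
            int major = data.readUnsignedShort(); // u2 major_version
            return new int[] { magic, minor, major };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] header = readHeader("java/lang/Object.class");
        System.out.printf("magic=0x%08X minor=%d major=%d%n", header[0], header[1], header[2]);
    }
}
```

The major version printed depends on the JDK that compiled the class (for example 52 for JDK 8).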

Constant Pool

  • Literals, including strings, final constants, etc.
  • Symbolic References
    • Fully qualified name of class or interface
    • Field name and descriptor
    • Method name and descriptor
  • Analyze Class file: javap -verbose

Access Flags

  • Identify class or interface level information
    • 2 bytes
    • if public
    • if abstract
    • if final

This class, Super class, and Interfaces

  • This class
    • Fully qualified name of the current class
  • Super class
    • Fully qualified name of the super class of the current class
  • Interfaces
    • All extended or implemented interfaces

Fields

  • Class or Instance level fields in class or interface
    • if public, private, protected
    • if static
    • if final
    • if volatile
    • if transient
    • data type (refers to constant pool)
    • field name (refers to constant pool)

Methods

  • Class or Instance level methods in class or interface
    • if public, private, protected
    • if static
    • if final
    • if synchronized
    • if strictfp
    • if abstract
    • if synthetic
    • if bridge
    • method name (refers to constant pool)
    • attributes

Attributes

  • Attributes used in Class, Fields, and Methods

Introduction to bytecode instructions

  • A 1-byte opcode followed by zero or more operands
  • Maximum 256 opcodes
  • Supported data types
    • byte: b
    • short: s
    • int: i
    • long: l
    • float: f
    • double: d
    • char: c
    • reference: a

Introduction to bytecode instructions

  • Instruction Set
    • Load and Store
    • Arithmetic
    • Type Conversion
    • Object creation and Access
    • Operand stack management
    • Control
    • Invoke and Return
    • Exception handling
    • Synchronization

Class Load

Lifecycle of Class

  • Loading, fixed order
  • Verification, fixed order
  • Preparation, fixed order
  • Resolution, may happen before or after initialization, to support dynamic binding (run-time binding)
  • Initialization, fixed order
    • static blocks run here
  • Using
  • Unloading, fixed order

Preconditions of Class initialization

  • Active use: initializes the class in the cases below, if it has not been initialized before
    • The new, getstatic, putstatic, or invokestatic instruction is met in bytecode
      • creating a new object
      • reading or writing a non-final static field of a class (access through a child class does not initialize the child if the field is declared in the parent)
      • invoking a static method
    • Reflective calls on the class through the java.lang.reflect API
    • A parent class is initialized before its children; this does not apply to parent interfaces
    • The class that contains the main method
    • Resolving a java.lang.invoke.MethodHandle that refers to a static field or static method of the class
  • Passive use: every other kind of reference, which does not cause initialization
  • -XX:+TraceClassLoading
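
The sketch below exercises three passive uses and records which static initializers actually ran. All names (InitDemo, Parent, Child, Constants, LOG) are hypothetical. Only Parent is initialized: the static field is declared in Parent, the constant is inlined by the compiler, and creating an array of Child does not initialize Child.

```java
import java.util.ArrayList;
import java.util.List;

class InitDemo {
    static final List<String> LOG = new ArrayList<>();

    static class Parent {
        static int value = 42;
        static { LOG.add("Parent"); }
    }

    static class Child extends Parent {
        static { LOG.add("Child"); }
    }

    static class Constants {
        static final String NAME = "constant"; // compile-time constant
        static { LOG.add("Constants"); }
    }

    static List<String> run() {
        int v = Child.value;          // field declared in Parent: only Parent is initialized
        String s = Constants.NAME;    // inlined at compile time: Constants is not initialized
        Child[] array = new Child[1]; // array creation: Child is not initialized
        return new ArrayList<>(LOG);
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [Parent]
    }
}
```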

Process of Class Loading

  • Loading
    • Get the binary byte stream of a Class via its fully qualified name
    • Convert the static storage structure of the stream into a run-time data structure in the method area
    • Generate a java.lang.Class object for the Class, as the entry point to all of its data in the method area
  • Example
    • Load from a ZIP package, like jar, ear, war, etc.
    • Load from the network, like applets
    • Generate and load dynamically, like the proxies in java.lang.reflect
    • Load from JSP
    • Load from a database

Process of Class Loading

  • Verification
    • First step of linking, to protect the JVM from malicious or malformed class data
    • Process
      • File format verification
      • Metadata verification
      • Bytecode verification
      • Symbolic reference verification
  • -Xverify:none turns off most of the verification

Process of Class Loading

  • Preparation
    • Allocate memory for class variables in the method area and assign their initial values
      • static variables only
      • A non-final field is assigned the "zero value" of its data type rather than the initial value written in code; that assignment runs later, during the initialization stage

Process of Class Loading

  • Resolution
    • Replace symbolic references in the constant pool with direct references
      • Symbolic reference: a symbol in any literal form defined by the Class file format; its target may not be in memory yet
      • Direct reference: a pointer, a relative offset, or a handle; its target must be in memory
    • For invokedynamic, resolution is deferred until the instruction is actually executed; it was introduced for dynamically typed languages, and javac itself has emitted it for lambda expressions since Java 8
    • Applies mainly to class, interface, field, class method, interface method, method type, method handle, and call site specifier references

Process of Class Loading

  • Exceptions may happen during Resolution
    • Class or interface
      • java.lang.IllegalAccessError
    • Field
      • java.lang.IllegalAccessError
      • java.lang.NoSuchFieldError
    • Class method
      • java.lang.IncompatibleClassChangeError
      • java.lang.AbstractMethodError
      • java.lang.NoSuchMethodError
    • Interface method
      • java.lang.IncompatibleClassChangeError
      • java.lang.NoSuchMethodError
      • java.lang.IllegalAccessError

Process of Class Loading

  • Initialization
    • Initialize class variables and other resources by running <clinit>()
    • <clinit>() includes all class variable assignments and static block code, in the order they appear in the source. A static block may assign to a class variable that is declared after the block, but may not read it
    • Unlike the instance <init>(), it does not need to call the parent class explicitly, because the JVM ensures the parent class is already initialized. Hence the first <clinit>() to run is always that of java.lang.Object
    • No <clinit>() is generated if the class has no static block and no class variable assignment
    • An interface does not need its parent interface to be initialized first, unless a variable defined in the parent is used
    • The JVM ensures <clinit>() is properly locked and synchronized in a multi-threaded environment
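
A small illustration of the textual-order rule, using a hypothetical class ClinitOrderDemo. The static block assigns to b before b is declared, which is legal; the declaration's own initializer then runs afterwards and overwrites the value.

```java
class ClinitOrderDemo {
    static int a = 1;

    static {
        a = 2; // runs after "a = 1"
        b = 5; // assigning to a variable declared later is allowed;
               // reading b here would be an illegal forward reference
    }

    static int b = 3; // runs last, so b ends up as 3

    public static void main(String[] args) {
        System.out.println(a + " " + b); // prints 2 3
    }
}
```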

Class Loader

  • A class is uniquely identified only together with the class loader that loaded it; this affects equals(), isAssignableFrom(), isInstance(), and instanceof
  • Bootstrap ClassLoader, implemented in C++
    • Loads <JAVA_HOME>\lib, -Xbootclasspath, rt.jar, etc.
  • Other ClassLoaders, implemented in Java as subclasses of java.lang.ClassLoader
    • Extension ClassLoader
      • Loads <JAVA_HOME>\lib\ext or java.ext.dirs
    • Application ClassLoader
      • returned by ClassLoader.getSystemClassLoader()
      • Loads the ClassPath
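
The loader hierarchy can be inspected at run-time. In this sketch (hypothetical class name LoaderChainDemo) the chain is walked upward from the application class loader; the bootstrap loader has no Java object and shows up as null. The intermediate loader names printed depend on the JDK version.

```java
class LoaderChainDemo {
    // Core classes are loaded by the bootstrap loader, which is reported as null.
    static boolean coreClassUsesBootstrap() {
        return String.class.getClassLoader() == null;
    }

    public static void main(String[] args) {
        ClassLoader loader = ClassLoader.getSystemClassLoader();
        while (loader != null) {
            System.out.println(loader);
            loader = loader.getParent();
        }
        System.out.println("bootstrap loader (represented as null)");
        System.out.println(coreClassUsesBootstrap()); // prints true
    }
}
```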

Class Loader

  • Parent delegation model
    • Always try the parent class loader first; the parent relationship is built by composition rather than inheritance
    • Ensures the stability of the JVM

Class Loader

  • Customized ClassLoader
    • Preserve the parent delegation model
      • Override findClass() in java.lang.ClassLoader rather than loadClass()
    • Thread Context ClassLoader
      • setContextClassLoader in java.lang.Thread
    • OSGi

Class Execution

Run-time Stack Frame Structure

  • The frame on top of the stack is called the "Current Stack Frame"
    • Its method is called the "Current Method"
  • The size of a stack frame is determined at compile time

Run-time Stack Frame Structure

  • Local Variable Table (Array)
    • Keeps method parameters and local variables
    • Size is determined at compile time
  • The local variable table is organized in Variable Slots
    • Slots can be reused
    • Until a slot is reused by another variable, the original variable may stay connected to the GC Roots, which means the object it refers to cannot be collected until the slot is taken by another variable

Run-time Stack Frame Structure

  • Operand Stack
    • Last In First Out
    • An element can be of any Java data type, including long and double
    • The operand stack is used to evaluate arithmetic expressions and to pass parameters to other methods
  • This is why the JVM is called stack-based rather than register-based

Run-time Stack Frame Structure

  • Dynamic Linking
    • Each stack frame holds a reference to its method in the run-time constant pool
    • Conversion from symbolic reference to direct reference
      • Static linking, during class loading or on first use
      • Dynamic linking, at run-time

Run-time Stack Frame Structure

  • Return address
    • Normal method invocation completion: when a return instruction is met; whether there is a return value, and its type, is determined by the return opcode
      • Returns to the PC value of the caller, which is kept in the current stack frame
    • Abrupt method invocation completion: when an exception is thrown (for example by athrow) and no matching handler is found in the method's exception table; there is no return value in this case
      • The return address is determined by the exception handler table

Process of Method Call

  • Method calls can be very complicated, as a symbolic reference can only be converted to a direct reference during class loading or at run-time
  • The benefit of this complexity is powerful extensibility
  • Basic concepts
    • Resolution
    • Dispatch
    • Dynamic type language support

Process of Method Call

  • Resolution
    • Initially all callee methods are symbolic references in the constant pool
    • Some callees are converted into direct references during the resolution stage of class loading, if the target method is known before the program runs and cannot change at run-time. These are the non-virtual methods:
      • Static methods
      • Private methods
      • Instance constructors
      • Parent methods (called via super)
      • Final methods

Process of Method Call

  • Dispatch
    • Method overload resolution (static dispatch)
      • Based on the static type (the match may not be exact), and decided at compile time
      • Match priority: type widening > auto-boxing > parent class > varargs
    • Example
      • Suppose class Man extends class Human
      • For Human man = new Man()
      • Human is called the "Static type" or "Apparent type" of variable man; Man is called the "Actual type"
      • The static type is known at compile time; the actual type is known only at run-time
      • Overloading is a compile-time feature, which relies on the static type rather than the actual type
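
The Human / Man example can be written out as follows. The class name StaticDispatchDemo and the method sayHello are hypothetical. Although the object is a Man, the overload taking Human is selected because the compiler only sees the static type of the variable.

```java
class StaticDispatchDemo {
    static class Human {}
    static class Man extends Human {}

    static String sayHello(Human guy) {
        return "hello, human";
    }

    static String sayHello(Man guy) {
        return "hello, man";
    }

    static String demo() {
        Human man = new Man(); // static type Human, actual type Man
        return sayHello(man);  // overload chosen at compile time from the static type
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints hello, human
    }
}
```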

Process of Method Call

  • Dispatch
    • Dynamic dispatch
      • Based on actual type
      • Override
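  • Single Dispatch and Multiple Dispatch
    • Dispatch may depend on the method receiver, on the arguments, or on both
    • Static multiple dispatch: overload (depends on the static types of the receiver and the arguments)
    • Dynamic single dispatch: override (depends only on the actual type of the receiver)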
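
For contrast with overloading, here is the same pair of types with an overridden method. The names (DynamicDispatchDemo, greet) are hypothetical. The method that runs is chosen from the actual type of the receiver at run-time.

```java
class DynamicDispatchDemo {
    static class Human {
        String greet() {
            return "human";
        }
    }

    static class Man extends Human {
        @Override
        String greet() {
            return "man";
        }
    }

    static String demo() {
        Human h = new Man(); // static type Human, actual type Man
        return h.greet();    // resolved at run-time from the actual type
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints man
    }
}
```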

Process of Method Call

  • Implementation of dynamic dispatch (override)
    • Virtual method table or interface method table
    • Initialized during the linking stage of class loading, after class variables have been prepared

Dynamically Typed Language Support

  • Dynamically typed language
    • Type checking happens at run-time rather than at compile time
  • Originally intended to support dynamically typed languages on the JVM, later also used to implement lambda expressions
  • java.lang.invoke since JDK 7
    • JVM-level implementation
    • Lighter than Reflection
    • The operand of invokedynamic is no longer a symbolic reference to the target method; instead it is a new kind of constant that includes:
      • Bootstrap method
      • MethodType
      • Method name
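
The java.lang.invoke API can also be used directly from Java code. The sketch below looks up String.length() as a MethodHandle and invokes it; the class name MethodHandleDemo and the helper lengthOf are hypothetical. The MethodType describes the return and parameter types, much like a method descriptor.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class MethodHandleDemo {
    static int lengthOf(String s) {
        try {
            // Look up the virtual method String.length(), whose type is ()int.
            MethodHandle length = MethodHandles.lookup()
                    .findVirtual(String.class, "length", MethodType.methodType(int.class));
            // The receiver becomes the first argument of the handle: (String)int.
            return (int) length.invokeExact(s);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(lengthOf("hello")); // prints 5
    }
}
```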

Execution of Bytecode in Method

  • Instruction Set Architecture
    • Stack based
      • Simple, no dependency on hardware registers, generally lower in performance
      • JVM
    • Register based
      • Complex, depends directly on hardware, generally higher in performance
      • Most physical CPUs, and the Android Dalvik VM
  • View bytecode
    • javap

JVM

By hanyi8000