Multi-Dimensional Climate Applications
George Kierstein
Challenges and Solutions
Common Approaches
One Dimensional Solutions
Vertical Stack
Single-Use
'Reports' not applications
Incommensurable Data
(Multi-Generational Data Sets)
Period of record longer than any one person's career
Format changes over time
Custom formats that are typically poorly documented.
Relationships between data sets opaque
Alternative Approach
Climate Applications will need to leverage:
Distributed architectures designed for end-user applications
Modern Data Visualization best practices
Multiple data sets
Modern Toolchains and languages
Multi-dimensional Climate Data visualization using Clojure/ClojureScript
What is Clojure?
(Don't worry this won't hurt a bit)
It's a (mostly) pure functional Lisp that runs on the JVM
'Pure Functional Language' ?
- Basically it's just math
- Lambda Calculus, Alonzo Church 1936
G( F(X) )
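As a minimal sketch of what G( F(X) ) means in code: build a new function by composing two smaller ones. The helper `compose` and the toy functions `f` and `g` are illustrative assumptions, written in Python for brevity; Clojure ships the same idea as `comp`, so `(comp g f)` plays the role of `compose(g, f)` here.

```python
from functools import reduce


def compose(*fns):
    """Compose functions right to left, so compose(g, f)(x) == g(f(x))."""
    return reduce(lambda outer, inner: lambda x: outer(inner(x)), fns)


# Hypothetical toy functions, chosen only to make the order of application visible.
def f(x):
    return x + 1


def g(x):
    return x * 2


g_of_f = compose(g, f)  # f is applied first, then g
f_of_g = compose(f, g)  # g is applied first, then f
```

Because neither `f` nor `g` touches any state outside its arguments, the composed function can be reasoned about like an equation: the same input always gives the same output.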
So What?
Garbage collection was invented by John McCarthy around 1959 to abstract away manual memory management in Lisp
The REPL (read-eval-print loop) came out of Lisp environments in the 1960s and was popularized by the Lisp machines of the 70s and 80s
It's Expressive:
Code Survivability
The Imperative Model is breaking now more than ever:
- Even an older-model laptop has multiple cores, memory caches, optimization strategies, etc
- Distributed computing is a hard break
- Quantum computing is based on *qubits* (sometimes built from photons) and is right around the corner
- we can't stop ourselves
Code Survivability
Lisps
- Foundations in Mathematics
- Even if the JVM went poof
Lisps are an excellent fit for scientific computing and, perhaps, the best fit for generational-scale code survivability
The Data
CRN (U.S. Climate Reference Network)
SWDI (Severe Weather Data Inventory) - Hail
One Library of Note:
Mathbox
Steven Wittens
http://acko.net
(Screenshot)
Mathbox Usage
(Screenshot)
(Screenshot)
Dynamic Subsetting
Finding correlated subsets can be challenging
Typically done by custom code
Correlations calculated by hand
Adding new dimensions is time-consuming and requires a domain expert
Testing and maintenance in production are challenging
System: Distributed Logic Subsetting Engine
Declarative
Data set relationships described *once*
Logic-Solver finds the data you want on demand.
Batch/Stream engine fits modern end-user application pipelines
(Lambda Architectures)
(Advantages)
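To make "described once, found on demand" concrete, here is a minimal sketch, assuming two tiny in-memory data sets and a naive nested-loop matcher standing in for a real logic solver. The record fields, the `RELATIONS` table, the 0.25 degree tolerance, and the `subset` function are all hypothetical, not the engine described in this talk.

```python
# Hypothetical sample records: station observations and hail reports.
DATASETS = {
    "crn": [
        {"station": "A", "date": "2014-06-01", "lat": 35.0, "lon": -97.0, "temp_c": 31.2},
        {"station": "B", "date": "2014-06-01", "lat": 40.0, "lon": -105.0, "temp_c": 24.8},
    ],
    "hail": [
        {"date": "2014-06-01", "lat": 35.1, "lon": -97.1, "size_in": 1.5},
        {"date": "2014-06-02", "lat": 35.0, "lon": -97.0, "size_in": 0.75},
    ],
}

# The relationship between the two data sets is declared once, as data.
RELATIONS = {
    ("crn", "hail"): [
        lambda a, b: a["date"] == b["date"],             # same day
        lambda a, b: abs(a["lat"] - b["lat"]) <= 0.25,   # assumed tolerance
        lambda a, b: abs(a["lon"] - b["lon"]) <= 0.25,   # assumed tolerance
    ],
}


def subset(left, right, where=lambda a, b: True):
    """Return record pairs satisfying every declared rule plus an ad hoc filter."""
    rules = RELATIONS[(left, right)]
    return [
        (a, b)
        for a in DATASETS[left]
        for b in DATASETS[right]
        if all(rule(a, b) for rule in rules) and where(a, b)
    ]


matches = subset("crn", "hail")
large_hail = subset("crn", "hail", where=lambda a, b: b["size_in"] >= 2.0)
```

Each new question is just another call to `subset`; the rules linking the data sets are not rewritten per query, which is the point of the declarative approach.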
System: Distributed Logic Subsetting Engine
Clojure application using:
Onyx (declarative stream/batch pipeline)
Custom logic-solver as plugin
Proprietary equivalence engine for correlations between data sets
(Implementation)
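A declarative pipeline keeps the shape of the computation in data and the behavior in small functions. The sketch below imitates that split in Python under stated assumptions: `WORKFLOW` is a list of (from, to) edges, loosely modeled on how Onyx workflows are written as pairs of task names, while the task functions, the segment fields, and the `run` helper are hypothetical and are not the Onyx API.

```python
# The workflow is plain data: each pair is an edge (from_task, to_task).
WORKFLOW = [("read", "solve"), ("solve", "correlate"), ("correlate", "write")]


# Hypothetical tasks: each takes one segment (a dict) and returns a list of segments.
def solve(segment):
    # Stand-in for the logic solver: keep only segments with hail of 1 inch or more.
    return [segment] if segment.get("hail_size_in", 0) >= 1.0 else []


def correlate(segment):
    # Stand-in for the equivalence engine: tag the segment as correlated.
    return [{**segment, "correlated": True}]


TASKS = {"solve": solve, "correlate": correlate}


def run(workflow, tasks, source, sink, segments):
    """Push each segment from source along the workflow edges until it reaches sink."""
    successors = {}
    for src, dst in workflow:
        successors.setdefault(src, []).append(dst)

    output = []

    def push(task, segment):
        for nxt in successors.get(task, []):
            if nxt == sink:
                output.append(segment)
            else:
                for produced in tasks[nxt](segment):
                    push(nxt, produced)

    for segment in segments:
        push(source, segment)
    return output


results = run(
    WORKFLOW,
    TASKS,
    "read",
    "write",
    [{"id": 1, "hail_size_in": 1.75}, {"id": 2, "hail_size_in": 0.5}],
)
```

Because the workflow is data, adding a stage means adding an edge and a function, and the same description can in principle serve both batch and streaming input.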
Many Thanks!
GST Big Data Presentation
By gatewayspectacle