Managing and Optimizing Experiments on Cloud Computing Systems
Research Proposal
by Dmitry Duplyakin
PhD Student, University of Colorado
10/21/2016
Committee:
Jed Brown, Robert Ricci, Ken Anderson, Shivakant Mishra, Rick Han
Outline
Dmitry Duplyakin, University of Colorado
Research Proposal, Fall 2016
Different Ends of the Computing Spectrum
| | Large-Scale Computing | Small-Scale Computing |
|---|---|---|
| Cost | Highly subsidized | Unsubsidized |
| Resources | Dedicated machines | Limited allocations |
| Hardware | Redundant, special-purpose | Dual-purposed, commodity |
| Analysis | Demonstrate computing at the largest possible scale | Obtain the most knowledge out of available cycles |
Thesis Statement and Research Objectives
Thesis Statement:
Cloud computing offers mechanisms for utilizing on-demand computing resources. Coupled with sophisticated software tools built upon open-source projects, such resources can be efficiently managed and utilized according to user and application requirements.
Research Objectives:
"Flavors" of Computing and Infrastructure
Technology: Overview
Rebalancing [2013]
| Paper: | Rebalancing in a Multi-Cloud Environment |
| Presented at: | 4th Workshop on Scientific Cloud Computing (ScienceCloud) 2013, co-located with ACM HPDC 2013, New York, NY, USA, 2013 |
Problem Statement:
Environments that consist of compute resources provisioned at multiple clouds may need to be periodically rebalanced: some resources need to be terminated and replaced with different ones in order to best satisfy current user needs.
Automatic rebalancing is a non-trivial process.
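A minimal sketch of the bookkeeping behind one rebalancing step, assuming the environment is described only by per-cloud instance counts. The function name `rebalance_plan` and the cloud names are hypothetical and do not come from the paper. The sketch computes what to terminate and what to launch; deciding when a busy node can safely be terminated, which is the non-trivial part, is not modeled here.

```python
def rebalance_plan(current, desired):
    """Return (to_terminate, to_launch) as per-cloud instance counts.

    current, desired: dicts mapping a cloud name to a number of instances.
    Both names and the count-based model are illustrative assumptions.
    """
    to_terminate = {}
    to_launch = {}
    for cloud in sorted(set(current) | set(desired)):
        diff = desired.get(cloud, 0) - current.get(cloud, 0)
        if diff < 0:
            to_terminate[cloud] = -diff   # surplus instances to shut down
        elif diff > 0:
            to_launch[cloud] = diff       # missing instances to provision
    return to_terminate, to_launch


# Hypothetical environment spread over three clouds.
current = {"cloud_a": 6, "cloud_b": 2}
desired = {"cloud_a": 3, "cloud_b": 4, "cloud_c": 1}
terminate, launch = rebalance_plan(current, desired)
```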
Rebalancing [2013]
Metrics of interest:
Rebalancing [2013] - Opportunistic Policy
Rebalancing [2013] - Force Offline
Rebalancing [2013] - Tradeoffs
Rebalancing [2013] - Summary
Architecting [2015]
| Paper: | Architecting a Persistent and Reliable Configuration Management System |
| Presented at: | 6th Workshop on Scientific Cloud Computing (ScienceCloud) 2015, co-located with ACM HPDC 2015, Portland, OR, USA, 2015 |
| Poster: | Highly Available Cloud-Based Cluster Management, at 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid) 2015 |
Problem Statement:
Configuration management systems must remain operational in the majority of failure scenarios, since system administrators rely on them when performing recovery actions.
Proposal: investigate use of clouds
Architecting [2015]
Developed prototype:
Architecting [2015]
Leveraged Technologies:
Configuration Management [2016]
| Paper: | Introducing Configuration Management Capabilities into CloudLab Experiments |
| Presented at: | The International Workshop on Computer and Network Experimental Research Using Testbeds (CNERT), co-located with IEEE INFOCOM, San Francisco, CA, USA, 2016. Received the Best Paper Award. |
Configuration Management [2016]
Testbeds: provide isolated and recreatable environments
Configuration Management [2016]
Experiments: require building software environments
Configuration Management [2016]
Experiments: require building software environments... on many nodes
Configuration Management [2016]: Common Workflows
Configuration Management [2016]
Problem: Explosion of Configurations!
(snapshot- and simple script-based approaches don’t scale)
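A small arithmetic illustration of why the number of configurations explodes: if every combination of choices needs its own snapshot or script, the count is the product of the options per dimension. The dimensions and values below are hypothetical and chosen only for illustration.

```python
from itertools import product
from math import prod

# Hypothetical experiment dimensions; none of these values come from the slides.
dimensions = {
    "os_image": ["ubuntu14", "ubuntu16", "centos7"],
    "compiler": ["gcc", "clang"],
    "mpi": ["openmpi", "mpich"],
    "app_version": ["v1", "v2", "v3", "v4"],
}

# One snapshot per combination: the count is the product of the option counts.
n_configs = prod(len(options) for options in dimensions.values())

# Enumerating the combinations explicitly gives the same number.
all_configs = list(product(*dimensions.values()))
```

Adding one more dimension with k options multiplies the total by k, which is why per-configuration snapshots and ad hoc scripts become hard to maintain.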
Configuration Management [2016]
Configuration Management [2016]: Summary
Active Learning [2016]
| Paper: | Active Learning in Performance Analysis |
| Presented at: | IEEE Cluster 2016, Taipei, Taiwan, 2016 |
Motivation
Dmitry Duplyakin, University of Colorado
Active Learning in Performance Analysis
09/14/2016
Motivation: Example 1
Each point represents a run of the HPGMG-FE benchmark on a 4-node cluster provisioned on the CloudLab testbed
Motivation: Example 2
Each point represents a run of the HPGMG-FE benchmark on a 4-node cluster provisioned on the CloudLab testbed
Approach: Active Learning
Approach: Gaussian Process Regression
Approach: Putting it Together
Upper: AL. Choose "best" experiment
Lower: GPR. Choose "best" hyperparameters
Approach: Details
Upper: Choose "best" experiment
Consider strategies:
Variance Reduction (VR):
Cost Efficiency (CE):
Lower: Choose "best" hyperparameters
Use: Bayesian Model Selection (Marginal Likelihood Maximization) with 3 hyperparameters: noise level, length scale, and amplitude
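A minimal sketch of the two levels, under stated assumptions: a zero-mean GP with a squared-exponential kernel (one kernel that has exactly the three hyperparameters named above), a coarse grid search standing in for a proper marginal likelihood optimizer, and Variance Reduction read as "pick the candidate with the largest predictive variance". All function names and data values are hypothetical, and the Cost Efficiency strategy is not modeled.

```python
import math


def sq_exp_kernel(x1, x2, amplitude, length_scale):
    """Squared-exponential covariance between two scalar inputs."""
    return amplitude ** 2 * math.exp(-0.5 * ((x1 - x2) / length_scale) ** 2)


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def forward_solve(low, b):
    """Solve low @ z = b for z, where low is lower-triangular."""
    z = []
    for i, row in enumerate(low):
        z.append((b[i] - sum(row[k] * z[k] for k in range(i))) / row[i])
    return z


def fit(xs, ys, amplitude, length_scale, noise):
    """Factor K + noise^2 I; return (factor, log marginal likelihood)."""
    n = len(xs)
    k = [[sq_exp_kernel(xs[i], xs[j], amplitude, length_scale)
          + (noise ** 2 if i == j else 0.0) for j in range(n)] for i in range(n)]
    low = cholesky(k)
    z = forward_solve(low, ys)  # z = L^-1 y, so y^T K^-1 y = z . z
    log_det = 2.0 * sum(math.log(low[i][i]) for i in range(n))
    lml = (-0.5 * sum(v * v for v in z) - 0.5 * log_det
           - 0.5 * n * math.log(2.0 * math.pi))
    return low, lml


def predictive_variance(x_new, xs, low, amplitude, length_scale):
    """Posterior variance of the latent function at x_new."""
    k_star = [sq_exp_kernel(x_new, x, amplitude, length_scale) for x in xs]
    v = forward_solve(low, k_star)
    return amplitude ** 2 - sum(t * t for t in v)


def choose_hyperparameters(xs, ys, grid):
    """Lower level: grid point with the highest log marginal likelihood."""
    return max(grid, key=lambda h: fit(xs, ys, *h)[1])


def choose_next_experiment(xs, ys, candidates, grid):
    """Upper level (VR): candidate with the largest predictive variance."""
    amplitude, length_scale, noise = choose_hyperparameters(xs, ys, grid)
    low, _ = fit(xs, ys, amplitude, length_scale, noise)
    return max(candidates, key=lambda c: predictive_variance(
        c, xs, low, amplitude, length_scale))


# Toy data: one input parameter and made-up log-runtimes.
xs = [0.0, 1.0, 2.0]
ys = [1.0, 1.8, 2.9]
candidates = [0.5, 1.5, 5.0]
grid = [(a, l, n) for a in (1.0, 3.0) for l in (0.5, 1.0, 2.0) for n in (0.05, 0.2)]
next_x = choose_next_experiment(xs, ys, candidates, grid)
```

In this toy setup the chosen candidate is the one farthest from the existing measurements, which is the expected behavior of a variance-driven strategy: it explores where the model is least certain.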
Implementation
* Note: "offline" means that the prototype queries a database of previously collected data. Future work: in online mode, run AL alongside the computation.
Analyzed Datasets
Active Learning: 10 Iterations
The points shown represent a subset of measurements in the Performance dataset; runtimes are log-transformed
Active Learning: 100 Iterations
The points shown represent a subset of measurements in the Performance dataset; runtimes are log-transformed
Evaluation: Convergence Analysis
Evaluation: Cost Analysis
Active Learning [2016]: Summary
Technology: Recap
Technology: Future Work
Future Work: Cooperative Scheduling
Context: two systems with different loads (clusters, clouds, testbeds,...)
Goal: they "exchange" resources when possible
Proposed interface: exchange preemption vectors
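The slides do not define the format of a preemption vector. The sketch below assumes one plausible reading: each system publishes an ordered list of node IDs, cheapest to preempt first, and the other system takes nodes from the front of that list. The function names and the cost values are hypothetical.

```python
def build_preemption_vector(node_costs):
    """Order node IDs from cheapest to most expensive to preempt.

    node_costs maps a node ID to an estimated cost of preempting it now
    (an assumed metric, e.g. node-hours of work that would be lost).
    Ties are broken by node ID so the ordering is deterministic.
    """
    return sorted(node_costs, key=lambda node: (node_costs[node], node))


def select_for_transfer(preemption_vector, n_requested):
    """Take the first n_requested nodes from the lender's preemption vector."""
    return preemption_vector[:n_requested]


# Hypothetical lost-work estimates (node-hours) for the lending system.
costs = {"node1": 4.0, "node2": 0.0, "node3": 1.5, "node4": 0.0}
vector = build_preemption_vector(costs)
granted = select_for_transfer(vector, 3)
```

Keeping the interface to an ordered list means neither system has to expose its scheduler internals; each side only states which nodes it would give up first.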
Cooperative Scheduling: Prototype
(elastic experiment on Apt)
Cooperative Scheduling: Summary
Potential Targets:
Future Work: AL and AMR
Adaptive Mesh Refinement (AMR):
Source: http://math.boisestate.edu/~calhoun/www_personal/research/amr_software
Proposal: apply AL to AMR problems
Common Approaches to AMR
Tree-based
Block structured
(patch-based)
Source: http://www.training.prace-ri.eu/uploads/tx_pracetmo/AMRIntroHNDSCi15.pdf
Examples of AMR Software
Sources:
http://math.boisestate.edu/~calhoun/www_personal/research/amr_software/
https://www.youtube.com/watch?v=DKn9iuD7Ihk
https://arxiv.org/pdf/1308.1472v1.pdf
Clawpack and GeoClaw (based on AMRClaw)
p4est and ForestClaw
AL and AMR
Proposal: apply AL to AMR problems
AL and AMR: Proposed Study
Potential Targets:
Proposed Timeline
Summary
Other Contributions
Thank you!
Questions?