Frans Ojala: frans.ojala@helsinki.fi
Helsinki University
A standard or point of reference against which things may be compared.
A problem designed to evaluate the performance of a computer system.
Benchmark
- Oxford dictionary
Benchmark
Premise 1: we need compute power, storage, network bandwidth etc., and all of it costs money.
- How can we get the best system for the capital we have?
Premise 2: different vendors have different systems that have different properties.
- How can we compare the systems?
- CPU speed
- Disk I/O
- Memory I/O
- Network speed, latency
- Database throughput
- Application throughput
- Holistic system performance under load
Benchmark
Easy! Build an application that mimics the desired load and build a scoreboard!
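A minimal sketch of the "mimic the load, build a scoreboard" idea, in Python. Everything here is an illustrative assumption and not part of the slides: the names run_benchmark, build_scoreboard and synthetic_load are hypothetical, and the toy workload stands in for whatever load the real application would generate. Each system under comparison would run the same workload; the scoreboard then ranks the reported times.

```python
import platform
import time


def run_benchmark(workload, repeats=5):
    """Run the workload several times and return the best wall-clock time in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        workload()
        timings.append(time.perf_counter() - start)
    # The minimum is the run least disturbed by other activity on the machine.
    return min(timings)


def build_scoreboard(results):
    """Rank systems by elapsed time, fastest first.

    `results` maps a system name to its measured time in seconds.
    """
    return sorted(results.items(), key=lambda entry: entry[1])


def synthetic_load():
    """Hypothetical stand-in for the desired load: some CPU and memory work."""
    data = [i * i for i in range(100_000)]
    return sum(sorted(data, reverse=True))


if __name__ == "__main__":
    # Only this machine is measured here; results from other systems would be
    # collected the same way and merged into the same dictionary.
    results = {platform.node() or "this-machine": run_benchmark(synthetic_load)}
    for rank, (system, seconds) in enumerate(build_scoreboard(results), start=1):
        print(f"{rank}. {system}: {seconds:.4f} s")
```

Even this small sketch hints at why it is not so easy: the ranking is only as meaningful as the match between synthetic_load and the real workload.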
Benchmark
Introduction and motivation
A brief history of benchmarking
Case example in Cloud benchmarking: BigBench
Case example in ranking service providers: SMICloud
Summary of key concepts
Industry standards
Overview
Ghazal, Ahmad, et al. "BigBench: towards an industry standard benchmark for big data analytics." Proceedings of the 2013 ACM SIGMOD international conference on Management of data. ACM, 2013.
Rabl, Tilmann, et al. "A data generator for cloud-scale benchmarking." Performance Evaluation, Measurement and Characterization of Complex Systems. Springer Berlin Heidelberg, 2010. 41-56.
Proposed metric:
Ghazal, Ahmad, et al. "BigBench: towards an industry standard benchmark for big data analytics." Proceedings of the 2013 ACM SIGMOD international conference on Management of data. ACM, 2013.
TPCx-BB: http://www.tpc.org/tpcx-bb/default.asp
Garg, Saurabh Kumar, Steve Versteeg, and Rajkumar Buyya. "A framework for ranking of cloud computing services." Future Generation Computer Systems 29.4 (2013): 1012-1023.
SMICloud: http://www.csmic.org
Hwang, Kai, et al. "Cloud Performance Modeling with Benchmark Evaluation of Elastic Scaling Strategies." Parallel and Distributed Systems, IEEE Transactions on 27.1 (2016): 130-143.
(Does not exist yet)
Small; fits in cache
Obsolete instruction mix
Uncontrolled source code
Prone to compiler tricks
Short runtimes on modern machines
Single-number performance characterization with a single benchmark
Difficult to reproduce results (short runtime and low-precision UNIX timer)
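The reproducibility problem in the last point (short runtime measured with a coarse timer) can be illustrated with a short Python sketch. It is an assumption-laden example, not something from the slides: timer_resolution and time_per_call are hypothetical helper names, and the 0.2 second threshold is an arbitrary choice. The idea is to repeat a short workload until the total measured time is far larger than the timer resolution, then report the average time per call.

```python
import time


def timer_resolution():
    """Return the resolution, in seconds, that Python reports for perf_counter."""
    return time.get_clock_info("perf_counter").resolution


def time_per_call(func, min_total_seconds=0.2):
    """Estimate the time of one call to `func`.

    The loop count is doubled until one timed batch lasts at least
    `min_total_seconds`, so that timer granularity becomes negligible.
    Returns (seconds_per_call, loops_used).
    """
    loops = 1
    while True:
        start = time.perf_counter()
        for _ in range(loops):
            func()
        elapsed = time.perf_counter() - start
        if elapsed >= min_total_seconds:
            return elapsed / loops, loops
        loops *= 2


if __name__ == "__main__":
    per_call, loops = time_per_call(lambda: sum(range(1000)))
    print(f"timer resolution: {timer_resolution():.1e} s")
    print(f"{per_call:.2e} s per call, averaged over {loops} calls")
```

A single timed call of such a short workload would mostly measure timer granularity and noise; batching trades a longer benchmark run for a more stable number.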
Courtesy of NASA, NAS: https://www.nas.nasa.gov/publications/gallery.html