Intro to the Elastic Stack
Search, Visualization & Analytics

Agenda
- Business Environment & Challenges
- What is the Elastic Stack
- Demos
- Takeaways
- Q & A
What Questions Does Your Business Have?
- How many users signed up?
- How successful is our ad campaign?
- When should we schedule maintenance windows?
- Is someone trying to hack my website?
- Which feature is the most/least used?
Some Challenges
- Diverse and distributed architecture
- TMI!
- Inconsistencies abound
- Experts required
Enter Elastic Stack
- Collects and consolidates all kinds of data
- Enriches it and gives it meaning
- Makes it searchable (and fast)
- Puts it in the hands of the people who need it
What fresh new hell is this?
- Logstash
- Elasticsearch
- Kibana

Logstash
- Ingests data from many different sources
- Parses, transforms and enriches data on-the-fly
- Pushes data to many different destinations
Logstash Ingestion Plug-Ins
- Elastic Beats Framework
- Salesforce (SOQL)
- Databases (Redis, CouchDB, MongoDB, SQLite, JDBC)
- Files (CSV, generic file)
- Events (syslog, Graylog2, Kafka, RabbitMQ)
- AWS (S3, SQS, Kinesis)
- Protocols (HTTP, TCP, UDP, STOMP, WebSocket)
- Social (Twitter, Jabber)
Logstash Filter Plug-ins
- Aggregation
- Anonymization
- Translation
- User-Agent
- GeoIP
- Key-Value Pairs
- Grok
- JSON
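To make the Grok and Key-Value filters concrete, here is a minimal Python sketch of the kind of work they do: turning a raw log line into a structured event. This is not Logstash itself and not its grok syntax. The log line, the field names and the `parse_line` helper are all assumptions made for illustration.

```python
import re

# Hypothetical web access-log line (assumed format, for illustration only).
LINE = ('203.0.113.7 - - [10/Oct/2016:13:55:36 +0000] '
        '"GET /signup?plan=pro&ref=ad42 HTTP/1.1" 200 2326')

# Named groups play roughly the role of grok's named captures.
PATTERN = re.compile(
    r'(?P<client_ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<verb>\w+) (?P<request>\S+) HTTP/(?P<http_version>[\d.]+)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+)'
)

def parse_line(line):
    """Return a structured event dict, or None if the line does not match."""
    match = PATTERN.match(line)
    if match is None:
        return None
    event = match.groupdict()
    # Type conversion, similar in spirit to a mutate/convert step.
    event["status"] = int(event["status"])
    event["bytes"] = int(event["bytes"])
    # Key-value step: split the query string into individual fields.
    path, _, query = event["request"].partition("?")
    event["path"] = path
    event["params"] = dict(
        pair.split("=", 1) for pair in query.split("&") if "=" in pair
    )
    return event

event = parse_line(LINE)
print(event)
```

Once a line is structured like this, fields such as `status` or `params` can be searched and aggregated individually instead of being buried in free text.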
Logstash Output Plug-Ins
- Elasticsearch
- Solr
- Events (RabbitMQ, Redis, Kafka, ZeroMQ)
- Monitoring and Alerting (pagers, email, Zabbix, Riemann)
- AWS (CloudWatch, SNS, SQS)
- HDFS
Elasticsearch
- Heart of the Elastic Stack
- Originally built by Shay Banon on top of Apache Lucene
- Distributed, RESTful search and analytics engine
- Massively scalable and cloud-ready
- AWS has infrastructure for spinning up a cluster
- Enterprise-ready (secure, resilient, reliable, monitored)
- Lots of ways to work with it
Elasticsearch APIs
- Built-in intuitive RESTful API
- Many client libraries (Java, .NET, R, Scala, Python, JavaScript, PHP)
- Native integration with Hadoop (Spark, Hive, Pig, Storm, MapReduce, Cascading)
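As a rough illustration of the RESTful API, the sketch below builds (but does not send) a search request using only the Python standard library. The base URL assumes an unsecured node on localhost port 9200, and the `signups` index name and `build_search_request` helper are hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9200"  # assumption: local, unsecured node
INDEX = "signups"                   # hypothetical index name

def build_search_request(index, body):
    """Build a POST request for the index's _search endpoint."""
    return urllib.request.Request(
        url=f"{BASE_URL}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

request = build_search_request(INDEX, {"query": {"match_all": {}}})
print(request.get_method(), request.full_url)

# With a cluster running, the request could be sent like this:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

The official client libraries wrap the same HTTP calls, so anything that can speak HTTP and JSON can work with the cluster.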
Life Inside a Cluster
- Indexes, Shards & Documents
- Redundancy via Replicas
- Easily Horizontally Scalable
- Master, Data, Ingest and Tribe nodes
- Tools for Monitoring and Security
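To show how shards and replicas relate, here is a small sketch of an index settings body plus the arithmetic behind it. The shard and replica counts are arbitrary example values, and the helper names are made up for this sketch.

```python
import json

def index_settings(primary_shards, replicas):
    """Settings body that could be sent when creating an index."""
    return {
        "settings": {
            "number_of_shards": primary_shards,
            "number_of_replicas": replicas,
        }
    }

def total_shard_copies(primary_shards, replicas):
    """Each primary shard has `replicas` extra copies spread across nodes."""
    return primary_shards * (1 + replicas)

body = index_settings(primary_shards=3, replicas=1)
print(json.dumps(body, indent=2))
print(total_shard_copies(3, 1))  # 3 primaries + 3 replica copies = 6
```

With one replica per primary, the cluster can lose a node holding a shard and still serve that shard's data from its copy on another node.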
Some Really Cool Features
- Queries (relevance)
- Filters (exact value match, boolean)
- Percolator
- Full-Text Search
- Bucket Aggregations (date histogram, ranges, terms)
- Metrics Aggregations (top hits, max/min/avg, percentiles, stats)
- Geolocation
"Show me restaurants that mention vitello tonnato, are within a 5-minute walk, and are open at 11 p.m., and then rank them by a combination of user rating, distance, and price."
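One way the restaurant question might translate into the query DSL is sketched below: full-text matching and filters decide which documents qualify, and a `function_score` query blends rating, distance and price into the ranking. Every field name (`menu_text`, `location`, `closes_at_hour`, `user_rating`, `price`), the 400 m walking radius and the decay scales are assumptions, and the opening-hours check is deliberately simplified.

```python
import json

def restaurant_query(dish, lat, lon, walk_radius="400m", open_hour=23):
    """Build a search body; all field names here are hypothetical."""
    here = {"lat": lat, "lon": lon}
    matching = {
        "bool": {
            # Full-text part: affects relevance.
            "must": [{"match_phrase": {"menu_text": dish}}],
            # Filters: yes/no checks that do not affect the score.
            "filter": [
                {"geo_distance": {"distance": walk_radius, "location": here}},
                # Simplification: ignores venues that close after midnight.
                {"range": {"closes_at_hour": {"gte": open_hour}}},
            ],
        }
    }
    ranking = [
        {"field_value_factor": {"field": "user_rating", "modifier": "log1p"}},
        {"gauss": {"location": {"origin": here, "scale": "200m"}}},
        {"gauss": {"price": {"origin": 0, "scale": 20}}},
    ]
    return {
        "query": {
            "function_score": {
                "query": matching,
                "functions": ranking,
                "score_mode": "multiply",
            }
        }
    }

body = restaurant_query("vitello tonnato", lat=40.7128, lon=-74.0060)
print(json.dumps(body, indent=2))
```

The split between `must` and `filter` mirrors the Queries and Filters bullets: the first contributes to relevance, the second only includes or excludes.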
Some Query Examples
Basic Search & Count
Sample Aggregation
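The query slides are not reproduced in this outline, so the following is only a sketch of what a basic search, a count and a sample aggregation might look like as request bodies. The field names (`message`, `feature.keyword`, `duration_ms`) are hypothetical. The aggregation is aimed at the earlier question "Which feature is the most/least used?".

```python
import json

def search_body(text, size=10):
    """Basic full-text search, for the _search endpoint."""
    return {"query": {"match": {"message": text}}, "size": size}

def count_body(search):
    """The _count endpoint takes only the query part of a search."""
    return {"query": search["query"]}

def feature_usage_body():
    """Bucket events by feature, then compute a metric inside each bucket."""
    return {
        "size": 0,  # return only aggregation results, no individual hits
        "aggs": {
            "by_feature": {
                "terms": {"field": "feature.keyword"},
                "aggs": {
                    "avg_duration": {"avg": {"field": "duration_ms"}}
                },
            }
        },
    }

search = search_body("checkout failed")
for body in (search, count_body(search), feature_usage_body()):
    print(json.dumps(body))
```

A `terms` aggregation returns one bucket per distinct value with its document count, which gives a direct usage ranking.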
Analytics Workloads
- Aggregations
- Bucketing
- Metric
- Pipeline
- Integration with tools like Spark
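The three aggregation families can be combined in one request. The sketch below buckets sign-ups per day, sums a metric per bucket, and adds a pipeline aggregation that computes the day-over-day change in the bucket count. The field names (`signed_up_at`, `revenue`) are hypothetical, and the `calendar_interval` key assumes a recent Elasticsearch version (older releases used `interval`).

```python
import json

def signup_trend_body(date_field="signed_up_at", amount_field="revenue"):
    """Bucket + metric + pipeline aggregation in one request body."""
    return {
        "size": 0,
        "aggs": {
            "per_day": {
                # Bucketing: one bucket per calendar day.
                "date_histogram": {
                    "field": date_field,
                    "calendar_interval": "day",
                },
                "aggs": {
                    # Metric: computed over the documents in each bucket.
                    "daily_revenue": {"sum": {"field": amount_field}},
                    # Pipeline: computed over the output of other
                    # aggregations; "_count" is each bucket's doc count.
                    "signup_change": {
                        "derivative": {"buckets_path": "_count"}
                    },
                },
            }
        },
    }

body = signup_trend_body()
print(json.dumps(body, indent=2))
```

This shape answers "How many users signed up?" per day and also shows whether the number is rising or falling.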
Kibana
- Visualization tool for Elasticsearch data
- Basic visualization (histograms, pie charts, line graphs, etc.)
- Geo Data
- Time Series
- Graph Data (relationships)
- Dashboards and Reports
Sample Dashboard

Who Uses This Stuff?
- eBay
- Goldman Sachs
- Symantec
- Cisco
- Netflix
- Microsoft
- IBM
- GitHub
- Mayo Clinic
- USAA
- FICO
- Salesforce
- CERN
- Adobe
- Home Depot
- Activision
- SoundCloud
- Yale University
- Verizon
https://www.elastic.co/use-cases
Some Demos
Takeaways
- Integrates with things we work with
- Salesforce, .NET, SQL Server, IIS, Power BI
- Custom applications should emit operational data
- Opportunity to introduce DevOps to customers
- Far less expensive than competitors like Splunk
- Great alternative to OLAP cubes and traditional data warehousing
- Commodity hardware or run in cloud
- More agility
- Supports Big Data Initiatives
- Integrates with Hadoop (but is faster)
- Supports data science
Q & A
Thank You!
By Jason R. Foster
Brief introduction to Elasticsearch, Logstash and Kibana