Sprint 42 Review
Spark/EMR
- Gained a better understanding of processing Kinesis streams
- Gained a better understanding of logging and of configuring logging within EMR
- Created repeatable steps for dev for launching a cluster with Spark
- Tested writing of individual records to:
- S3
- Elasticsearch
- HDFS
- Local Filesystem
CIS Architecture
- Created document for CIS Architecture
- Shows changes to Dovetail System (domain events and pub/sub subsystem)
- Shows changes to Connect Insights for Students
- Several meetings have achieved a general consensus
- A "live query" approach is being considered instead of a CIS pipeline.
- Initial testing will start soon to determine whether this is feasible.
- Testing needs sample data loaded into a production-scale ES cluster that is actively indexing data
MHE Metrics
https://github.mheducation.com/MHEducation/mhe-aws-metrics
- A general purpose library to support the recording of customer metrics
- Support for custom metrics
- Caches metrics for bulk writes (saves costs)
- Added to the dvtl-input-api
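A minimal sketch of the "cache, then bulk write" idea, not the mhe-aws-metrics API. The class name `BufferedMetrics`, its methods, the `send` callback, and the batch size are all hypothetical and only illustrate how buffering turns many single writes into a few batched ones.

```python
from typing import Callable, Dict, List

Metric = Dict[str, object]


class BufferedMetrics:
    """Hypothetical recorder: holds metrics in memory and sends them in batches."""

    def __init__(self, send: Callable[[List[Metric]], None], batch_size: int = 20):
        if batch_size < 1:
            raise ValueError("batch_size must be at least 1")
        self._send = send              # assumed: one call here == one bulk write
        self._batch_size = batch_size  # assumed limit, not taken from the slides
        self._pending: List[Metric] = []

    def record(self, name: str, value: float, unit: str = "Count") -> None:
        """Queue one metric; flush automatically once the batch is full."""
        self._pending.append({"name": name, "value": value, "unit": unit})
        if len(self._pending) >= self._batch_size:
            self.flush()

    def flush(self) -> None:
        """Send whatever is queued as a single batch (no-op when empty)."""
        if not self._pending:
            return
        batch, self._pending = self._pending, []
        self._send(batch)


# Usage: seven metrics with a batch size of three produce three bulk writes.
sent_batches: List[List[Metric]] = []
metrics = BufferedMetrics(send=sent_batches.append, batch_size=3)
for _ in range(7):
    metrics.record("events_received", 1.0)
metrics.flush()  # push the final partial batch
```

In a real service the final `flush()` would likely run on shutdown or on a timer so that a partial batch is not lost.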
Caliper Utility Library
https://github.mheducation.com/MHEducation/mhe-caliper-utils
- Created a small library to perform operations on one or many Caliper events simultaneously
- getVersion(events) - Identifies Caliper events as 1.0beta or 1.0
- getShortName(events) - Returns a short name used to refer to the event
- upgradeVersion(events, version) - Updates a collection of events to the specified version
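A rough sketch of how "one or many events" helpers like these could be shaped, not the mhe-caliper-utils implementation. The Python names mirror the slide's functions in snake_case. Two things are assumptions: the `version` field used to tell versions apart is a placeholder (the slides do not describe the real detection rule), and the short name is assumed to be the last path segment of the event's `@type` URI. Each helper always returns a list for simplicity.

```python
from typing import Dict, List, Union

Event = Dict[str, object]
Events = Union[Event, List[Event]]

VERSION_FIELD = "version"  # hypothetical marker; the real detection rule is not in the slides


def _as_list(events: Events) -> List[Event]:
    """Normalize a single event or a list of events to a list."""
    return events if isinstance(events, list) else [events]


def get_version(events: Events) -> List[str]:
    """Return each event's version marker, or "unknown" when it is absent."""
    return [str(e.get(VERSION_FIELD, "unknown")) for e in _as_list(events)]


def get_short_name(events: Events) -> List[str]:
    """Return the last path segment of each event's @type URI (assumed rule)."""
    names = []
    for event in _as_list(events):
        type_uri = str(event.get("@type", ""))
        names.append(type_uri.rstrip("/").rsplit("/", 1)[-1])
    return names


def upgrade_version(events: Events, version: str) -> List[Event]:
    """Return copies stamped with the target version; inputs are left untouched.

    A real upgrade would also rewrite event fields; this sketch only sets the marker.
    """
    return [{**e, VERSION_FIELD: version} for e in _as_list(events)]


# Usage with an illustrative event.
sample = {
    "@type": "http://purl.imsglobal.org/caliper/v1/NavigationEvent",
    "version": "1.0beta",
}
short_names = get_short_name(sample)
upgraded = upgrade_version([sample], "1.0")
```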
Amazon Elasticsearch
- Experienced problems in Production where memory constraints took down the Audit API
- Will attempt to upgrade the instances for this cluster this week
Dovetail Pipeline
- Created a Kinesis/Lambda version of the Spark/EMR pipeline we have been investigating
- Detects older versions of events
- Upgrades events to version 1.0
- Indexes events to Elasticsearch
- Adds audit record
- Coming soon
- SNS publishing
- Custom metrics
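The implemented steps above (detect, upgrade, index, audit) can be sketched as a Lambda handler over a Kinesis batch. This is an illustration under assumptions, not the team's code: `make_handler` and the four injected callables are hypothetical stand-ins for the version check, the upgrade, the Elasticsearch write, and the audit write. The only AWS-defined part is the event shape, where each record carries its payload base64-encoded under `kinesis.data`.

```python
import base64
import json
from typing import Callable, Dict, Optional

Event = Dict[str, object]


def make_handler(
    get_version: Callable[[Event], str],
    upgrade: Callable[[Event, str], Event],
    index: Callable[[Event], None],
    audit: Callable[[Event], None],
    target_version: str = "1.0",
):
    """Build a handler from injected steps (all four callables are hypothetical)."""

    def handler(event: Dict[str, object], context: Optional[object] = None) -> Dict[str, int]:
        processed = 0
        for record in event.get("Records", []):
            # Kinesis delivers each payload to Lambda as base64-encoded bytes.
            payload = base64.b64decode(record["kinesis"]["data"])
            caliper_event = json.loads(payload)
            # Detect older versions and upgrade them before indexing.
            if get_version(caliper_event) != target_version:
                caliper_event = upgrade(caliper_event, target_version)
            index(caliper_event)  # stand-in for the Elasticsearch write
            audit(caliper_event)  # stand-in for the audit record
            processed += 1
        return {"processed": processed}

    return handler


# Usage with in-memory stubs in place of real services.
indexed, audited = [], []
handler = make_handler(
    get_version=lambda e: str(e.get("version", "unknown")),
    upgrade=lambda e, v: {**e, "version": v},
    index=indexed.append,
    audit=lambda e: audited.append({"status": "indexed", "version": e["version"]}),
)
data = base64.b64encode(json.dumps({"version": "1.0beta"}).encode("utf-8")).decode("ascii")
result = handler({"Records": [{"kinesis": {"data": data}}]})
```

Injecting the steps keeps the handler testable without a cluster, and SNS publishing or custom metrics could later be added as further injected steps.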
Multiple Indices ES Cluster
- Documented strategy
- Socialized the strategy with Richard
Output/Query API
- Pass-through implementation of ES query
- Utilizes the multi-indices alias
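A small sketch of what a pass-through could look like: the caller's query body is forwarded unchanged to the `_search` endpoint of an alias that spans the indices. The function name, the base URL, and the alias name `events-all` are hypothetical; only the `/<alias>/_search` path is standard Elasticsearch.

```python
import json
from typing import Dict, Tuple
from urllib.parse import quote


def build_search_request(
    base_url: str, alias: str, query_body: Dict[str, object]
) -> Tuple[str, bytes]:
    """Return the URL and JSON body for a search against an index alias.

    The query body is serialized as-is; nothing is added or rewritten.
    """
    url = f"{base_url.rstrip('/')}/{quote(alias, safe='')}/_search"
    return url, json.dumps(query_body).encode("utf-8")


# Usage: the host and alias below are placeholders, not real endpoints.
query = {"query": {"match_all": {}}, "size": 10}
url, body = build_search_request("https://es.example.internal/", "events-all", query)
```

Because callers only see the alias, the indices behind it can be added or rotated without changing the API.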
By James Cook