1. API: emits a change notification {CREATED, UPDATED, DELETED}
2. Kafka: stores change notifications
3. Live Indexer: consumes change notifications
4. Live Indexer: expands each change notification to the full object
5. Elasticsearch: indexes the object
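Steps 3 to 5 can be sketched roughly as below. This is a minimal sketch, not the actual Live Indexer: the notification shape (`type`, `id`), the index name `objects`, and the `fetch_object` callback are all assumptions made for illustration. The output dicts use the action format accepted by the bulk helpers in the `elasticsearch` Python client; the Kafka consumer loop itself is left out.

```python
from typing import Callable, Iterable, Iterator, Optional

INDEX_NAME = "objects"  # hypothetical index name


def to_bulk_action(
    notification: dict,
    fetch_object: Callable[[int], Optional[dict]],
) -> dict:
    """Turn one change notification into one Elasticsearch bulk action."""
    object_id = notification["id"]
    delete_action = {"_op_type": "delete", "_index": INDEX_NAME, "_id": object_id}

    if notification["type"] == "DELETED":
        return delete_action

    # CREATED / UPDATED: expand the notification to the full object (step 4).
    document = fetch_object(object_id)
    if document is None:
        # Assumption: if the object is gone by the time we expand it,
        # remove it from the index as well.
        return delete_action

    return {
        "_op_type": "index",
        "_index": INDEX_NAME,
        "_id": object_id,
        "_source": document,
    }


def expand_notifications(
    notifications: Iterable[dict],
    fetch_object: Callable[[int], Optional[dict]],
) -> Iterator[dict]:
    """Lazily map a stream of notifications to bulk actions."""
    for notification in notifications:
        yield to_bulk_action(notification, fetch_object)
```

In a real consumer, `notifications` would be the decoded Kafka messages and `fetch_object` would load the object from the primary store; the resulting actions could then be handed to a bulk helper.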
Cron job that reads objects from MySQL over an unbuffered connection and streams them into Elasticsearch.
Indexing rate: ~9k/sec => ~3 hrs
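A rough sketch of such a bulk indexer, under stated assumptions: the source does not say which language or libraries the cron job uses, so PyMySQL's `SSCursor` (its unbuffered cursor) and `streaming_bulk` from the `elasticsearch` client are stand-ins. The table, columns, credentials, and index name are hypothetical.

```python
from typing import Iterable, Iterator, Tuple


def rows_to_actions(
    rows: Iterable[Tuple[int, str]],
    index_name: str = "objects",  # hypothetical index name
) -> Iterator[dict]:
    """Lazily convert (id, name) rows into Elasticsearch bulk actions."""
    for object_id, name in rows:
        yield {"_index": index_name, "_id": object_id, "_source": {"name": name}}


def run_bulk_index() -> None:
    # Imported lazily so the pure helper above works without these packages.
    import pymysql
    import pymysql.cursors
    from elasticsearch import Elasticsearch
    from elasticsearch.helpers import streaming_bulk

    # SSCursor does not buffer the result set client-side: rows are pulled
    # from the server as we iterate, so memory use stays flat.
    connection = pymysql.connect(
        host="localhost",      # hypothetical connection settings
        user="indexer",
        password="secret",
        database="app",
        cursorclass=pymysql.cursors.SSCursor,
    )
    es = Elasticsearch("http://localhost:9200")  # hypothetical URL
    try:
        with connection.cursor() as cursor:
            cursor.execute("SELECT id, name FROM objects")  # hypothetical table
            # The cursor is consumed as a stream and sent in chunks.
            for ok, item in streaming_bulk(
                es, rows_to_actions(cursor), chunk_size=1000, raise_on_error=False
            ):
                if not ok:
                    print("failed to index:", item)
    finally:
        connection.close()
```

Because both the cursor and `rows_to_actions` are lazy, only one chunk of rows is held in memory at a time, which is the point of the unbuffered connection.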
Pros:
Cons:
Logstash: Extract, Transform, Load tool
Pros:
Cons:
Filebeat: application log shipper
150k/sec
11k/sec
4k/sec
Approach | Indexing Rate | Index build time |
---|---|---|
Logstash | ~1.2k/sec | 21 hours |
Live Indexer (python) | ~4k/sec | 6.25 hours |
Bulk Indexer (unbuffered MySQL connection) | ~9k/sec | ~3 hours |
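
As a sanity check on the table, rate multiplied by build time gives the implied corpus size. The sketch below only redoes that arithmetic with the figures above; the total document count itself is not stated in the source.

```python
def implied_documents(rate_per_sec: float, hours: float) -> float:
    """Documents indexed at a steady rate over the given number of hours."""
    return rate_per_sec * hours * 3600


# (rate per second, build time in hours), taken from the table above
approaches = {
    "Logstash": (1200, 21),
    "Live Indexer (python)": (4000, 6.25),
    "Bulk Indexer": (9000, 3),
}

for name, (rate, hours) in approaches.items():
    millions = implied_documents(rate, hours) / 1e6
    print(f"{name}: ~{millions:.1f}M documents")
```

All three rows land at roughly 90 to 97 million documents, so the rates and build times are consistent with one index of about that size (the bulk figure is rounded up to 3 hours).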
Bulk Indexer:
~9k/sec => ~3 hrs
Live Indexer:
max: 4k/sec