CAPSiDE
We are a systems engineering company that boosts our clients’ businesses through the design, implementation & 24x7 engineering of highly efficient, scalable and reliable systems architectures.
Javi Moreno
@ciberado
javi.moreno@capside.com
Quick intro
Datalakes with S3
Format transformation and ETL with Glue
Ingestion with Kinesis Streams and Firehose
Preparing raw data with Lambda
Realtime processing with Kinesis Analytics
DynamoDB for processed data
Querying the datalake with Athena
Transforming massive amounts of data with EMR/Hive
Interactive queries with Redshift
Spectrum for integrating Redshift and S3
Data visualization with QuickSight
Realtime dashboards with Kinesis Client Library
Cleanup and conclusion
Unlimited and affordable object-based storage
99.999999999% data durability
Easy world-wide replication
Data encryption at rest and in transit
Versioning and lifecycle management
Extract, transform and load managed service
Data catalog for partition management
Optional schema discovery with Crawlers
Advanced job scheduling
Code generation
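Partition management is easier when objects land in S3 under Hive-style `key=value` folders, which Glue crawlers and Athena can map to partition columns. The following is a minimal sketch of such a key layout; the `raw/clicks` prefix, the file name and the year/month/day scheme are assumptions chosen for illustration, not something prescribed by the talk.

```python
from datetime import datetime, timezone


def partitioned_key(prefix: str, event_time: datetime, filename: str) -> str:
    """Build an S3 object key with Hive-style key=value partition folders."""
    t = event_time.astimezone(timezone.utc)  # normalise to UTC before partitioning
    return (
        f"{prefix}/year={t.year:04d}/month={t.month:02d}/day={t.day:02d}/"
        f"{filename}"
    )


key = partitioned_key(
    "raw/clicks",                                   # hypothetical prefix
    datetime(2018, 3, 5, 14, 30, tzinfo=timezone.utc),
    "part-0001.json.gz",                            # hypothetical file name
)
print(key)  # raw/clicks/year=2018/month=03/day=05/part-0001.json.gz
```

With this layout a query filtered on `year`, `month` or `day` only needs to read the matching folders.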
Managed near real-time queue
Capacity measured in shards (1,000 records/s or 1 MB/s of writes per shard)
Unlimited ingestion scalability
Up to seven days retention period
Easy processing
Long-term storage integration with Firehose
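The per-shard write limits above translate directly into a sizing rule: a stream needs enough shards to satisfy whichever limit is hit first. The function below is a rough sketch of that arithmetic; the name `shards_needed` and the example workloads are hypothetical, and it ignores read throughput and uneven partition keys.

```python
import math

RECORDS_PER_SHARD_PER_SEC = 1000  # write limit per shard (records)
MB_PER_SHARD_PER_SEC = 1.0        # write limit per shard (volume)


def shards_needed(records_per_sec: float, avg_record_kb: float) -> int:
    """Estimate the shard count from the stricter of the two write limits."""
    by_count = records_per_sec / RECORDS_PER_SHARD_PER_SEC
    by_volume = (records_per_sec * avg_record_kb / 1024) / MB_PER_SHARD_PER_SEC
    return max(1, math.ceil(max(by_count, by_volume)))


print(shards_needed(5000, 0.5))  # 5: limited by record count
print(shards_needed(2000, 2.0))  # 4: limited by volume (about 3.9 MB/s)
```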
Serverless star product
Integrated with most services
$0.20 per 1,000,000 invocations
Node.js, Python, Java and C#
C9 editor integrated with the console
Seamless massive autoscaling
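One common way to prepare raw data with Lambda is as a Firehose transformation function: Firehose passes a batch of base64-encoded records and expects each one back with the same `recordId`, a `result` status and the transformed `data`. The handler below is a minimal sketch of that contract; the `source` field it adds and its value are hypothetical enrichments, and real code would need its own validation rules.

```python
import base64
import json


def handler(event, context):
    """Firehose transformation: decode, enrich and re-encode each record."""
    output = []
    for record in event["records"]:
        try:
            payload = json.loads(base64.b64decode(record["data"]))
            payload["source"] = "firehose-demo"  # hypothetical enrichment
            # A trailing newline keeps one JSON document per line in S3.
            data = (json.dumps(payload) + "\n").encode("utf-8")
            output.append({
                "recordId": record["recordId"],
                "result": "Ok",
                "data": base64.b64encode(data).decode("ascii"),
            })
        except (ValueError, TypeError):
            # Invalid base64, invalid JSON or a non-object payload.
            output.append({
                "recordId": record["recordId"],
                "result": "ProcessingFailed",
                "data": record["data"],
            })
    return {"records": output}
```

Records marked `ProcessingFailed` are not silently lost; Firehose treats them as failed and handles them separately from the successfully transformed ones.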
Codeless near real-time processing
Powered by a SQL-like language
Input transformation available
Results can be processed with Lambda
Semi-structured database
Very fast and predictable performance
Integrated autoscaling
Global tables
Easy backup procedure
SDK-based support for JSON
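JSON support lives in the SDKs because the low-level DynamoDB API expects every attribute to be wrapped in a type descriptor such as `S`, `N`, `BOOL`, `L` or `M`. The sketch below illustrates that mapping for plain JSON-like values; `to_attribute_value` is a hypothetical helper written for this example, not the SDK's own serializer, and it skips sets and binary data.

```python
from decimal import Decimal


def to_attribute_value(value):
    """Wrap a JSON-like Python value in DynamoDB's typed attribute format."""
    if value is None:
        return {"NULL": True}
    if isinstance(value, bool):  # must come before numbers: bool subclasses int
        return {"BOOL": value}
    if isinstance(value, (int, float, Decimal)):
        return {"N": str(value)}  # numbers travel as strings
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, list):
        return {"L": [to_attribute_value(v) for v in value]}
    if isinstance(value, dict):
        return {"M": {k: to_attribute_value(v) for k, v in value.items()}}
    raise TypeError(f"unsupported type: {type(value).__name__}")


item = {"id": "u1", "visits": 3, "tags": ["a"], "active": True}
print(to_attribute_value(item)["M"])
```

The `M` payload of the top-level result has the shape the low-level API expects as an item.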
Query S3 without ETL
Serverless Presto-based infrastructure
Standard SQL variant supported
Pay per scanned amount of data
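Because the bill depends on the amount of data scanned, compressing, partitioning or converting files to a columnar format lowers the cost of every query. The snippet below sketches that arithmetic; the default price of 5 USD per TB is an assumption for illustration only and should be checked against the current pricing page, and the function ignores any per-query minimum.

```python
def scan_cost_usd(bytes_scanned: int, usd_per_tb: float = 5.0) -> float:
    """Estimate query cost from bytes scanned (price per TB is an assumption)."""
    return bytes_scanned / 1024**4 * usd_per_tb


full_scan = scan_cost_usd(1024**4)          # 1 TB of raw data
pruned_scan = scan_cost_usd(100 * 1024**3)  # 100 GB after partition pruning
print(full_scan, pruned_scan)
```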
Managed infrastructure for YARN
Supports Hadoop MapReduce, Presto, Spark...
Integrated with S3
Master node, core nodes and task nodes
Scale up without problems
Cost effective with spot instances support
Columnar storage
Interactive query over 1.5PB of data
PostgreSQL SQL dialect and connectivity
Included snapshot for backup
Change cluster size at any time
Extreme performance for aggregation
Massive managed cluster
Integrated with Redshift nodes
Join with tables stored directly on S3
Pay for amount of scanned data
SQL dialect compatible with Redshift
Software as a Service data explorer
Powered by the SPICE in-memory engine
Flexible visualization library
Easy-to-use and attractive user interface
Integrated with most data sources
Powerful Java wrapper over the Kinesis API
Automatic recovery from failure based on DynamoDB
Supports batch processing
Makes it easy to create scalable consumers
Designed to be run on EC2 instances
S3 buckets
EC2 instances
Glue databases
DynamoDB tables
Athena catalogs
Redshift clusters
QuickSight datasets
By CAPSiDE