Sensitive Data Detection Pipelines

Luke Hedger

AWS Community Builder

AWS Summit London - April 2022

Amazon Macie

Fully managed data security service that uses machine learning to discover sensitive data in AWS workloads

Architecture

Operational Excellence

- Monitor key pipeline metrics in CloudWatch

- Alert with visibility and actionability (ChatOps)

- Test in production with CloudWatch Synthetics
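As a rough illustration of the monitoring and alerting bullets, the sketch below builds the arguments for a CloudWatch alarm on delivery freshness. It assumes the pipeline delivers to S3 through a Kinesis Data Firehose stream and that alerts reach a chat channel via an SNS topic; the stream name, topic ARN, and thresholds are hypothetical placeholders, not values from the talk.

```python
# A minimal sketch, not the talk's actual configuration.
# Assumption: data reaches S3 through a Kinesis Data Firehose delivery stream.
# The stream name and SNS topic ARN used below are hypothetical placeholders.


def freshness_alarm(stream_name: str, topic_arn: str, max_age_seconds: int = 900) -> dict:
    """Build keyword arguments for CloudWatch's put_metric_alarm call.

    The alarm fires when the oldest undelivered record is older than
    max_age_seconds for three consecutive five-minute periods.
    """
    return {
        "AlarmName": f"{stream_name}-delivery-stale",
        "Namespace": "AWS/Firehose",
        "MetricName": "DeliveryToS3.DataFreshness",
        "Dimensions": [{"Name": "DeliveryStreamName", "Value": stream_name}],
        "Statistic": "Maximum",
        "Period": 300,
        "EvaluationPeriods": 3,
        "Threshold": float(max_age_seconds),
        "ComparisonOperator": "GreaterThanThreshold",
        # Treat silence as a problem so a stalled pipeline still alerts.
        "TreatMissingData": "breaching",
        # The SNS topic is where a ChatOps integration would subscribe.
        "AlarmActions": [topic_arn],
    }


alarm = freshness_alarm(
    "example-stream",
    "arn:aws:sns:eu-west-2:123456789012:example-alerts",
)

# With boto3 installed and credentials configured, this could be applied as:
#   boto3.client("cloudwatch").put_metric_alarm(**alarm)
```

Keeping the alarm definition as plain data makes it easy to review and reuse across streams; in a CDK project the same settings would more likely be declared as infrastructure code.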

Cost Optimisation

- Compress Kinesis data delivered to S3 (GZIP)

- Reduce S3 objects analysed by Macie (Lifecycle Policy)

- Archive infrequently accessed S3 objects (S3 Intelligent-Tiering)
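The cost levers above can be sketched locally. The first function shows the effect of GZIP on repetitive JSON records (in a real pipeline the delivery stream would typically apply the compression itself), and the second builds a lifecycle configuration in the shape S3 expects. The record layout, rule ID, and retention period are assumptions made for illustration.

```python
import gzip
import json

# A minimal sketch under stated assumptions: the record layout, rule ID and
# retention period are hypothetical and not taken from the talk.


def compress_records(records: list[dict]) -> bytes:
    """Serialise records as newline-delimited JSON and GZIP the result."""
    payload = "\n".join(json.dumps(r) for r in records).encode("utf-8")
    return gzip.compress(payload)


def lifecycle_rules(expire_after_days: int) -> dict:
    """Build an S3 lifecycle configuration for the whole bucket.

    Objects move to Intelligent-Tiering straight away and are deleted after
    expire_after_days, so later Macie jobs have fewer objects to analyse.
    """
    return {
        "Rules": [
            {
                "ID": "tier-then-expire",
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # empty prefix matches every object
                "Transitions": [
                    {"Days": 0, "StorageClass": "INTELLIGENT_TIERING"}
                ],
                "Expiration": {"Days": expire_after_days},
            }
        ]
    }


records = [{"id": i, "email": f"user{i}@example.com"} for i in range(1000)]
raw_size = len("\n".join(json.dumps(r) for r in records).encode("utf-8"))
compressed = compress_records(records)
print(f"raw: {raw_size} bytes, gzip: {len(compressed)} bytes")

# With boto3 installed, the rules could be applied to a bucket with:
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="example-pipeline-bucket",
#       LifecycleConfiguration=lifecycle_rules(30),
#   )
```

How long to retain objects before expiry depends on how often Macie jobs run; the expiry window should be longer than the job interval so every object is scanned at least once.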

Security

- Encrypt all data at rest with KMS and in transit with TLS

- Record activity via CloudTrail, CloudFormation

- Aggregate security findings in Security Hub
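One way to enforce the encryption bullet on the S3 side is a bucket policy that rejects plain-HTTP requests and uploads that do not request KMS encryption. The sketch below only builds the policy document; the bucket name is a hypothetical placeholder, and this is an illustration rather than the policy used in the talk's pipeline.

```python
import json

# A minimal sketch; the bucket name is a hypothetical placeholder.


def encryption_policy(bucket_name: str) -> dict:
    """Build a bucket policy that denies insecure transport and non-KMS uploads."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Reject any request that does not arrive over TLS.
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            },
            {
                # Reject uploads that do not explicitly ask for SSE-KMS.
                "Sid": "DenyNonKmsUploads",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:PutObject",
                "Resource": f"{bucket_arn}/*",
                "Condition": {
                    "StringNotEquals": {
                        "s3:x-amz-server-side-encryption": "aws:kms"
                    }
                },
            },
        ],
    }


policy = encryption_policy("example-pipeline-bucket")
print(json.dumps(policy, indent=2))
```

Note that the second statement also blocks writers that rely on the bucket's default encryption without sending the header, so it suits pipelines where every producer sets the header explicitly.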

Resources

- Deployable pipeline github.com/lukehedger/cdk-macie

- More from me twitter.com/level_out

- These slides 🤳👇

Thanks!

Serverless Sensitive Data Detection

By Luke


Using Amazon Macie to build serverless data pipelines for detecting sensitive data leaks
