Data Governance (حوكمة البيانات)
- Data governance (DG) is the overall management of the availability, usability, integrity, and security of data used in an enterprise
- Businesses benefit from data governance because it ensures data is consistent and trustworthy
- Data governance (DG) is a general term that covers multiple subjects such as:
- Data Catalog
- ETL procedures: extracting, transforming, and loading data
- Data Analysis and Processing
- Stream Processing for real time applications
- Batch Processing for offline applications
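The ETL and batch processing topics above can be illustrated with a minimal sketch. It assumes a small in-line CSV as the source and an in-memory SQLite database as the target; the `payments` table, the column names, and the sample rows are hypothetical and only serve the example.

```python
import csv
import io
import sqlite3

# Hypothetical raw input; in practice this would come from a file or an API.
RAW_CSV = """id,name,amount
1, Alice ,10.50
2,Bob,not-a-number
3,Carol,7.25
"""

def extract(text):
    """Extract: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: trim names, convert amounts, skip rows that fail conversion."""
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # reject rows with a non-numeric amount
        clean.append((int(row["id"]), row["name"].strip(), amount))
    return clean

def load(conn, records):
    """Load: write the cleaned records into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS payments (id INTEGER, name TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO payments VALUES (?, ?, ?)", records)
    conn.commit()

def run_batch(text):
    """Run one batch: the whole input is processed in a single pass."""
    conn = sqlite3.connect(":memory:")
    load(conn, transform(extract(text)))
    return conn

conn = run_batch(RAW_CSV)
loaded = conn.execute("SELECT id, name, amount FROM payments ORDER BY id").fetchall()
print(loaded)
```

A streaming variant would apply the same `transform` step to each record as it arrives instead of to a complete batch.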
- Defining the owners or custodians of the data assets in the enterprise. (data stewardship)
- Processes must then be defined to effectively cover how the data will be stored, archived, backed up and protected from theft or attacks.
- A set of standards and procedures must be developed that defines how the data is to be used by authorized personnel
- A set of controls and audit procedures must be put into place that ensures ongoing compliance with internal data policies and external government regulations, and that guarantees data is used in a consistent manner across multiple enterprise applications
- Teams of data stewards act as a communication channel between the IT department and the business side of an organization.
- A team typically includes:
- Database administrators
- Business analysts
- Data architects
- Business intelligence developers
- Extract, transform and load (ETL) designers
- Business data owners
- Data Quality
- Data scrubbing, also known as data cleansing
- Master data management (MDM)
- Metadata repositories, which hold data about data
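As a rough illustration of data scrubbing and of a metadata entry ("data about data"), the sketch below normalizes a few customer records and then describes the cleaned dataset. The field names, the `DatasetMetadata` class, and the steward address are assumptions made for the example, not part of any particular tool.

```python
from dataclasses import dataclass

def scrub(records):
    """Normalize fields; drop duplicates and rows without an e-mail address."""
    seen = set()
    cleaned = []
    for rec in records:
        email = (rec.get("email") or "").strip().lower()
        name = " ".join((rec.get("name") or "").split())  # collapse whitespace
        if not email or email in seen:
            continue
        seen.add(email)
        cleaned.append({"name": name, "email": email})
    return cleaned

@dataclass
class DatasetMetadata:
    """A hypothetical metadata repository entry for one dataset."""
    name: str
    steward: str
    columns: list
    row_count: int

def describe(name, steward, records):
    """Build the metadata entry from the records themselves."""
    columns = sorted({key for rec in records for key in rec})
    return DatasetMetadata(name, steward, columns, len(records))

raw = [
    {"name": "  Alice   Smith ", "email": "Alice@Example.com "},
    {"name": "alice smith", "email": "alice@example.com"},  # duplicate e-mail
    {"name": "Bob", "email": None},                          # missing e-mail
]
clean = scrub(raw)
meta = describe("customers", "data-steward@example.com", clean)
print(clean)
print(meta)
```

Recording the steward next to the schema ties the metadata entry back to the data stewardship role described earlier.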
- For governmental institutions
- Example: The European Union's (EU's) General Data Protection Regulation (GDPR) is a use case for data governance.
- For enterprises with huge amounts of data
Kylo
- https://kylo.io/
- Apache Atlas
- Kylo is an open source, enterprise-ready data lake management software platform for self-service data ingest and data preparation, with integrated metadata management, governance, security, and best practices.
- Ingest: Self-service data ingest with data cleansing, validation, and automatic profiling.
- Prepare: Wrangle data with visual SQL and an interactive transform through a simple user interface.
- Discover: Search and explore data and metadata, view lineage, and profile statistics.
- Monitor: Monitor health of feeds and services in the data lake. Track SLAs and troubleshoot performance.
- Design: Design batch or streaming pipeline templates in Apache NiFi and register with Kylo to enable user self-service.
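To make "automatic profiling" more concrete, here is a generic sketch of column-level profile statistics computed at ingest time. It is not Kylo's API or implementation; the `profile` function and the sample rows are hypothetical.

```python
def profile(rows):
    """Compute per-column row count, null count, and distinct-value count."""
    columns = {}
    for row in rows:
        for col, value in row.items():
            stats = columns.setdefault(
                col, {"count": 0, "nulls": 0, "values": set()}
            )
            stats["count"] += 1
            if value is None or value == "":
                stats["nulls"] += 1  # treat None and empty strings as missing
            else:
                stats["values"].add(value)
    return {
        col: {
            "count": s["count"],
            "nulls": s["nulls"],
            "distinct": len(s["values"]),
        }
        for col, s in columns.items()
    }

rows = [
    {"country": "DE", "age": 34},
    {"country": "DE", "age": None},
    {"country": "FR", "age": 41},
]
stats = profile(rows)
print(stats)
```

Statistics like these are the kind of information a "Discover" view could surface next to lineage and metadata.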
- Speed to market: Kylo can accelerate your big data efforts, helping your program stay ahead of the competition.
- Growth through innovation: Using Kylo, the prioritized use cases you select will help deliver business value and new opportunities across your company.
- Improved quality, security, and governance
- Cost reduction: Kylo can help your organization build custom-engineered data lakes at a fraction of the typical cost.
- Airline: 2 of the top 15 global brands
- Telecommunications: 2 of the top 10 European brands
- Banking: 2 of the top 5 global brands
- Insurance: 2 of the top 10 US brands
- Financial Services: 1 of the top 5 global brands
- Retail and Consumer Goods: 2 of the top 10 global brands
- US-based company
- Started developing the product years ago
- Now it's open source.
- Provides support and training for Kylo's customers