Data governance system
The content of the data governance system is viewed from two dimensions:
1) Difficulties and pain points in data governance: unclear data context, insufficient data aggregation capabilities, weak data management and control capabilities, imperfect data governance systems, and imperfect open forms.
2) The five cores of data governance, mirroring the pain points above: sorting out, aggregation, management and control, governance, and use.
The data governance system mainly includes data standards, metadata, data modeling, data integration, data lifecycle, data quality, data openness, data security, and data application.
Metadata
2.1. Problems solved by metadata
Metadata answers five questions: what data is available, what the data is, where it comes from, how it flows, and who has access to it.
Metadata is itself a kind of data, and metadata management is the basis for data asset management.
2.2. Metadata classification
Business metadata: Data that describes concepts, relationships, and rules related to the business domain in a data system. It includes business terms, information classification, indicators, statistical caliber, etc. (described from a business perspective).
Technical metadata: Data that describes concepts, relationships, and rules related to the technical domain of a data system. It includes the definition of objects and data structures in the data platform, the mapping of source data to destination data, and the description of data conversion and processing.
Management metadata: Data that describes concepts, relationships, and rules related to the management domain in a data system. It mainly includes information such as personnel roles, job responsibilities, and management processes.
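The three categories above can be sketched as one record per data object. This is only an illustrative sketch: the class names, field names, and sample values (such as `crm.orders.total`) are assumptions made for the example, not part of any particular metadata tool.

```python
from dataclasses import dataclass, field


@dataclass
class BusinessMetadata:
    term: str                 # business term
    classification: str       # information classification
    statistical_caliber: str  # how the indicator is counted


@dataclass
class TechnicalMetadata:
    table: str               # object defined in the data platform
    columns: dict[str, str]  # column name -> data type
    # destination column -> source field (source-to-destination mapping)
    source_mapping: dict[str, str] = field(default_factory=dict)


@dataclass
class ManagementMetadata:
    owner_role: str      # personnel role
    responsibility: str  # job responsibility


@dataclass
class MetadataRecord:
    business: BusinessMetadata
    technical: TechnicalMetadata
    management: ManagementMetadata


def upstream_sources(record: MetadataRecord) -> list[str]:
    """Answer 'where does the data come from' using the technical metadata."""
    return sorted(set(record.technical.source_mapping.values()))


# Hypothetical sample record.
record = MetadataRecord(
    business=BusinessMetadata(
        term="Order amount",
        classification="Finance",
        statistical_caliber="Sum of paid orders, tax included",
    ),
    technical=TechnicalMetadata(
        table="dw.fact_order",
        columns={"order_id": "bigint", "amount": "decimal(18,2)"},
        source_mapping={"order_id": "crm.orders.id", "amount": "crm.orders.total"},
    ),
    management=ManagementMetadata(
        owner_role="Data steward",
        responsibility="Approve changes to the order amount definition",
    ),
)
```

Calling `upstream_sources(record)` lists the source fields that feed the table, which is the kind of lineage question metadata is meant to answer.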
2.3. Metadata model maturity
1. Stage 1: Metadata is managed manually, i.e., additional steps are required outside the data governance process.
2. Stage 2: Metadata is generated automatically during the data exploration stage.
3. Stage 3: Metadata describing data flow is constructed automatically.
2.4. Metadata construction objectives and management methods
2.5. Metadata management
Metadata Management Methods:
Metadata Management Capabilities:
Data standards
Common data standards include basic data standards and indicator data standards.
Main components: business definition + management information + technical attributes.
The data standard includes the following contents: Subject & Category + Label Attribute + Standard.
3.2.2. Data standard type (example).
Standards vary from industry to industry, and here are just some examples.
Examples include gender, ID card number, amount, mobile phone number, industry, and classification and grading level.
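As a rough illustration of how one such standard could combine a business definition, management information, and technical attributes, the sketch below models a standard as a small record with a format check. The field names, the owner values, the 11-digit mobile number pattern, and the gender code set are all assumptions chosen for the example; real standards differ by industry and region.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class DataStandard:
    name: str
    business_definition: str  # business definition
    owner: str                # management information (hypothetical field)
    data_type: str            # technical attribute
    pattern: str              # technical attribute: regex the value must match

    def conforms(self, value: str) -> bool:
        """Return True when the whole value matches the standard's pattern."""
        return re.fullmatch(self.pattern, value) is not None


# Assumed format: 11 digits starting with 1.
MOBILE = DataStandard(
    name="mobile_phone_number",
    business_definition="Mobile number used to contact the customer",
    owner="Customer data steward",
    data_type="string",
    pattern=r"1\d{10}",
)

# Assumed code set: 0 = unknown, 1 = male, 2 = female, 9 = not stated.
GENDER = DataStandard(
    name="gender",
    business_definition="Gender code of a natural person",
    owner="Customer data steward",
    data_type="string",
    pattern=r"[0129]",
)
```

A system that adopts the standard would call `MOBILE.conforms(value)` at data entry instead of keeping its own copy of the rule.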
3.3. Ideas behind the data standard management system
Data standards come from the business and serve the business.
Construction is carried out according to existing standards.
Basic data standards: a business-oriented perspective.
Indicator data standards: from a management perspective.
The development of data standards is a process of reconciliation that is business-oriented and management-led, takes external requirements as its basis, and starts from the current situation of the enterprise.
3.4. Data standard architecture system
Unifying standards and architecture specifications, together with indicators, terminology, models, and information items, solves problems such as unclear interpretation of data caliber and inconsistent understanding between business and data teams, and unifies data at the architecture level.
However, not all basic data needs a standard: the data items included in the standards must meet the admission principles of shareability, importance, and feasibility.
3.5. Principles for the construction of management data standards
Definition: The business meaning of each analytical data standard is consistent with the business scenarios it applies to.
Caliber: Business rules such as the business value range, calculation method, and coding rules of each analytical data standard are kept consistent.
Name: The standard Chinese and English names of analytical data follow a unified naming rule, and information items that represent the same business meaning keep the same name.
Reference: The external standards (including international, national, and industry standards) and the internal business systems and business specifications referred to when defining each analytical data standard item should be consistent.
Source: Each analytical data standard should have an authoritative system; other systems should take this information directly from the authoritative system to stay consistent.
The following is an example of an enterprise data standard system framework, which is divided into basic data standards and management data standards.
3.6. Data standard life cycle management
Data modeling
4.1. Concept
Data modeling here means an enterprise-wide approach. Starting from the overall situation, it involves standardizing the data model, building a unified data model management and control system, enriching and improving the attribute information of data entities, sorting out the logical relationships between data entities, and finally forming data models for different subject areas.
4.2. Data model classification
4.3. Data model life cycle
4.4. Case
Data integration
5.1. Concept
Data integration mainly refers to re-centralizing and uniformly managing business data that is scattered across the enterprise's information systems. It is a gradual process: as long as new and different data keeps being generated, data integration steps and programs continue to be carried out. Data integration brings data of different formats and characteristics together, logically or physically, to provide basic support for enterprise data sharing.
5.2. The overall architecture of data integration
Data lifecycle
6.1. Phased division
It is divided into two major stages: data governance planning stage + data lifecycle management stage.
Data governance planning phase
Business planning definition stage: business planning and business standard design.
Application design and implementation stage: data model design, application standard design, application design implementation, and data entry.
Data lifecycle management phase
Data creation: Use the data model to ensure data integrity, implement data standards to ensure data accuracy, add data quality checks so that data is created accurately, and ensure that data is generated in an appropriate system.
Data use: Use metadata to monitor data use, use data standards to ensure data accuracy, use data quality checks to keep processing accurate, ensure that data is used in an appropriate system, and control data derivation.
Data archiving: Use evaluation methods to determine the timing of archiving, and archive data by data type.
Data destruction: Use evaluation methods to determine the timing of destruction, and destroy data by data type.
Requirements:
Meet the requirements of policies and management systems related to historical data query.
Meet the needs of business operations and management analytics.
Meet audit management requirements.
Reduce data redundancy and improve data consistency.
Reduce infrastructure investment in storage, hardware, operation and maintenance, etc.
Improve application performance and responsiveness.
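The archiving and destruction stages described above decide timing per data type. A minimal sketch of such a rule follows; the data types, the retention periods in `RETENTION_DAYS`, and the function name are hypothetical, and in practice the periods would come from the policy, business, and audit requirements listed above.

```python
from datetime import date
from enum import Enum


class Stage(Enum):
    IN_USE = "in_use"
    ARCHIVE = "archive"
    DESTROY = "destroy"


# Hypothetical policy: data type -> (days before archiving, days before destruction)
RETENTION_DAYS = {
    "transaction": (365, 365 * 7),
    "log": (30, 180),
}


def lifecycle_stage(data_type: str, created: date, today: date) -> Stage:
    """Return the stage a record should be in, based on its type and age."""
    archive_after, destroy_after = RETENTION_DAYS[data_type]
    age_days = (today - created).days
    if age_days >= destroy_after:
        return Stage.DESTROY
    if age_days >= archive_after:
        return Stage.ARCHIVE
    return Stage.IN_USE
```

Keeping the periods in one table per data type makes the policy easy to review and audit without touching the decision logic.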
6.2. Management requirements and means
6.3. Management norms and management methods
Data quality
7.1. Data quality management objectives
Develop a management approach that meets data quality requirements based on the needs of data consumers.
Define standards and norms for data quality control as part of the entire data lifecycle.
Define and implement processes for measuring, monitoring, and reporting data quality levels.
Identify and advocate for opportunities to improve data quality, in line with data consumer requirements, through changes to processes and systems and through activities that measurably improve data quality.
7.2. Life cycle
Planning phase: The data quality team evaluates the scope, impact, and priority of known issues and evaluates alternatives for addressing them.
Execution phase: The data quality team leads the work of addressing the root causes of issues (technical issues and process issues) and plans for continuous monitoring of the data.
Inspection phase: The quality of the data is actively monitored against requirements.
Processing phase: Emerging data quality issues are addressed and resolved.
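The inspection phase can be illustrated with a very small monitoring check. The sketch below measures only completeness (the share of non-empty values in a column) and reports the columns that fall below a required level; the function names, the row format, and the thresholds are assumptions for the example, and a real check would cover more quality dimensions.

```python
def completeness(rows: list[dict], column: str) -> float:
    """Share of rows whose value in `column` is neither missing nor empty."""
    if not rows:
        return 1.0
    filled = sum(1 for row in rows if row.get(column) not in (None, ""))
    return filled / len(rows)


def check_quality(rows: list[dict], thresholds: dict[str, float]) -> dict[str, float]:
    """Return the columns whose completeness is below the required minimum."""
    issues = {}
    for column, minimum in thresholds.items():
        score = completeness(rows, column)
        if score < minimum:
            issues[column] = score
    return issues


# Hypothetical sample data and requirements.
rows = [
    {"id": 1, "phone": "13800138000"},
    {"id": 2, "phone": ""},
    {"id": 3, "phone": None},
    {"id": 4, "phone": "13900139000"},
]
issues = check_quality(rows, {"id": 1.0, "phone": 0.9})
```

Any column returned in `issues` would be handed to the processing phase for resolution.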
7.3. Data quality dimension
7.4. Common tools for data quality
Data development
The whole process of data development is designed and managed around the data value channel (Data Assets -> Data Services -> Business Applications) to promote the release of data value.
8.1. Data assets
The application and implementation of data assets can open up the basic data chain, achieve connectivity and collaboration, and enhance the value of data.
Data asset lifecycle: registration, change, monitoring, and decommissioning.
8.2. Data services
Data Services Technical Architecture:
Data Security
The data security system includes: data security technology system + security management system + security operation system.
ETL
10.1. Meaning
10.2. ETL mode
Trigger Mode:
Incremental Field Mode:
Full Synchronization Mode:
Log comparison mode:
Comparison of different modes:
10.3. Offline and real-time
Real-time data:
Offline data:
Usage Scenarios: