On the Principles of Data Management

15 Sep 2012

To be clear, this is not an article about technology. Rather, it is about the organizational, cultural, and strategic factors that must be considered to improve the management of data, or information, within organizations.

There are two types of data: structured data and unstructured data.  Structured data is data that is kept and managed through database management systems.  Unstructured data is information that either does not have a predefined data model or does not fit well into relational tables. Unstructured information is typically text-heavy, but may contain data such as dates, numbers, and facts as well.

Effective information management is not simple, but this article draws together a number of ‘critical success factors’ for managing structured data used for business transaction processing at or on behalf of an organization. The list is not exhaustive, but it offers a series of principles that can guide the planning and implementation of data management activities. While there is some overlap with business intelligence best practices, these principles do not universally apply to data kept for analytical processing in data warehouses.

Data Assets

Data will be safeguarded, protected, and managed as a valuable asset.  The organization will maximize the value of data to the enterprise while protecting the confidentiality of valuable personal information.

Data Ownership

All data is owned by the business.   For management purposes, a Data Owner will be identified and will have accountability for data within a defined business function.  A Data Steward will also be identified within each business function and will be able to provide hands-on management expertise and guidance for specific data subject areas.

Data Control

The enterprise will maintain possession and/or control of data.

Requirements Driven Data

Data will be driven by business requirements and aligned with business processes.

Data Utility

Data will be managed to provide maximum utility to the business.  Data will be stored and presented in a manner that is useful for the business.

Data Timeliness

The timeliness (also called currency or latency) of data must be explicitly documented in data design documents and be acceptable to the business.  Data timeliness measures how frequently data elements within a database are expected to be refreshed.  It defines the “lifetime” of a data value before it expires or needs to be updated.
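
As a rough illustration, a documented lifetime can be recorded next to the data element and checked against the time of the last refresh. This is only a sketch: the `TimelinessRule` structure, the `customer_address` element, and the 24-hour lifetime are assumptions made for the example, not values taken from any particular system.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class TimelinessRule:
    """Documented lifetime of a data element (hypothetical structure)."""
    element: str
    max_age: timedelta  # how long a value stays current after a refresh


def is_stale(rule: TimelinessRule, last_refreshed: datetime, now: datetime) -> bool:
    """Return True when the value has outlived its documented lifetime."""
    return now - last_refreshed > rule.max_age


# Assumed rule: the business accepts a customer address up to 24 hours old.
address_rule = TimelinessRule(element="customer_address", max_age=timedelta(hours=24))
refreshed = datetime(2012, 9, 14, 8, 0)

# 12 hours after the refresh the value is still within its lifetime.
fresh_check = is_stale(address_rule, refreshed, now=datetime(2012, 9, 14, 20, 0))
# 25 hours after the refresh the value has expired.
stale_check = is_stale(address_rule, refreshed, now=datetime(2012, 9, 15, 9, 0))
```

The point of the sketch is that the lifetime is written down once, as agreed with the business, rather than being left implicit in each consuming application.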

Data Sharing

Data is shared across the enterprise and is made available to all projects and personnel that need it for a legitimate business purpose.  Data owners, data stewards, and data architects will work to incorporate the needs of all business stakeholders when designing databases.

Semantic Consistency

Each data element should have a consistent meaning across the enterprise.  A consistent vocabulary should be used across the company and should be reflected in an enterprise data dictionary.   All data elements should have standard names and meaningful business definitions.  All data elements should follow consistent and meaningful naming conventions.  Data elements should have consistent metadata across the enterprise which should include datatypes, lengths, and allowable values or permissible ranges.  Every data element should have a definition that is published and available to users, developers, and management.
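
One way to picture a data dictionary entry is as a small record holding the name, the business definition, and the metadata listed above, which can then be used to check values. The sketch below is illustrative only: the `DataElement` class, the `order_status` element, and its allowable values are assumptions invented for the example.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class DataElement:
    """One entry in a hypothetical enterprise data dictionary."""
    name: str
    definition: str
    datatype: type
    max_length: Optional[int] = None
    allowed_values: Optional[FrozenSet[str]] = None


def violations(element: DataElement, value: object) -> List[str]:
    """List the ways a value departs from the element's published metadata."""
    if not isinstance(value, element.datatype):
        return [f"{element.name}: expected {element.datatype.__name__}"]
    problems = []
    if element.max_length is not None and len(str(value)) > element.max_length:
        problems.append(f"{element.name}: longer than {element.max_length} characters")
    if element.allowed_values is not None and value not in element.allowed_values:
        problems.append(f"{element.name}: {value!r} is not an allowable value")
    return problems


# Assumed entry: name, definition, datatype, length, and allowable values.
ORDER_STATUS = DataElement(
    name="order_status",
    definition="Current fulfilment state of a customer order.",
    datatype=str,
    max_length=10,
    allowed_values=frozenset({"OPEN", "SHIPPED", "CLOSED"}),
)

ok = violations(ORDER_STATUS, "SHIPPED")        # conforms to the entry
bad = violations(ORDER_STATUS, "shipped late")  # too long and not an allowable value
wrong_type = violations(ORDER_STATUS, 3)        # not a string at all
```

Because every system checks against the same entry, the element keeps one meaning and one set of permissible values wherever it appears.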

Data Authenticity

Each data element must be used in a manner that is consistent with its labeling.

Authoritative Data Sources

There is an authoritative source for every critical data element.  An official data source should be declared and used for important data elements.  Data should be created, updated, and deleted in the authoritative data source.   Data should be mapped from authoritative data sources to other systems.

Data Replication

Data replication is minimized and controlled.  When data is duplicated, there should be a valid business or technical reason and the methods of synchronizing the data with the authoritative source should be explicitly defined.
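
A lightweight way to keep replication controlled is to register each copy together with its justification and its synchronization method, and to flag entries where either is missing. The following is a sketch under assumptions: the `ReplicaRecord` fields and the system names are hypothetical and stand in for whatever register an organization actually keeps.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ReplicaRecord:
    """Register entry for one duplicated data element (hypothetical structure)."""
    element: str
    authoritative_source: str
    replica_system: str
    reason: str       # business or technical justification for the copy
    sync_method: str  # how the copy is kept in step with the source


def undocumented_fields(record: ReplicaRecord) -> List[str]:
    """Return the names of the required explanations that were left blank."""
    required = ("reason", "sync_method")
    return [name for name in required if not getattr(record, name).strip()]


# Assumed example: a copy with both a reason and a defined sync method.
documented = ReplicaRecord(
    element="customer_address",
    authoritative_source="crm",
    replica_system="billing",
    reason="Invoices must print while the CRM is offline.",
    sync_method="Nightly batch load from the CRM.",
)
# Assumed example: a copy nobody has explained how to synchronize.
undocumented = ReplicaRecord(
    element="customer_address",
    authoritative_source="crm",
    replica_system="marketing",
    reason="Campaign targeting.",
    sync_method="",
)

documented_gaps = undocumented_fields(documented)
undocumented_gaps = undocumented_fields(undocumented)
```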

Data Standardization

Data standards will be created for enterprise data elements:  the most important, commonly shared, and commonly used data.  Each data standard will identify a common definition, standard metadata, an authoritative data source, and data governance information, and will provide a guide to the proper use of an enterprise data element.

Data Integrity

Databases should be designed and managed to ensure that any data entered is accurate, valid, and consistent.  Data architecture rules should be established to ensure entity integrity, referential integrity, and domain integrity.
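
The three kinds of integrity map onto familiar relational constraints: a primary key for entity integrity, a foreign key for referential integrity, and a check constraint for domain integrity. The sketch below uses Python's standard `sqlite3` module with an in-memory database; the `customer` and `customer_order` tables and the status values are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,          -- entity integrity
    name        TEXT NOT NULL
);
CREATE TABLE customer_order (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL
                REFERENCES customer(customer_id),  -- referential integrity
    status      TEXT NOT NULL
                CHECK (status IN ('OPEN', 'SHIPPED', 'CLOSED'))  -- domain integrity
);
""")


def rejected(sql: str, params: tuple) -> bool:
    """Run one statement; return True if the database refused it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
    except sqlite3.IntegrityError:
        return True
    return False


order_sql = "INSERT INTO customer_order VALUES (?, ?, ?)"
first_customer = rejected("INSERT INTO customer VALUES (?, ?)", (1, "Acme"))
duplicate_key = rejected("INSERT INTO customer VALUES (?, ?)", (1, "Other"))
orphan_order = rejected(order_sql, (10, 99, "OPEN"))   # customer 99 does not exist
bad_status = rejected(order_sql, (11, 1, "LOST"))      # status outside the domain
valid_order = rejected(order_sql, (12, 1, "OPEN"))
```

Declaring the rules in the database means they hold regardless of which application writes the data.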

Data Quality

Data should have known quality rules.  The business should set expectations for the quality of data in a system so that business requirements can be met.

Standards, Regulations, and Laws

Projects using or creating data must comply with enterprise data standards and all applicable security, information protection, data retention, and legal requirements.

Data Security

Enterprise data will have integrity, will be trustworthy, and will be safeguarded from unauthorized access, whether malicious, fraudulent, or erroneous.  Data cannot be modified without authorization.
