What it is, why you need it, and best practices. This guide provides definitions and practical advice to help you understand and establish modern metadata management.
Metadata management refers to the organization and control of data which describes technical, business, or operational aspects of other data. It involves a range of processes, policies, and technologies which describe and give meaning to your data via searchable key attributes such as order number or customer ID. Ultimately, managed metadata makes it easier for all types of users to find, understand, and access the specific information assets they need.
Metadata describes technical, business, or operational aspects of other data. It gives you context so you can find the information you need more easily and use your data more effectively.
What is “other data”? By other data, we mean a collection of facts that represent measurements or descriptions of a situation. These facts can take the form of numbers, symbols, or words and are typically stored digitally. This source data, or raw data, consists of the facts in the original form and structure in which they were collected. Before analysis can happen, this raw data needs to be transformed into clean, business-ready information through data integration.
Now let’s get back to defining metadata. To help you to find, understand, and access the information you need, this “other data” needs to have metadata associated with it. The metadata identifies other data and gives it context by providing core information about it, such as author, creation time, file size, file type, topic, etc.
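To make the idea of "core information" concrete, here is a minimal Python sketch that reads a few technical metadata fields for a file on disk. The function name `describe_file` and the choice of fields are illustrative assumptions, not part of any particular metadata tool. It reports the last-modified time rather than the creation time, because creation time is not available consistently across operating systems.

```python
import mimetypes
import os
import tempfile
from datetime import datetime, timezone


def describe_file(path):
    """Return basic technical metadata for a file (hypothetical helper)."""
    info = os.stat(path)
    # Guess the file type from the extension; the result can vary by platform.
    file_type, _ = mimetypes.guess_type(path)
    modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
    return {
        "name": os.path.basename(path),
        "size_bytes": info.st_size,
        "file_type": file_type or "unknown",
        "modified_utc": modified.isoformat(),
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "orders.csv")
        # Write in binary mode so the byte count is the same on every platform.
        with open(path, "wb") as f:
            f.write(b"order_number,customer_id\n1001,C-17\n")
        print(describe_file(path))
```

Note that none of these fields say anything about the orders inside the file. They describe the file itself, which is exactly the distinction between metadata and the "other data" it refers to.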
There are three main types of metadata: technical, business, and operational.
There are two ways to create metadata:
Enterprise metadata management helps you find the data you need and trust that it is accurate. Your company likely has a large volume of complex data coming from many sources, and you need to be able to find, understand, and trust the right information to gain actionable insights that improve your business. Here are the key benefits of robust metadata management as part of your data governance framework:
Metadata management is the only element of your overall data management and governance framework which focuses on metadata rather than the actual data itself. Metadata management tools allow you to automatically separate and load all types of metadata generated from a variety of systems such as your applications, data integration tools, data lake architecture, data warehouse, and data marts.
Enterprise metadata management aids in every phase of your data lifecycle:
Active Metadata Management employs artificial intelligence (AI) and/or machine learning (ML) to automatically profile, tag, classify and give lineage information to metadata, make metadata recommendations, and identify incorrect or missing data. This modern approach is driven in part by the rise of data from edge devices and IoT and also by the greater accessibility of AI and AutoML.
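The profiling and tagging steps can be illustrated with a small Python sketch. It is deliberately not AI or ML: it swaps in hand-written regular-expression rules to show what kind of output an automated profiler produces (tags, distinct counts, missing-value counts). The rule set `TAG_RULES` and the function names are hypothetical, and the sketch assumes every row has the same keys.

```python
import re

# Hypothetical tagging rules: a tag applies when every non-empty value matches.
TAG_RULES = {
    "email": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
    "integer": re.compile(r"-?\d+"),
    "iso_date": re.compile(r"\d{4}-\d{2}-\d{2}"),
}


def profile_column(name, values):
    """Profile one column: count missing and distinct values, assign tags."""
    present = [v for v in values if v not in (None, "")]
    tags = [
        tag
        for tag, pattern in TAG_RULES.items()
        if present and all(pattern.fullmatch(str(v)) for v in present)
    ]
    return {
        "column": name,
        "row_count": len(values),
        "missing_count": len(values) - len(present),
        "distinct_count": len(set(present)),
        "tags": tags,
    }


def profile_table(rows):
    """Profile every column of a list of dict rows (assumes uniform keys)."""
    columns = {}
    for row in rows:
        for key, value in row.items():
            columns.setdefault(key, []).append(value)
    return [profile_column(name, values) for name, values in columns.items()]


if __name__ == "__main__":
    rows = [
        {"customer_id": "101", "email": "ana@example.com", "signup_date": "2024-01-15"},
        {"customer_id": "102", "email": "", "signup_date": "2024-02-03"},
        {"customer_id": "103", "email": "li@example.com", "signup_date": "2024-02-03"},
    ]
    for profile in profile_table(rows):
        print(profile)
```

An active metadata tool would typically learn or infer such classifications rather than rely on fixed patterns, but the resulting metadata has a similar shape: per-column tags plus quality signals such as the number of missing values.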
At a high level, the primary use cases for metadata management are data governance and data analysis. Managed metadata ensures that all groups in your organization comply with your data governance framework and it helps them find answers to their questions. Let’s look at three key constituents and how they might use metadata management:
Your solution to manage metadata will depend on the complexity and scale of your data sources and the variety of users and use cases you need to support. Still, below are five best practices for establishing and maintaining robust metadata management.
1. Define Your Metadata Strategy. You should start by identifying your short- and long-term use cases and the types of information you want to manage metadata for. Make sure these align with your overall business objectives and digital transformation program.
2. Define Scope and Roles. Be clear about how metadata will support data analysis, data quality, data governance, and compliance needs, both now and in the future. Codify the requirements for each area and the roles of metadata managers, creators, and consumers. You’ll want to gather metadata from a wide variety of data sources, both on-premises and multi-cloud.
3. Define Policy to Ensure Quality Metadata. Your policy should ensure that metadata is consistently captured, stored, and governed at the level of terms, attributes, and elements. The terms level refers to the standard business definitions and language for your organization. The attributes level refers to data models, data dictionaries, or system documentation. The elements level refers to database reports or tables, which could come from spreadsheets, database catalogs, or data models. Be sure to include the source of your metadata in your data lineage. Adopting an established metadata standard such as Dublin Core or ISO/IEC 11179 will help you achieve consistent metadata interpretation across your ecosystem of vendors and partners.
4. Define Requirements for Your Tool. Once you’ve defined your strategy and scope, you’re ready to define the primary capabilities you need from your metadata management tool. For example, scalable storage and search functionality may be your top criteria.
You could also decide it’s important to take advantage of AI and ML. As stated above, active metadata management automatically profiles, tags, classifies, and gives lineage information to metadata. It also makes metadata recommendations and identifies incorrect or missing data.
5. Define Your Long-Term Program. Once you’ve implemented your tool, be sure to get buy-in from all stakeholders across your organization to make managing metadata an ongoing program and process. Then maintain regular communication with these stakeholders about your program goals and issues. You should also identify metadata stewards who will implement your policies and conduct periodic audits and reviews to identify areas for improvement.
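As one way to picture the policy in practice 3, the Python sketch below models the three levels (terms, attributes, elements) and runs a simple audit of the kind a metadata steward might schedule. The class names, fields, and the idea of linking each element to an attribute and each attribute to a term are assumptions made for illustration. They are not a prescribed schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    """Business-level definition, e.g. what 'Customer' means."""
    name: str
    definition: str


@dataclass(frozen=True)
class Attribute:
    """Data-dictionary entry tied to a business term (assumed link)."""
    name: str
    term: str
    data_type: str


@dataclass(frozen=True)
class Element:
    """Physical table or report column, with its source for lineage."""
    name: str
    attribute: str
    source: str


def check_policy(terms, attributes, elements):
    """Return a list of policy violations; an empty list means compliant."""
    problems = []
    term_names = {t.name for t in terms}
    attribute_names = {a.name for a in attributes}
    for t in terms:
        if not t.definition.strip():
            problems.append(f"term '{t.name}' has no definition")
    for a in attributes:
        if a.term not in term_names:
            problems.append(f"attribute '{a.name}' refers to undefined term '{a.term}'")
    for e in elements:
        if e.attribute not in attribute_names:
            problems.append(f"element '{e.name}' refers to undefined attribute '{e.attribute}'")
        if not e.source.strip():
            problems.append(f"element '{e.name}' has no recorded source")
    return problems


if __name__ == "__main__":
    terms = [Term("Customer", "A person or company that has placed an order.")]
    attributes = [Attribute("customer_id", "Customer", "string")]
    elements = [
        Element("crm.customers.id", "customer_id", "CRM export"),
        Element("sales.orders.cust", "customer_id", ""),
    ]
    for problem in check_policy(terms, attributes, elements):
        print(problem)
```

Returning a list of violations, rather than raising on the first one, lets a periodic review report every gap at once.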