Many years ago, in my earliest IT jobs in Omaha, Nebraska, I realized the field of data was going to continue to evolve and, as such, there would always be a need for people who worked with and understood data. Regardless of which industry someone worked in—financial, medical, governmental, transportation, or retail—someone would have to work with and maintain business data, or the business would fail. This realization led me into the world of databases and SQL Server.
Since then, I’ve had the opportunity to watch the landscape of data positions evolve into various functions, such as data scientist, data architect, data analyst, business intelligence analyst, machine learning engineer—and the list continues. While the evolution of these data-related positions will continue, one aspect remains constant: each position plays some role in data stewardship, and we are all data stewards in one way or another.
Let’s talk about what data stewardship is and what it means to be a steward of the data.
What Is Data Stewardship?
One of the challenges of managing data in organizations is that system administrators and database administrators (DBAs) typically lack a broader understanding of the data itself, and this understanding is important to data governance.
Data governance is the process of ensuring critical business data, such as customer lists and invoices, is defined consistently. The data landscape contains the business data generated by applications or external sources.
Merriam-Webster defines stewardship as “the conducting, supervising, or managing of something.”
Combining these definitions gives a simple picture of data stewardship: it is the supervision and management of data in support of data governance, and the data steward is the person who carries out this role.
What Is a Data Steward?
The data steward role typically lives within the business team and works with the technology team to ensure the organization’s data is properly managed and secured.
As mentioned, the data steward role falls under the overarching umbrella of data governance. Some organizations have a dedicated, broad data governance group, whereas others may delegate the work to individual business units for better business domain knowledge. Data governance involves several key decision-makers. Participants usually include business owners, technologists, and data representatives (including stewards).
The data governance group’s primary responsibility is to define and establish data management policies at an organizational level. These policies help protect data from abuse, damage, theft, or other adverse events.
One often overlooked use for data governance is data classification, which labels data, most commonly by sensitivity or by the regulations applying to it. In the event of a data breach or external attack, classifications let you more easily identify an appropriate response based on what type of data was exposed. Data stewards work with teams to ensure these rules are followed and to implement processes to manage data.
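To make the classification idea concrete, here is a minimal Python sketch of how labels by sensitivity might drive a breach response. Everything in it is an assumption for illustration: the four sensitivity levels, the column names in the catalog, and the response text are hypothetical and would come from your own data governance policies.

```python
from enum import IntEnum


class Sensitivity(IntEnum):
    """Hypothetical sensitivity levels; higher means more sensitive."""
    PUBLIC = 1
    INTERNAL = 2
    CONFIDENTIAL = 3
    RESTRICTED = 4


# Hypothetical catalog mapping column names to their classification.
CATALOG = {
    "product_name": Sensitivity.PUBLIC,
    "invoice_total": Sensitivity.INTERNAL,
    "customer_email": Sensitivity.CONFIDENTIAL,
    "card_number": Sensitivity.RESTRICTED,
}

# Hypothetical response for each level; real ones come from policy.
RESPONSES = {
    Sensitivity.PUBLIC: "Log the incident.",
    Sensitivity.INTERNAL: "Notify the data owner.",
    Sensitivity.CONFIDENTIAL: "Notify the data owner and the security team.",
    Sensitivity.RESTRICTED: "Escalate to security, legal, and compliance.",
}


def breach_response(exposed_columns):
    """Return the highest sensitivity exposed and the matching response.

    Columns missing from the catalog are treated as RESTRICTED, so
    unclassified data is never handled too lightly.
    """
    levels = [CATALOG.get(col, Sensitivity.RESTRICTED) for col in exposed_columns]
    worst = max(levels, default=Sensitivity.PUBLIC)
    return worst, RESPONSES[worst]


level, action = breach_response(["product_name", "customer_email"])
print(level.name, "-", action)
```

Treating unclassified columns as the most sensitive level is one possible design choice. It errs on the side of caution and also surfaces gaps in the catalog a data steward may want to close.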
Why Is Data Stewardship Important?
Simply put, data stewardship is trust. Performing proper data stewardship instills trust across all levels of an organization. This trust allows organizations to use data to make sound business decisions, both operational and strategic. Data must be reliable in either case. Ensuring proper data stewardship gives consumers confidence the data is accurate and reliable.
If the consumers cannot trust the data, what do you have left? Nothing but a large mess of useless data, which no organization wants after such a considerable investment of time, money, and effort in generating the critical data.
Without data stewardship, you may have zero return on your investment.
What Does a Data Steward Do?
The role of a data steward can be far-reaching or narrow in scope. The organization needs to define what the role looks like and which vital elements of it to prioritize.
The role of a data steward may or may not be a technical one, but it is a critical piece of the business process. Although the job requires a wide breadth of knowledge, it doesn’t necessarily require hands-on technical knowledge. Instead, it requires knowing that organizations must take specific actions to ensure critical aspects of the data, such as safety and quality, are handled appropriately. Often, the role of a data steward is a business role instead of a technical role.
Whether the data steward occupies a technical role is up to the organization. I often see this role aligned with a data analytics team whose primary focus is to provide comprehensive analytics and reporting of ongoing business activities.
It’s also important to distinguish the data steward from the data owner. Usually, they’re not the same person. The data owner is responsible for overall decisions on each respective source system. Data owners control appropriate uses of data within their jurisdiction. If there’s a defined data owner, the data steward serves as an agent of the data owner by providing expert knowledge of the data, appropriate data tools, or data processes, and by enforcing data governance regulations and policies.
Let’s look closer at some of these job functions and initiatives.
Maintain Data Quality
Data quality is a significant factor in any organization. If data quality is poor, the data cannot be trusted and thus should be discarded. A data steward works closely with any system of record to confirm proper controls are in place and maintained so the data produced is of high quality. These data quality requirements might or might not originate from the data governance group.
Ensure Data Safety
In the organizations I’ve worked in, the mantra is: security is the job of everyone. Data safety is no different. Many organizations take great strides to ensure the safety of their proprietary data using controls, such as data encryption, access control lists, security audits, and others. Data stewards should be in close harmony with the groups physically handling the technical security aspects.
Perform Data Validation
All decisions in life are based on information or data from somewhere. Whether the source is an application deployed by your organization or from your own experiences doesn’t matter. What does matter is how valid the data is when making the decision. You could make erroneous decisions if the data is faulty or not trustworthy, resulting in potentially catastrophic outcomes. To help prevent these situations, data stewards strive to ensure the data produced by source systems is as valid as possible, so downstream decisions are more accurate.
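As a rough illustration of what a validation check on source data could look like, here is a minimal Python sketch. The record layout (`customer_id`, `email`, `invoice_total`) and the three rules are hypothetical; in practice, the rules would come from the standards your organization defines.

```python
def validate_invoice(record):
    """Return a list of problems found in one invoice record.

    An empty list means the record passed every check. The field
    names and rules here are assumptions made for illustration.
    """
    problems = []

    # Rule 1: every invoice must reference a customer.
    if not record.get("customer_id"):
        problems.append("customer_id is missing")

    # Rule 2: a very loose email check; real rules would be stricter.
    email = record.get("email")
    if not isinstance(email, str) or "@" not in email:
        problems.append("email is malformed")

    # Rule 3: the total must be a non-negative number (booleans excluded).
    total = record.get("invoice_total")
    if (
        not isinstance(total, (int, float))
        or isinstance(total, bool)
        or total < 0
    ):
        problems.append("invoice_total must be a non-negative number")

    return problems


good = {"customer_id": 42, "email": "pat@example.com", "invoice_total": 19.99}
bad = {"customer_id": None, "email": "not-an-email", "invoice_total": -5}

print(validate_invoice(good))
print(validate_invoice(bad))
```

Returning every problem found, instead of stopping at the first one, gives the owners of the source system a complete picture of what needs to be fixed before the data reaches downstream consumers.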
Data Standards and Guidelines
Standards and guidelines provide a method for organizations to ensure consistent outcomes. Developing standards for data streams can be more complex than other standards because the task requires a high-level view of the data. This high-level overview ensures all key points of the data lifecycle are accounted for and addressed. A data steward will play a crucial role in developing and maintaining any standards or guidelines an organization requires.
Data Lifecycle Oversight
As organizations grow, the need for new applications and processes will continue to evolve. Since these applications and processes are dependent on data from start to finish, a data steward should be involved from the start to monitor the lifecycle of the data. They should take steps to ensure the data satisfies any standards or guidelines. They should also track the data from its inception and any continuation of the data flow to downstream consumers.
Why Data Stewardship Is Here to Stay
Ensuring data success within your organization is the responsibility of all data roles, even if it’s not your primary job function. As data professionals, we should all strive to better understand data and its power to facilitate change, especially as high-quality data takes center stage as an asset potentially more valuable to organizations than physical assets.
As you start to understand and work towards better data stewardship, make sure to check out SolarWinds Database Mapper. Its capabilities can help ensure data stewardship is being addressed and handled appropriately. Free 14-day trials are available for both cloud and on-premises versions.
John is a Principal Consultant with Denny Cherry & Associates Consulting, holding Microsoft Data Platform MVP and VMware vExpert awards. He specializes in deploying SQL Server-related solutions to solve business needs for organizations.