Your organization can use data governance for AI/ML to lay the groundwork for innovative data-based tools.
Data governance ensures that data is available, consistent, usable, trusted, and secure. It's a concept that organizations struggle with, and the bar is raised when big data and systems like artificial intelligence and machine learning come into the picture. Organizations quickly realize that AI/ML systems work differently from traditional, fixed-record systems.
With AI/ML, the aim is not to return a value or status for a single transaction. Instead, an AI/ML system sifts through petabytes of data looking for answers to a query or an algorithm that may even seem a little open-ended. Data is processed in parallel, with multiple data threads fed into the processor at the same time. To speed up processing, IT may cull these massive volumes of simultaneously and asynchronously processed data in advance.
This data can come from many different internal and external sources. Each source has its own way of collecting, managing and storing data – and it may or may not meet your own organization’s governance standards. Then there are the recommendations from the AI itself. Do you trust them? These are just some of the questions businesses and their auditors face as they focus on AI/ML data governance and look for tools to help them.
Using data governance for AI/ML systems
Make sure your data is consistent and accurate
When integrating data from internal and external transaction systems, the data must be standardized so that it can be combined with data from other sources. Application programming interfaces (APIs), which are pre-built into many systems so that they can exchange data with other systems, facilitate this. If no APIs are available, you can use extract, transform, and load (ETL) tools, which convert data from one system into a format that another system can read.
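As a rough illustration of the transform step, here is a minimal Python sketch. The two source names ("crm" and "billing"), their field names, and their date formats are hypothetical, chosen only to show how records from different systems might be mapped onto one common layout.

```python
from datetime import datetime

# Hypothetical field mappings: source field name -> common field name.
FIELD_MAPS = {
    "crm": {"CustID": "customer_id", "SignupDate": "signup_date"},
    "billing": {"customer": "customer_id", "created": "signup_date"},
}

# Hypothetical date formats used by each source system.
DATE_FORMATS = {"crm": "%m/%d/%Y", "billing": "%Y-%m-%d"}


def transform(record, source):
    """Rename fields and normalize types so records from any source match."""
    mapping = FIELD_MAPS[source]
    out = {mapping[key]: value for key, value in record.items() if key in mapping}
    # Store IDs as trimmed strings and dates as ISO 8601 text.
    out["customer_id"] = str(out["customer_id"]).strip()
    parsed = datetime.strptime(out["signup_date"], DATE_FORMATS[source])
    out["signup_date"] = parsed.date().isoformat()
    out["source"] = source  # keep the origin so auditors can trace the record
    return out


crm_rows = [{"CustID": " 1001 ", "SignupDate": "03/15/2021"}]
billing_rows = [{"customer": 1002, "created": "2022-07-01"}]

combined = [transform(r, "crm") for r in crm_rows] + [
    transform(r, "billing") for r in billing_rows
]
```

Recording the source on every row is a design choice rather than a requirement. It makes it easier to check later whether a given source meets your governance standards.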
If you add unstructured data, such as photographic, video, and sound objects, there are object linking tools that can link and relate these objects to each other. A good example of an object linker is a geographic information system (GIS), which combines photos, schematics, and other types of data to provide a complete geographic context for a given environment.
Confirm that your data is usable
We often think of usable data as data that users can access, but it's more than that. If the data you keep has lost its value because it is out of date, it should be purged. IT and business end users need to agree on when data should be purged. This agreement takes the form of a data retention policy.
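A retention policy can be expressed as a simple rule that code applies consistently. The sketch below is only an illustration: the data categories and the retention periods are hypothetical and would come from whatever IT and business users agree on.

```python
from datetime import date, timedelta

# Hypothetical retention periods, in days, agreed between IT and business users.
RETENTION_DAYS = {"clickstream": 90, "transactions": 365 * 7}


def split_by_retention(records, today):
    """Return (keep, purge) lists based on each record's category and age."""
    keep, purge = [], []
    for rec in records:
        limit = timedelta(days=RETENTION_DAYS[rec["category"]])
        if today - rec["created"] > limit:
            purge.append(rec)  # older than the policy allows
        else:
            keep.append(rec)
    return keep, purge


records = [
    {"id": 1, "category": "clickstream", "created": date(2023, 1, 1)},
    {"id": 2, "category": "clickstream", "created": date(2023, 5, 20)},
    {"id": 3, "category": "transactions", "created": date(2020, 1, 1)},
]
keep, purge = split_by_retention(records, today=date(2023, 6, 1))
```

Returning the purge list instead of deleting immediately leaves room for a review or logging step before anything is removed.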
There are also other occasions when AI/ML data needs to be purged. This happens when the data model for an AI system is changed and the existing data no longer fits the model.
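One way to find data that no longer fits is to check each record against the features the revised model expects. The following Python sketch assumes a hypothetical three-field schema. A real model's inputs would differ, and records that fail the check might be reviewed before they are removed.

```python
# Hypothetical feature schema expected by the revised model.
MODEL_SCHEMA = {"age": int, "region": str, "monthly_spend": float}


def fits_model(record, schema=MODEL_SCHEMA):
    """True if the record has every required feature with the expected type."""
    return all(
        name in record and isinstance(record[name], expected)
        for name, expected in schema.items()
    )


def partition(records, schema=MODEL_SCHEMA):
    """Split records into those that fit the schema and those that do not."""
    fit = [r for r in records if fits_model(r, schema)]
    misfit = [r for r in records if not fits_model(r, schema)]
    return fit, misfit


records = [
    {"age": 34, "region": "EU", "monthly_spend": 120.5},
    {"age": 51, "region": "US"},  # missing monthly_spend
    {"age": "n/a", "region": "US", "monthly_spend": 80.0},  # wrong type for age
]
fit, misfit = partition(records)
```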
In an AI/ML governance audit, examiners expect written policies and procedures for both types of data purges. They will also verify that your data purge practices are in line with industry standards. There are many data purge tools and utilities on the market.
Make sure your data is trusted
Circumstances change: an AI/ML system that once worked well may begin to lose effectiveness. How do you know? By regularly comparing AI/ML results with past performance and with what is happening in the world around you. If the accuracy of your AI/ML system is drifting, you need to fix it.
The Amazon recruiting model is a good example. Amazon's AI system concluded that it was best to hire male applicants because it looked at past hiring practices, and most past hires were male. What the model failed to account for going forward was a larger number of highly qualified female applicants. The AI/ML system had drifted from the truth and had instead begun to introduce hiring bias. From a regulatory point of view, the AI was not compliant.
Amazon eventually withdrew the system, but companies can avoid these mistakes if they regularly monitor system performance, compare it to past performance, and compare it to what is happening in the outside world. If the AI/ML model is out of sync, it can be adjusted.
There are AI/ML tools that data scientists use to measure model drift, but the most direct way for business professionals to check for drift is to compare AI/ML system performance to historical performance. For example, if you suddenly notice that weather forecasts are 30% less accurate, it’s time to check the data and the algorithms that your AI/ML system is running.
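The comparison with historical performance can be sketched in a few lines of Python. This is a simplified illustration, not a substitute for the drift tools data scientists use. The sample forecasts, the observed outcomes, and the 10% alert threshold are all hypothetical.

```python
def accuracy(predictions, actuals):
    """Fraction of predictions that matched the observed outcome."""
    if not actuals:
        raise ValueError("need at least one observation")
    hits = sum(p == a for p, a in zip(predictions, actuals))
    return hits / len(actuals)


def relative_drop(baseline_acc, recent_acc):
    """Relative decline in accuracy versus the historical baseline."""
    if baseline_acc <= 0:
        raise ValueError("baseline accuracy must be positive")
    return (baseline_acc - recent_acc) / baseline_acc


# Hypothetical alert threshold: flag a relative decline of 10% or more.
DRIFT_THRESHOLD = 0.10

forecasts = ["rain", "sun", "sun", "rain", "sun", "rain", "sun", "sun", "rain", "sun"]
past_outcomes = ["rain", "sun", "sun", "rain", "sun", "rain", "sun", "sun", "rain", "rain"]
recent_outcomes = ["sun", "sun", "rain", "rain", "sun", "sun", "sun", "sun", "rain", "rain"]

baseline_acc = accuracy(forecasts, past_outcomes)  # historical period
recent_acc = accuracy(forecasts, recent_outcomes)  # most recent period
drop = relative_drop(baseline_acc, recent_acc)
needs_review = drop >= DRIFT_THRESHOLD
```

When `needs_review` is true, that is the signal to check both the incoming data and the algorithms the system is running.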