Getting better outcomes from AI? Check your data quality with business rules
What organization doesn’t have a plan lying around somewhere that contains the terms “big data” or “data-driven”? Whether you’re targeting new customer groups, doing predictive maintenance or process mining, in the end, it all comes down to the same thing: you want to use the data that is currently stored – as part of the process – in some system to make better decisions.
The predictive value of AI is nil if the data quality isn’t good
Many companies see the value of using data more effectively in their day-to-day operations. In reality, it usually ends up as nothing more than a few buzzwords and one-liners from CTOs and CIOs. And that’s because it’s quite a struggle to gather the various types of data required while ensuring quality and integrity. This part of the field is not sexy, but it is necessary. Because if you are going to use data to make better decisions, the quality of the decision will depend on the quality of the data.
Many organizations getting started with artificial intelligence (AI) find that the patterns they discover in their data have too little predictive value because the underlying data is of insufficient quality. Or the developed algorithms promise a lot of benefits during development, but in reality, the data turns out to vary in quality. After all: garbage in is garbage out. Conceptually, you can still develop the best AI algorithms, but the predictive value is nil if the data quality and data integrity aren’t good.
It isn’t easy to guarantee data quality and integrity
Data quality is often low when data is merged from heterogeneous sources. Data is rarely correct, complete and current. You usually only find this out after merging data from different sources, because only then can you see whether the data is consistent. Also, different systems often use different definitions for the same term. In one system, a customer is someone who has paid an invoice. In another, companies that have issued an order but haven’t yet received an invoice are counted as customers. And a third system even counts prospects as customers.
Define your data quality requirements in business rules
If you don’t just want to develop fun AI algorithms, but also want to get results of predictable and usable quality, a rules engine is invaluable. You then have a way to clean up the data first and make it consistent, instead of setting up massive data quality projects to clean up your data in existing legacy systems. In the business rules, you determine which requirements the data must meet to be used for analysis purposes. You can also integrate AI algorithms into this rules engine because AI algorithms are simply complex rules that are applied to a large set of data.
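To make this concrete, here is a minimal sketch of data quality requirements written as business rules, in plain Python rather than in any particular rules engine. The field names (customer_id, paid_invoices, last_updated) and the one-year freshness limit are assumptions made up for this illustration; your own rules will depend on your own definitions.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date, timedelta
from typing import Callable

# Hypothetical record shape: a plain dict with illustrative field names.
Record = dict


@dataclass(frozen=True)
class Rule:
    """A named data quality requirement that a record passes or fails."""
    name: str
    check: Callable[[Record], bool]


def make_rules(today: date) -> list[Rule]:
    """Build the example rules; the thresholds are assumptions, not standards."""
    return [
        # Completeness: every record must identify its customer.
        Rule("customer_id is present", lambda r: bool(r.get("customer_id"))),
        # One shared definition: a customer has paid at least one invoice.
        Rule("has at least one paid invoice",
             lambda r: r.get("paid_invoices", 0) >= 1),
        # Currency: the record was updated within the last 365 days.
        Rule("updated within the last year",
             lambda r: r.get("last_updated") is not None
             and today - r["last_updated"] <= timedelta(days=365)),
    ]


def violations(record: Record, rules: list[Rule]) -> list[str]:
    """Return the names of all rules the record fails."""
    return [rule.name for rule in rules if not rule.check(record)]


def usable_for_analysis(records: list[Record], rules: list[Rule]) -> list[Record]:
    """Keep only records that satisfy every rule."""
    return [r for r in records if not violations(r, rules)]
```

Calling usable_for_analysis on the merged data returns only the records that meet every requirement, while violations tells you why a given record was left out. The source systems themselves are not modified.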
Now, many organizations may say, “But we don’t have a rules engine!” That’s not a problem. Setting up a rules engine is certainly no more complicated than the work that traditionally takes place in the realization of information systems. And this approach to the field of data quality and data integrity brings excellent benefits.
How does it work in practice?
We often describe how it works using the example of a hypothetical production company that wants to get started with predictive maintenance. This company uses an Enterprise Asset Management (EAM) system that schedules and records maintenance on its machines. If this factory wants to start with predictive maintenance, it can use a rules engine that focuses on identifying earlier when a machine needs maintenance. This is achieved in four steps:
- The production company records all rules regarding preventive maintenance in the rules engine. For example, the rule that a machine is shut down at least once a year for major maintenance.
- Next, the production company determines what data it needs in order to identify whether a machine part needs maintenance soon. For example, data about the maintenance history, the number of running hours since its last service, faults that often occur with this type of machine, and data from sensors (Industrial IoT) that measure vibrations or increased noise from the machine.
- The company also defines business rules in the rules engine that express the level of quality the data must have to be used.
- And finally, the company adds the algorithm that data scientists have developed to its rules engine, because an AI algorithm is nothing more than a business rule in its rules engine.
This way, the factory integrates AI with its processes in a high-quality manner without building a complex integration between its back-end system and the AI – and all without having to check each source system for data quality beforehand!
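The four steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how any specific rules engine or EAM system works: the MachineData fields, the thresholds (4,000 running hours, 7.0 mm/s vibration) and the function names are all hypothetical, and the "model" is a simple threshold check standing in for whatever the data scientists have actually developed.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Optional


# Step 2: the data the company decided it needs (hypothetical fields).
@dataclass
class MachineData:
    machine_id: str
    days_since_major_maintenance: Optional[int]
    running_hours_since_service: Optional[float]
    vibration_mm_s: Optional[float]


# Step 3: business rules that express the required data quality.
def quality_issues(m: MachineData) -> list[str]:
    issues = []
    if m.days_since_major_maintenance is None or m.days_since_major_maintenance < 0:
        issues.append("maintenance history missing or invalid")
    if m.running_hours_since_service is None or m.running_hours_since_service < 0:
        issues.append("running hours missing or invalid")
    if m.vibration_mm_s is None or m.vibration_mm_s < 0:
        issues.append("vibration reading missing or invalid")
    return issues


# Step 4: stand-in for the data scientists' algorithm (invented thresholds).
def predicted_to_need_maintenance(m: MachineData) -> bool:
    return m.running_hours_since_service > 4000 or m.vibration_mm_s > 7.0


def evaluate(
    m: MachineData,
    model: Callable[[MachineData], bool] = predicted_to_need_maintenance,
) -> str:
    # Quality rules run first, so the model never sees unusable data.
    issues = quality_issues(m)
    if issues:
        return "data rejected: " + "; ".join(issues)
    # Step 1: preventive rule, major maintenance at least once a year.
    if m.days_since_major_maintenance >= 365:
        return "schedule major maintenance"
    # The model is applied like any other rule.
    if model(m):
        return "schedule early maintenance"
    return "no action"
```

Because the quality rules run before the preventive rule and the model, a record with a missing sensor reading is rejected with a reason instead of silently producing an unreliable prediction.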
Would you like to know more about how you can guarantee data quality with a business-rules approach when working with AI? Contact us today.