Most of my adult life was spent doing consulting work for clients around the world on topics like information management and analytics. For the last few years, I have also been involved in IoT, AI, etc. When I look back – one thing has been common to all “data projects”, and that is the importance of TRUST…or the lack thereof. Billions of dollars have been spent creating BI systems, data warehouses, data lakes, cubes, reports, workbooks, blah blah – and yet at the end of the day, spreadsheets still prevail at EVERY SINGLE COMPANY I know! I wrote about this in the past.
But why?
There are good reasons why users don’t trust the data they get from BI, like:
- In most cases, users cannot see how it ties back to source systems
- Transformations and enrichments to the data are not transparent to users
- Users do not know how well the system has been tested
- The tools may not be as intuitive as a spreadsheet…and many more
Every single one of these problems in “Classic BI” has a solution – one that either a product or a service can provide. An IT expert can probably show what transformations happened, or create some reconciliation reports, for example. Or a data lineage tool can trace back from report to source. So with some additional cost, we can minimize the trust issue – though that cost may eventually become prohibitive when building trust at scale.
The past is set in stone…most of the time, at least
The reason we could do all of this is that we are essentially looking at things that already happened – which are kind of “set in stone”. And the information we got from such systems had definite answers, like “sales in North America were $10M”, which is based on basic arithmetic. If I asked the same question again tomorrow, I would get the exact same answer of $10M. If I did not – I would know right away that something bad had happened. If a bad decision was made – for the most part, it is possible to trace back, validate the data, confirm it was indeed bad, and prevent it from happening in the future.
Now let’s look at the world of AI.
To avoid a religious war on terminology – please allow me to use AI in its most generic sense, as an all-encompassing term that includes what we call data science, machine learning, cognitive computing, etc. Definitions matter – but for this post, humor me and pretend all the right things are covered when I say AI.
Just like with classic BI, we use a lot of data and transformations. However, the fundamental idea now is that we are not just reporting on what happened – we are trying to make the best educated guess about what will happen in the future. We are no longer in the world of “only definite answers” – instead of the exact sales that happened this quarter in North America, we are often trying to find the odds that sales will be greater than $10M next quarter, for example.
Enter the trust issue
On one hand, it is quite useful to have a system that makes such predictions for us, so that we have a window into the future. On the other hand, if we do not trust the answer – it is a lot more difficult to explain how the system arrived at it. And if the system told me today that I had an 80% chance of hitting $10M for the quarter, it could very well tell me tomorrow that I only had a 50% chance of making that $10M number. I and everyone else on my team might think the system is foolish, because we can see the math to get to the $10M number we want. Let’s say the quarter finished and we did exceed the $10M number – this still does not mean the system was either right or wrong. That is the beauty (and pain) of how probability works!
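To see why a shifting probability is not necessarily a bug, here is a minimal sketch of a Bayesian update – all the numbers, including the likelihoods, are hypothetical and only for illustration. The same model can honestly report a different chance of hitting $10M once new evidence (say, a big deal slipping) arrives:

```python
def posterior_prob_hit(prior_hit, likelihood_if_hit, likelihood_if_miss):
    """Bayes' rule: P(hit | evidence), given a prior belief and the
    likelihood of the observed evidence under each outcome."""
    num = prior_hit * likelihood_if_hit
    den = num + (1 - prior_hit) * likelihood_if_miss
    return num / den

# Day 1: the model believes there is an 80% chance of hitting $10M.
p = 0.80

# Day 2: a big deal slips. Suppose (hypothetically) slips like this are
# seen in 30% of quarters that still hit the target, but in 80% of
# quarters that miss it. The honest answer changes:
p = posterior_prob_hit(p, likelihood_if_hit=0.30, likelihood_if_miss=0.80)
print(round(p, 2))  # → 0.6
```

Nothing went wrong between day 1 and day 2 – the model simply incorporated new information, which is exactly what we would want it to do.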
Can’t the creator explain the creation?
I am often asked “Can’t you just ask the programmer or data scientist who built this to explain how the system made its prediction?”. Yes I can – and sometimes that is all it takes to get the answer. But many a time, they may not be able to give that answer with the precision you expect. AI systems are learning systems (with or without human help) – they learn and get smarter mostly by going through a lot of data, as opposed to just crunching logic fed to them by a human. By the time I asked the $10M question the second time, the system might have learned something from a new pattern it detected.
AI can piss you off
A sales forecast, in the larger scheme of things, is probably not going to change the world for most of us. However, if we think of other scenarios – say salary planning or promotions, where an AI system scores everyone on a team against a complex set of parameters and makes recommendations – it is hard to accept a decision that cannot be explained in an easy-to-understand way. The system may be totally right – or it might have all kinds of bias built into the model, or into the data it was trained with. It might produce false positives and false negatives. There are techniques to minimize all these problems – but if the system cannot explain its results to us as users, how will we know for sure?
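As a toy illustration of what false positives and false negatives look like in such a scoring system – the data here is entirely made up – imagine comparing the model’s promote/don’t-promote recommendations against what an ideal decision would have been:

```python
# Hypothetical: 1 = "recommend promotion", 0 = "don't", for ten employees.
predicted = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
actual    = [1, 0, 0, 1, 0, 1, 1, 0, 0, 0]  # the "right" decision

pairs = list(zip(predicted, actual))
tp = sum(p == 1 and a == 1 for p, a in pairs)  # correctly recommended
fp = sum(p == 1 and a == 0 for p, a in pairs)  # recommended, but shouldn't be
fn = sum(p == 0 and a == 1 for p, a in pairs)  # deserving, but missed
tn = sum(p == 0 and a == 0 for p, a in pairs)  # correctly passed over

print(f"false positives: {fp}, false negatives: {fn}")  # → 2 and 1
```

Each false negative here is a person unfairly passed over – which is why an unexplainable score is so hard to accept, even when the overall error rate is low.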
Can you trust machines?
There is another version of the trust issue – when machines need to make choices that affect us. Let’s say you are a factory supervisor riding in a self-driving forklift that is picking up a heavy load from one top shelf and putting it on another, while your workers are walking below its arm. The machine probably has visual recognition capabilities, and can crunch lots of parameters from data and make good decisions. Let’s say one day the machine detects that the load is too much to bear, and it has two options – tip over on its side and injure you, who are sitting inside, or drop the load and injure your workers. What should that machine do? And if you don’t know what the machine will do – or at least know that you can override it – will you work with that machine?
AI – it’s just like us, except it isn’t
I also get asked “Well, AI is supposed to think like a human, so why can’t it explain its thought process like we can?”. This is an excellent question, and it raises two issues. First, we don’t all think alike – even in the fictitious forklift example, I am sure different people would choose differently. Second, we often make a decision first and find an explanation for it later, if someone asks. We can’t always explain our decisions very well to someone else either, except in simple cases. And finally, we make poor decisions too. So mimicking human thinking as-is is perhaps not the best way to think about AI either.
AI is everywhere, and mostly harmless
I am of course not generalizing that all AI scenarios run into a trust (or ethical, or moral) issue. Many don’t – for example, an AI algorithm might predict how much longer a device will work before its battery runs out. I doubt I will have a trust issue once I see it work approximately well the first few times. And there are several of those kinds of “little” AI solutions all around us – many might not ever be visible to us. We just take them for granted! Even in the sales forecasting or promotion examples – over a period of time, we may come to trust what the system tells us. But the trouble is – will we give it enough time to earn our trust?
So what can we do, really?
Just like other projects, AI projects need some basic education and expectation setting for stakeholders before we embark on them. Unlike basic math and if-then-else logic, statistics concepts need a bit more hand-holding. People tend to use terms like confidence, significance, and sampling loosely, and it is very easy to set wrong expectations with stakeholders even with the right intentions. And then there is the issue of trust, with its ethical and moral considerations. It is important to discuss these thoroughly upfront, and during the projects. When done right – and transparently – AI can and does add significant value. It’s on all of us in this industry to make sure we let AI earn trust the right way!
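One example of that loose usage: “95% confidence” is a statement about a procedure, not about any single number. A quick simulation sketch (the true mean, spread, and sample size below are all made up) shows what it actually promises – that over many repetitions, roughly 95% of such intervals cover the truth:

```python
import random

random.seed(42)

def ci_covers_true_mean(true_mean=100.0, sigma=15.0, n=30):
    """Draw one sample and check whether its 95% confidence interval
    (known-sigma normal approximation) contains the true mean."""
    sample = [random.gauss(true_mean, sigma) for _ in range(n)]
    mean = sum(sample) / n
    se = sigma / n ** 0.5
    return mean - 1.96 * se <= true_mean <= mean + 1.96 * se

# Repeat the procedure many times and measure how often it succeeds.
coverage = sum(ci_covers_true_mean() for _ in range(10_000)) / 10_000
print(coverage)  # close to 0.95; exact value depends on the seed
```

No individual interval is “95% likely to be right” – it either covers the truth or it doesn’t. Getting stakeholders comfortable with that distinction upfront avoids a lot of disappointment later.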
(Cross-posted @ Vijay's thoughts on all things big and small)