More organizations are supplementing their analytics capabilities with intelligent systems that are easier to use than ever. While the results may look impressive, the devil is in the details.
One key difference separates traditional analytics systems from some of the newer ones. If you understand that difference, you’ll be a step ahead of your peers.
Old: Input → Output
Traditional analytics systems tend to be rules-based, which means they have “if/then” scenarios built into them: if a user clicks the red button, one result occurs. If she clicks the blue button, another result occurs. The key thing to know here is that, assuming the programming is done right, a given input results in a predictable output. That’s great, but it doesn’t work so well with the complex Big Data we have today, which is why machine learning is gaining momentum.
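To make the input-to-output idea concrete, here is a minimal Python sketch of a rules-based handler. The function name and the result strings are hypothetical, made up purely for illustration; the point is only that the same input always yields the same output.

```python
def handle_click(button_color: str) -> str:
    """Rules-based logic: each input maps to one fixed, predictable output."""
    if button_color == "red":
        return "show_sales_report"      # hypothetical result for the red button
    if button_color == "blue":
        return "show_inventory_report"  # hypothetical result for the blue button
    return "no_action"                  # anything the rules don't cover


# The same input produces the same output every time it is called.
print(handle_click("red"))
print(handle_click("blue"))
```

Nothing here adapts to data. If the rules don't anticipate a situation, a programmer has to add a new rule by hand.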
Modern systems use machine learning to provide more intelligent solutions. The solutions are more “intelligent” because the machine learns from what humans feed it and, depending on the algorithms used, may be capable of learning on its own. Training by humans and self-learning allow such systems to “see” things in the data that weren’t apparent before, such as patterns and relationships. The other major value, of course, is the ability to comb through massive amounts of structured and unstructured data faster than a human could, understand the data, make predictions from it, and perhaps make recommendations. It is the latter characteristics, prediction and prescription, that are most obvious to analytics users.
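By contrast, a system that learns derives its decision rule from the examples it is fed. The toy sketch below is a deliberately simplified stand-in, not how any particular product works: it "learns" a cutoff as the midpoint between the average positive and average negative example. The function names and the data are hypothetical.

```python
from statistics import mean


def learn_threshold(examples):
    """Learn a cutoff from (value, label) pairs: the midpoint of the class means."""
    positives = [value for value, label in examples if label]
    negatives = [value for value, label in examples if not label]
    return (mean(positives) + mean(negatives)) / 2


def predict(value, threshold):
    """Predict True when the value exceeds the learned cutoff."""
    return value > threshold


# Two hypothetical training sets that differ only in their positive examples.
training_a = [(10, False), (20, False), (80, True), (90, True)]
training_b = [(10, False), (20, False), (40, True), (50, True)]

threshold_a = learn_threshold(training_a)
threshold_b = learn_threshold(training_b)

# The same input gets a different prediction depending on the training data.
print(predict(45, threshold_a))
print(predict(45, threshold_b))
```

The input of 45 is classified differently by the two models even though the code is identical. This is one way to see that what the machine is fed shapes what it predicts.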
What’s not well understood is what can potentially go wrong. An analytics system designed for general purpose use is likely not what someone on Wall Street would use. That person would want a solution that’s tailored to the needs of the financial services industry. Making the wrong movie prediction is one thing; making the wrong trade is another.
As users, it’s easy to assume that the analytics we get or come up with are accurate, but there is so much that can affect accuracy: data quality, algorithms, models, interpretation. And, as I mentioned in my last post, there is bias, which can impact all of those things and more.
Why you should care
There is a shortage of really good data science and analytics talent. One answer to the problem is to build solutions that abstract the complexity of all the nasty stuff — data collection, data preparation, choice of algorithms and models, etc. — so business users don’t have to worry about it. On one hand, the abstraction is good because it enables solutions that are easy to use and don’t require much, if any, training.
But what if the underlying math or assumptions aren’t exactly right? How would you know what effect that might have? Understanding how and why those systems work the way they do requires someone who understands all the hairy technical stuff, much as diagnosing a car requires a mechanic. And, as with a car, you shouldn’t pop the hood and start tinkering with things unless you know what you’re doing.
Some solutions don’t have a pop-the-hood option. They’re black boxes, which means no one can see what’s going on inside. The opaqueness doesn’t make business users nervous, but it’s troublesome to experts who didn’t build the system in the first place.
Bottom line, you’re probably going to get spurious results once in a while, and when you do, ask why. If it’s not obvious to you, ask for help.