Strategic Insights and Clickworthy Content Development


16 Machine Learning Terms You Should Know

Advanced analytics is heating up. AI, machine learning, deep learning, and neural networks are just some of the terms we hear and should know more about. While most of us will never become statisticians or unicorn data scientists, it’s wise to understand the basic terms, especially since we’ll be hearing a lot more about machine learning in the coming years. Here are a few terms we should all know, drawn from sources that have much more to offer:

Algorithm – a step-by-step procedure for solving a problem.

Attribute – a characteristic or property of an object.

Classification – the task of arranging objects into predefined groups or categories.

Clusters – groups of objects that share a characteristic that is distinct from other groups.

Correlation – the extent to which two numerical variables have a linear relationship.

Deep Learning – an AI technique that uses multi-layered neural networks, loosely modeled on the workings of the human brain.

Decision Tree – a decision support tool that uses a tree-like graph or model of decisions and their possible consequences, including chance event outcomes, resource costs, and utility.

Natural Language Processing (NLP) – the automatic (or semi-automatic) processing of human language.

Neural Networks – a series of algorithms that attempts to identify underlying relationships in a set of data by using a process that mimics the way the human brain operates.

Normal Distribution – a symmetrical distribution with a bell-shaped density curve and a single peak.

Outlier – an observation that lies an abnormal distance from other values in a random sample from a population.

Regression – a statistical process for estimating the relationships among variables.

Statistical Model – a formalization of relationships between variables in the form of mathematical equations.

Supervised Learning – accomplished with training data that includes both the input and the desired results.

Unsupervised Learning – accomplished with training data that does not include the desired results.
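Since supervised and unsupervised learning come up again below, here’s a minimal sketch of the difference, assuming Python with scikit-learn and its bundled iris dataset (my example, not from the original glossary sources):

```python
# Supervised vs. unsupervised learning in a few lines (illustrative sketch).
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier
from sklearn.cluster import KMeans

X, y = load_iris(return_X_y=True)

# Supervised: training data includes both the inputs X and the desired results y.
clf = DecisionTreeClassifier().fit(X, y)
print(clf.predict(X[:3]))      # predicted classes for the first three samples

# Unsupervised: only the inputs X are provided; the algorithm looks for clusters.
km = KMeans(n_clusters=3, n_init=10).fit(X)
print(km.labels_[:3])          # cluster assignments, learned without labels
```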

3 Cool AI Projects

AI is all around us, quietly working in the background or interacting with us via a number of different devices. Various industries are using AI for specific reasons such as ensuring that flights arrive on time or irrigating fields better and more economically.

Over time, our interactions with AI are becoming more sophisticated. In fact, in the not-too-distant future we’ll have personal digital assistants that know more about us than we know about ourselves.

For now, there are countless AI projects popping up in commercial, industrial and academic settings. Following are a few examples of projects with an extra cool factor.

Get Credit. Now.

Who among us hasn’t sat in a car dealership, waiting for the finance person to run a credit check and provide us with financing options? We’ve also stood in lines at stores, filling out credit applications, much to the dismay of those standing behind us in line. Experian DataLabs is working to change all that.

Experian created Experian DataLabs to experiment with the help of clients and partners. Located in San Diego, London, and São Paulo, Experian DataLabs employs scientists and software engineers, 70% of whom are Ph.D.s. Most of these professionals have backgrounds in machine learning.

“We’re going into the mobile market where we’re pulling together data, mobile, and some analytics work,” said Eric Haller, EVP of Experian’s Global DataLabs. “It’s cutting-edge machine learning which will allow for instant credit on your phone instead of applying for credit at the cash register.”

That goes for getting credit at car dealerships, too. Simply text a code to the car manufacturer and get the credit you need using your smartphone. Experian DataLabs is also combining the idea with Google Home, so you can shop for a car, and when you find one you like, you can ask Google Home for instant credit.

There’s no commercial product available yet, but a pilot will begin this summer.

AI About AI

Vicarious is attempting to achieve human-level intelligence in vision, language, and motor control. It is taking advantage of neuroscience to reduce the amount of input machine learning requires to achieve a desired result. At the moment, Vicarious is focusing on mainstream deep learning and computer vision.

Its concept is compelling to many investors. So far, the company has raised $70 million from corporations, venture capitalists, and affluent private investors, including Ashton Kutcher, Jeff Bezos, and Elon Musk.

On its website, Vicarious wisely points out the downside of optimizing a model ad infinitum for only incremental improvements. So, instead of trying to beat a state-of-the-art algorithm, Vicarious is trying to identify and characterize the sources of errors.

Draft Better Basketball Players

The Toronto Raptors are working with IBM Watson to identify which skills the team needs and which prospective players can best fill the gap. The team is also pre-screening each potential recruit’s personality traits and character.

During the recruiting process, Watson helps select the best players and it also suggests ideal trade scenarios. While prospecting, scouts enter data into a platform to record their observations. The information is later used by Watson to evaluate players.

And, a Lesson in All of This

Vicarious is using unsupervised machine learning. The Toronto Raptors are using supervised learning, but perhaps not exclusively. If you don’t yet know the difference between the two, it’s important to learn. Unsupervised learning looks for patterns on its own. Supervised learning is given predefined classifications, such as the characteristics of “good” prospects versus the characteristics of “bad” ones.

Supervised and unsupervised learning are not mutually exclusive, since unsupervised learning needs to start somewhere. However, supervised learning is more comfortable for humans with egos and biases because we are used to giving machines a set of rules (programming). It takes a strong ego, curiosity, or both to accept that some of the most intriguing findings can come from unsupervised learning, because it is not constrained by human biases. For example, we may define the world in terms of red, yellow, and blue. Unsupervised learning could point out crimson, vermillion, banana, canary, cobalt, lapis, and more.
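To make that concrete, here’s a hedged sketch of the idea, clustering a handful of hand-picked RGB values with scikit-learn; the data and cluster counts are invented for illustration:

```python
# Clustering raw RGB values can surface finer-grained color groups on its own.
import numpy as np
from sklearn.cluster import KMeans

rgb = np.array([
    [220, 20, 60],    # crimson
    [227, 66, 52],    # vermillion
    [255, 225, 53],   # banana
    [255, 239, 0],    # canary
    [0, 71, 171],     # cobalt
    [38, 97, 156],    # lapis
])

# With k=3 we recover the coarse "red, yellow, blue" grouping;
# with k=6 the finer shades emerge as their own clusters.
for k in (3, 6):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(rgb)
    print(k, labels)
```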

Sports Teams Embrace Analytics and the IoT

In my last sports-related post, I explained how the National Hockey League (NHL) is using IoT devices to provide the league with deeper insights about the players and the game while immersing fans in new stats and spectator experiences. The NHL is not alone. In fact, IoT devices are finding their way into all kinds of sports for the benefit of leagues, players, and fans.

For example, the National Football League has been placing sensors in players’ shoulder pads to track their location, speed, and distance traveled. Last year, it experimented with sensors in footballs to track their motion, including acceleration, distance, and velocity. That data is being sold to broadcast partners.

Meanwhile, young football players who hope to play the game professionally are tracking themselves in hopes of becoming more attractive recruiting targets.

NBA Teams Score with Insights

The Golden State Warriors and Miami Heat are getting some interesting data from wearables and other sensors that track the movement of players and the basketball used in a game. Now it’s possible to analyze how players shoot, how high they jump, and the speed at which the ball travels, among other things. One thing that trips me up is how some of that data is visualized by the coach.

Picture this: a player clips a device to his shorts or wears it on his wrist so his coach can understand the trajectory of the ball and get statistics about the player’s movements on a cell phone. The new insights help coaches and their teams understand the dynamics of the sport better, but I wonder how practical basketball-by-smartphone-app is, given the speed at which the game is played.

Sensors placed on the players and in the basketball also provide information about players’ movements on the court over time. The visualization looks like a plate of spaghetti, but within it are patterns that reveal players’ habits, such as the area of the court a player tends to favor.

Beyond Moneyball

Former Oakland A’s general manager Billy Beane is considered the father of sports analytics because he was among the first to change the makeup of a team, and how the team played the sport, based on what the numbers said. This approach is commonly known as “Moneyball” (thanks to the book and movie).

One interesting insight involved time on base: a player who tends to walk rather than strike out spends more time on base, and time on base was an undervalued contributor to scoring runs.

However, early experimentation also demonstrated that numbers aren’t everything: a stellar turnaround year was not necessarily followed by a similarly strong year.

These days, sensors are enabling Major League Baseball (MLB) statistics 2.0. For example, sensors in baseball bats provide insights about the speed and motion of a swing and the point of impact when a ball hits the bat. In the dugouts, coaches and players can get access to all kinds of insights via an iPad during the game. The insights enable them to fine-tune the way they play against the opposing team. It’s also possible to track the movements of a specific player.

NHL and Fans Score with Predictive Analytics

If you’re a hockey fan, you’ve probably noticed that the statistics are more comprehensive than they once were. That’s not happening by accident.

The National Hockey League (NHL) uses predictive analytics to learn more about fans, improve its direct marketing efforts, track players’ performance on the ice, and improve fan engagement.

Making an IoT Play

During the 2015 All-Star Game, sensors were embedded inside pucks and in players’ jersey collars, providing insight into where the puck and players were, how fast they were moving, the puck’s trajectory, players’ time on ice, and more.

The information was used during replays to better explain how a particular outcome came about. Fans were able to visualize the paths players and pucks had taken, giving them more insight into players’ performance. Experimentation continued at the 2016 World Cup of Hockey, which involved substantially the same thing: tracking pucks and players.

A key to winning a hockey game is puck possession: if Team A possesses the puck longer than Team B, Team A will tend to score more goals over time.
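That claim is easy to sanity-check with a toy simulation. Here’s a minimal sketch assuming goals arrive in rough proportion to possession time; the rates are invented for illustration:

```python
# Toy model: goals arrive (Poisson) in proportion to each team's possession share.
import numpy as np

rng = np.random.default_rng(3)
games = 10_000
possession_a = 0.55          # Team A holds the puck 55% of the time
total_goals = 5.5            # combined goals per game, roughly league-typical

goals_a = rng.poisson(total_goals * possession_a, size=games)
goals_b = rng.poisson(total_goals * (1 - possession_a), size=games)

# Ties ignored; the possession edge shows up as a win-rate edge over many games.
print("Team A win share:", (goals_a > goals_b).mean())
```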

The information derived from the devices, particularly the jerseys, can be used for training purposes and to minimize injuries.

A Data Scientist Predicted Winners and Losers

A couple of years ago, the NHL worked with a data scientist who reviewed historical data including player statistics and team statistics over several seasons. When he crunched the data, he found that there are certain statistics and factors that, over time, can help predict team performance on the ice, especially in the playoffs.


Thirty-seven different factors were weighted in certain ways and applied to the 16 teams that started the playoffs in April 2015. The goal was to predict how the playoff teams would do when playing against each other and, as the rounds progressed, how the teams would perform in new matchups.

The results were very interesting. The data scientist was able to predict at the start of the season that the Chicago Blackhawks would win the Stanley Cup. He was also able to predict, most of the time, which team would win each playoff game.

“What’s interesting about that is our sport is a pretty unpredictable sport,” said Chris Foster, director of Digital Business Development at the National Hockey League, in an interview. “The action is so fast, goals happen rather infrequently, and a lot of it has to do with a puck bouncing here or a save there. It’s very fast action that is sometimes hard to predict, but it just shows that when data is properly analyzed and really smart models are put around it, predictive analytics can tell you a lot about how a team is going to perform.”
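For readers curious what that kind of model can look like, here’s a minimal sketch, not the NHL’s actual model: it learns weights for team factors from historical matchups and then scores a new pairing. The 37-factor count echoes the article, but the data and names are invented; it assumes Python with scikit-learn.

```python
# Illustrative only: learn weights for team factors from past matchups,
# then score a hypothetical playoff pairing. Data is randomly generated.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_games, n_factors = 500, 37                 # 37 factors, as in the article

X = rng.normal(size=(n_games, n_factors))    # home-minus-away factor differences
true_w = rng.normal(size=n_factors)          # stand-in for the real weighting
y = (X @ true_w + rng.normal(size=n_games)) > 0   # True = home team won

model = LogisticRegression(max_iter=1000).fit(X, y)

# Probability that the "home" team wins a new, hypothetical matchup.
matchup = rng.normal(size=(1, n_factors))
print(model.predict_proba(matchup)[0, 1])
```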

How Cybersecurity Analytics Are Evolving

As the war between the black hats and white hats continues to escalate, cybersecurity necessarily evolves. In the past, black hats were rogue individuals. Now they’re hacktivists, crime groups, and hackers backed by nation-states.

“Hackers have gotten a lot more sophisticated,” said Sanjay Goel, a professor in the School of Business at the University at Albany. “It used to be they’d break into networks, do some damage, and get out. Now they have persistent attacks and targeted execution.”

Hackers are automating attacks to constantly search for vulnerabilities in networks. Meanwhile, fraudulent communications are getting so sophisticated, they’re fooling even security-aware individuals. Analytics can help, but nothing is a silver bullet.

Moats Are Outdated

Organizations used to set up perimeter security to keep hackers from breaching their networks. Since that didn’t work, firewalls were supplemented with other mechanisms, such as intrusion detection systems that alert security professionals to a breach and honeypots that lure hackers into a place where they can be monitored and prevented from causing damage.

Those tools are still useful, but they have necessarily been supplemented with other methods and tools to counter new and more frequent attacks. Collectively, these systems monitor networks, traffic, user behavior, access rights, and data assets, albeit at a grander scale than before, which has necessitated considerable automation. When a matter needs to be escalated to a human, analytical results are sent in the form of alerts, dashboards, and visualization capabilities.

“We really need to get away from depending on a security analyst that’s supposed to be watching a dashboard and get more into having fully-automated systems that take you right to remediation. You want to put your human resources at the end of the trail,” said Dave Trader, chief security officer at IT services company GalaxE.Solutions.

Predictive analytics analyzes behavior that indicates threats, vulnerabilities, and fraud. Slowly, but surely, cybersecurity budgets, analytics, and mindsets are shifting from prevention to detection and remediation because enterprises need to assume that their networks have been breached.

“All the hackers I know are counting on you not taking that remedial step, so when there’s a vulnerability and it’s a zero-day attack, the aggregators or correlators will catch it and then it will go into a ticket system, so it’s three to four days before the issue is addressed,” said Trader. “In those three to four days, the hackers have everything they need.”

Why Break In When You Can Walk In?

Fraudsters are bypassing traditional hacking by convincing someone to turn over their user ID and password or other sensitive information. Phishing has become commonplace because it’s effective. The emails are better crafted now, so they’re more believable and therefore more dangerous. Even more insidious is spear phishing, which targets a particular person and appears to come from a person or organization the target knows.

Social engineering also targets a specific person, often on a social network or in a real-world environment. Its purpose is to gain the target’s trust and walk away with the virtual keys to a company’s network or specific data assets. Some wrongdoers are littering parking lots with thumb drives that contain malware.

Behavioral analytics can help identify and mitigate the damage caused by phishing and social engineering by comparing the authorized user’s typical behavior in the network with the behavior of whoever is currently using the account.
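As a hedged illustration of that comparison, here’s a minimal sketch that models an authorized user’s normal login behavior and flags sessions that deviate from it; the features and numbers are invented, and it assumes Python with scikit-learn:

```python
# Model "normal" sessions for an authorized account, then flag deviations.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(1)

# Normal behavior: logins clustered around midday, modest data transfer (MB).
normal_sessions = np.column_stack([
    rng.normal(13, 2, size=500),     # login hour of day
    rng.normal(50, 15, size=500),    # MB downloaded per session
])

detector = IsolationForest(random_state=0).fit(normal_sessions)

# A typical session vs. a suspicious one (3 a.m. login, bulk download).
sessions = np.array([[14.0, 55.0], [3.0, 900.0]])
print(detector.predict(sessions))    # +1 = consistent with the user, -1 = anomaly
```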

Bottom Line

Breaches are bound to happen. The question is whether companies are prepared for them, which means keeping security systems up to date and training employees.

Far too many companies think that hacking is something that happens to other organizations so they don’t allocate the budget and resources they need to effectively manage risks. Hackers love that.

What A Chief Analytics Officer Really Does

As analytics continues to spread out across an organization, someone needs to orchestrate it all. The “best” person for the job is likely a chief analytics officer (CAO) who understands the business, understands analytics, and can help align the two.

The CAO role is a relatively new C-suite position, as is the chief data officer or CDO. Most organizations don’t have both and when they don’t, the titles tend to be used interchangeably. The general distinction is that the CAO focuses more on analytics and its business impact while the CDO is in charge of data management and data governance.

“The new roles are really designed to expand the use of data and expand the questions that data is used to answer,” said Jennifer Belissent, principal analyst at Forrester. “It’s changing the nature of data and analytics use in the organization, leveraging the new tools and techniques available, and creating a culture around the use of data in an organization.”

Someone in your organization may already have some or all of a CAO’s responsibilities and may be succeeding in the position without the title, which is fine. However, in some organizations a C-suite title and capability can help underscore the importance of the role and the organization’s shift toward more strategic data usage.

“The CAO needs to be able to evangelize the use of data, demonstrate the value of data, and deliver outcomes,” said Belissent. “It’s a role around cultural change, change management, and evangelism.”

If you’re planning to appoint a CAO, make sure that your organization is really ready for one, because the role can fail if it is prevented from making the kinds of change the organization needs. A successful CAO needs the support of senior management, as well as the authority, responsibility, budget, and people skills necessary to effect change.

One mistake organizations make when hiring a CAO is placing too much emphasis on technology and not enough emphasis on business acumen and people skills.

The making of a CAO

When professional services company EY revisited its global strategy a few years ago, it was clear to its leadership that data and analytics were of growing importance to both its core business and the new services it would provide to clients.

Rather than hiring someone from the outside, EY chose its chief strategy officer, Chris Mazzei, for the role. His charter as CAO was to develop an analytics capability across EY’s four business units and the four global regions in which it operates.

[Want to learn more about CAOs and CDOs? Read 12 Ways to Connect Data Analytics to Business Outcomes.]

Part of his responsibility was shaping the strategy and making sure each of the businesses had a plan they were executing against. He also helped expand the breadth and depth of EY’s analytical capabilities, which included acquiring 30 companies in four years.

The acquisitions coupled with EY’s matrixed organizational structure meant lots of analytics tools, lots of redundancies, and a patchwork of other technology capabilities that were eventually rationalized and made available as a service. Meanwhile, the Global Analytics Center of Excellence Mazzei leads was also building reusable software assets that could be used for analytics across the business and for client engagements.

Mazzei and his team also have been responsible for defining an analytics competency profile for practitioners and providing structured training that maps to it. Not surprisingly, his team also works in a consultative capacity with account teams to help enable clients’ analytical capabilities.

“The question is, ‘What is the strategy and how does analytics fit into it?’ It sounds obvious, but few organizations have a clear strategy where analytics is really connected into it across the enterprise and at a business level,” said Mazzei. “You really need a deep understanding of how the business creates value, how the market is evolving, what the sources of competitive differentiation are and how those could evolve. Where you point analytics is fundamentally predicated on having those views.”

Mazzei had the advantage of working for EY for more than a decade and leading the strategy function before becoming the CAO. Unlike a newly hired CAO, he already had relationships with the people at EY with whom he’d be interfacing.

“Succeeding in this role takes building really trusted relationships in a lot of different parts of the organization, and often at very senior levels,” said Mazzei. “One reason we’ve seen CAOs fail is either because they didn’t have the skills to build those relationships or didn’t invest enough time on it during their tenure.”

Self-Service Analytics Are A Necessity

Lines of business are buying their own analytics solutions because IT is unable to deliver what they need fast enough. If the company has a data team, lines of business can ask for help, but like IT, the data team is faced with addressing more problems than there are people to solve them.

Smart IT organizations are building a foundation with governance built in. In that way, business users can get access to the data and analytics they need while the company’s assets are protected.

“IT has become more of a facilitator,” said Bob Laurent, VP of product marketing for self-service analytics platform provider Alteryx.  “If they’re able to give people access to data with the proper guardrails, then they’re out of the business of having to do mundane reports week in and week out.”

The shift to self-service analytics is happening across industries because organizations are under pressure to do more with their data and do it faster.

Meanwhile, average consumers have come to expect basic self-service analytics from their banks, insurance companies, brokerage firms, credit card companies, apps, and IoT devices. For an increasing number of businesses, self-service analytics is a necessity.

Higher Education Improves Performance

Colleges and universities are using self-service analytics to improve admission rates, enrollment rates, and more.

As an example, the Association of Schools and Programs of Public Health (ASPPH) built a system that allows its members to upload admissions data, graduate data, salary data, and financial data, as well as information about their grants and research contracts. ASPPH verifies and validates the information and then makes the data available via dashboards that can be used for analysis.

“We needed to give them a place to enter their data so they weren’t burdened with the reporting they have to do every year,” said Emily Burke, manager of data analytics at ASPPH.

More than 100 schools and programs of public health are using the system to analyze their data, monitor trends, and compare themselves to peers. They’re also using the system for strategic planning purposes.

“A university will log in and see [their] university’s information and create a peer group that’s just above them in rankings. That way, they can see what marks they need to hit,” said Burke.  “A lot of them are doing that geographically, such as what the application numbers look like in Georgia.”

Drive Value from Self-Service Analytics

The value of self-service analytics is measured by two things: the number of active users and the business value it provides an organization. Knowing that, a number of vendors now offer SaaS products that are easy to use and don’t require a lot of training.

ASPPH built its own system in 2012. At the time, Burke and her team were primarily focused on the system’s functionality, but it soon became obvious that usability mattered greatly.

“We built this wonderful tool, we purchased the software we needed, we purchased a Tableau server, and then realized that our members really didn’t know how to use it,” said Burke.

Deriving the most value from the system has been a journey for ASPPH, which Burke will explain in detail at Interop during her session, Data-Driven Decision Making: Empowering Users and Building a Culture of Data, in Las Vegas on Thursday, May 18.

If you’re implementing self-service analytics or thinking about it, you’ll be able to see a demonstration of the ASPPH system, hear Burke’s first-hand experiences, and walk away with practical ideas for empowering your users.

 

Why Embedded Analytics Will Change Everything

Analytics are being embedded in all kinds of software. As a result, the ecosystem is changing, and with it so is our relationship to analytics. Historically, analytics and BI have been treated as something separate — we “do” analytics, we’re “doing” ad hoc reporting — but increasingly, analytics are becoming an integral part of software experiences, from online shopping to smartwatches to enterprise applications.

“We’re creating whole industries that are centered around data and analytics that are going to challenge the status quo of every industry,” said Goutham Belliappa, Big Data and Analytics practice leader for Capgemini North America. “Analytics will become so ubiquitous, we won’t even notice it. From a business perspective, it’s going to transform entire industries.”

Three drivers are collectively changing how we experience and think about analytics. The first, as previously mentioned, is embedding analytics into all kinds of software. The second is automation, and the third is a shift in the way software is built.

Automation is Fuel

Modern software generates and analyzes more data than ever, and the trend is going to accelerate. The resulting glut of data is outpacing humans’ ability to manage and analyze it, so some analytics necessarily have to be automated, as do some decisions. As a result, analytics has become invisible in some contexts, and it’s going to become invisible in still more contexts soon.

“‘Frictionless’ is a good way to describe what people are striving for in effective user experiences. Certainly, with more automation and more behind-the-scenes analytics, how we think of analytics will change,” said Gene Leganza, VP & research director at Forrester Research. “We’ll be thinking about the results — do we like the recommendations of this site’s or this app’s recommendation engine, or is that one better? We’ll gravitate towards the services that just work better for us without knowing how they do it.”

That’s not to say that automated analytics should be implemented as black boxes. While humans will apply less conscious thought to analytics because they are embedded, they will still want to understand how decisions were made, especially as those decisions increase in importance, Leganza said. Successful software will not just automate data management and analytics and choose the right combination of microservices to achieve a particular result; it will also be able to explain its path on demand.

Microservices Will Have an Impact

Software development practices are evolving, and so is the software that’s being built. In the last decade, monolithic enterprise applications have been broken down into smaller pieces that are now offered as SaaS solutions. Functionality is continuing to become more modular with microservices, which are specific pieces of functionality designed to achieve a particular goal. Because microservices are essentially building blocks, they can be combined in different ways, which affects analytics and vice versa.

Tableau has embraced microservices so its customers can combine B2B tools in a seamless way.  For example, Tableau is now embedded in Salesforce.com, so a sales rep can get insights about a customer as well as the customer details that were already stored in Salesforce.com.

“The more embedded you get, APIs and developer extensions become more relevant because you need more programmability to make [analytics] more invisible, to be seamless, to be part of a core application even though it comes from somewhere else,” said Francois Ajenstat, chief product officer at Tableau.

Software continues to become more modular because modularity provides flexibility. As the pace of business accelerates, software has to be able to adapt to changing circumstances quickly and without unnecessary overhead.

“In order to automate more and more actions and to enable adapting to a myriad of conditions, we’ll be having software dynamically cobble together microservices as needed.  The granularity of the services will have to be synced to the patterns in the data.  For the near future, the task will be to make the software flexible enough to adapt to the major patterns we’re seeing,” said Forrester’s Leganza.

What Healthcare Analytics Can Teach The Rest of Us

Healthcare analytics is evolving rapidly. In addition to the data in traditional business intelligence solutions, there is data flowing from hospital equipment, medical-grade wearables, and Fitbits.

The business-related and patient-related data, sometimes combined with outside data, enable hospitals to triage emergency care patients and treat patients more effectively, which matters clinically and financially. For example, in the U.S., Medicare and Medicaid are embracing “value-based care,” which means hospitals are now compensated for positive outcomes rather than for the number of services they provide, and they’re docked for “excessive” readmissions. Similarly, doctors are increasingly compensated for positive outcomes rather than for the number of patients they see. In other words, analytics is more necessary than ever.

There are a couple of things the rest of us can learn from what’s happening in the healthcare space, and there are some surprises that may interest you. The main message is to learn how to adapt to change, because change is inevitable. So is the rise of machine intelligence.

The Effect of the IoT

Medical devices are becoming connected devices in operating rooms and hospital rooms. Meanwhile, pharmaceutical companies are beginning to connect products such as inhalers to get better insight into a drug’s actual use and effects, and they’re experimenting with medical-grade (and typically application-specific) devices in clinical trials to reduce costs and hassles while getting better insight into a patient’s physical status. Surgeons are combining analytics, sometimes with telemedicine, to improve the results of surgical procedures. Slowly but surely, analytics are seeping into just about every aspect of healthcare to lower costs, increase efficiencies, and reduce patient risks in a more systematic way.

One might think devices such as Fitbits are an important part of the ecosystem, and from a consumer perspective they are. Doctors are less interested in that data, however, because it’s unreliable. After all, it’s one thing for a smartwatch to err in monitoring a person’s heart rate; for a medical-grade device, faulty monitoring could lead to a heart attack and litigation. At this point, doctors are more interested in the fact that someone wears a Fitbit because it indicates health consciousness.

Not surprisingly, predictive analytics is important because mitigating or avoiding healthcare-related episodes is preferable to dealing with an episode after the fact. From an IoT perspective, there is a parallel here with equipment and capital asset management. One way to reduce the risk of equipment failure is to compare the performance of a piece of equipment operating in real-world conditions against a virtual representation of it operating normally under the same conditions. Similarly, patient “signatures” will make it easier to spot complications earlier, such as weight gain, which can indicate congestive heart failure, or fluid retention, which may indicate the likelihood of a heart attack. Imagine if the same predictive concept were applied in your industry or business.
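Here’s a minimal sketch of that real-versus-virtual comparison using plain NumPy; the baseline model, threshold, and numbers are all invented for illustration:

```python
# Compare live sensor readings against a simulated "normal" baseline and
# raise an alarm when the residual exceeds a threshold.
import numpy as np

rng = np.random.default_rng(2)
t = np.arange(200)

baseline = 70 + 5 * np.sin(t / 10)                  # virtual representation
observed = baseline + rng.normal(0, 0.5, size=t.size)
observed[150:] += 4                                 # a developing fault

residual = observed - baseline
threshold = 3 * residual[:100].std()                # calibrated on healthy data
alarms = np.flatnonzero(np.abs(residual) > threshold)
print("first alarm at t =", alarms[0] if alarms.size else None)
```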

Machine Intelligence Reveals Insights

“Insights” is an oft-misused term. It is related to analytics but not synonymous with it. An insight is new knowledge, and that is precisely why machine learning is gaining traction in the healthcare space. Traditional medical research has been hypothesis-driven. Machine learning doesn’t necessarily start with a theory, or carry the same veil of human biases.

Take diabetes research, for example: a machine learning-based research project found that a person who has been hospitalized is at greater risk of subsequent hospitalization, which comes as no surprise to doctors. However, several more interesting factors were unearthed, the biggest surprise of which was flu shots. It turns out that diabetics who did not get flu shots were more likely to be hospitalized, and following hospitalization, their health became unstable or unmanageable.

The lesson here is one of adaptation: as machine learning becomes mainstream, more of us will have to get comfortable with insights we hadn’t anticipated. In a business context, machine learning may reveal opportunities, risks, or competitive forces you hadn’t considered. In short, more of us will have to embrace an ethos of exploration and learning.

Misunderstood? Try Data Storytelling

Data visualizations help explain complex data, although individuals can and do come to different conclusions. That’s a significant problem for data scientists and data analysts, especially when they’re trying to explain something important to business people.

Part of the problem is one’s ability to communicate. Another problem is expecting too much from data visualizations — specifically, the clear communication of an analytical result.

Data storytelling can help, because it goes beyond data visualizations. It also helps individuals think a bit harder about what the data is saying and why.

Are data visualizations dead?

Clearly not. They remain an extremely important part of turning data into insights, but they do have their limitations. The first limitation is that data visualizations don’t always explain the details of what the data is saying and why. Another limitation, as I mentioned earlier, is the possibility of diverse interpretations and therefore diverse conclusions, which, in a business context, can lead to some rather heated and unpleasant debates.

A simple form of data storytelling is adding text to data visualizations to promote a common understanding. As with PowerPoint, however, it’s entirely possible to add so much text or so many bullets to a data visualization that the outcome is even more confusing than it was without the “improvement.”

The same observation goes for infographics. Bright colors, geometric shapes, and “bleeds” (the absence of a border) do little to aid communication when used ineffectively. It’s important to avoid clutter if you want others to understand an important point quickly.

One complaint I hear about using data visualizations alone is that they lack context. Data storytelling helps provide that context.
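As a minimal sketch of that simplest form, adding text that supplies context, here’s an illustrative chart whose title states the takeaway and whose annotation points at the evidence. The data is invented, and it assumes Python with matplotlib:

```python
# A chart whose title states the takeaway and whose annotation supplies context.
import matplotlib.pyplot as plt

months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
signups = [120, 135, 128, 210, 225, 240]
x = range(len(months))

fig, ax = plt.subplots()
ax.plot(x, signups, marker="o")
ax.set_xticks(x)
ax.set_xticklabels(months)
ax.set_ylabel("Monthly sign-ups")
ax.set_title("Sign-ups jumped after the April launch")  # the story, not "Sign-ups by month"
ax.annotate("April launch", xy=(3, 210), xytext=(0.5, 200),
            arrowprops={"arrowstyle": "->"})
plt.show()
```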

How to tell a good data story

Humans tend to be storytellers naturally, whether they’re explaining how a car accident happened or why they weren’t home at 7:00, again. However, when it comes to telling data stories, it’s easy to forget what an effective story entails.

An effective story has a beginning, a middle, and an end, like a book or a movie. A data story should have those elements, but beware of telling linear stories that are passively consumed. Interactive stories tend to be more effective in this day and age because people have become accustomed to interacting with data at work and at home. In addition, work styles have become more collaborative over time. Allowing audience members to do some of their own exploration enables them to ask more informed, if not challenging, questions. And unlike endings in storytelling generally, data story endings tend not to be definite (e.g., “… triumphant at last, they rode off into the sunset”) but rather possibilities.

Data stories are also vulnerable to the same kinds of flaws that detract from blogs, articles, presentations, and books: typos. Make sure to proof your work. Otherwise, you may lose credibility. Also avoid jargon, not only because it’s considered a bad practice, but because it may confuse at least part of the audience, which brings me to another important point: consider the audience.

Data scientists often are criticized for failing to understand their audiences — namely, business people. It’s fine to talk about linear regressions and sample variances among people who understand what they are, how they work, and why they’re important. A business person’s concern is business impact, not particular forms of statistics.

While you’re at it, be careful about language generally. The word “product” can mean different things to different people who work at the same company. Bottom line: it’s all about making it easier for other people to understand what you intend to communicate.
