Strategic Insights and Clickworthy Content Development

Category: Analytics strategy

HR Use of Social Media Grows, But Is the Data Reliable?

A recent CareerBuilder study of 2,300 hiring managers and human resources professionals shows that more employers are using social media to make hiring and retention decisions.

Drinking, partying and Kim Kardashian-like “break the Internet” posts are clearly unwise for anyone who wants to build a career or keep a job. According to a CareerBuilder press release, 70% of employers use social media to screen candidates. Among employers who use social networking sites as a source of information, 54% have decided not to hire a candidate based on their social media profile. Half of employers check current employees’ social media profiles, and more than a third have reprimanded or fired an employee for inappropriate conduct.

Conversely, 57% are less likely to interview a candidate they can’t find online.

“The majority of employers are looking for information that supports their qualifications on the job [including] a professional persona, and what other people are posting about the candidate,” said Rosemary Haefner, chief human resources officer at CareerBuilder.

Employers want to know how well candidates are able to communicate and whether they exhibit prejudice against persons of a different race, gender or religion. They’re also interested in things candidates have to say about their previous employers, whether they’re lying about their qualifications, and more.

“Post at your own peril,” said Attorney James Goodnow, legal analyst. “Everything you put on Facebook, Instagram and Twitter is fair game for employers and often will have more of an impact on your employment prospects than what you say or do in a job interview. The reason: many employers consider what you post on social media to be the ‘real’ you.”

What if social media forces factual inaccuracies?

LinkedIn is the go-to place to find a person’s professional qualifications and work history, although abbreviated versions of the same information may appear in other social media profiles. What would happen if a person were kicked off one of the networks? Would it matter to the others? What would the person do?

Suppose that Facebook, relying on its famous algorithms, questioned your authenticity after years of account activity. True, there is a grievance process. A person can send personally identifiable documents, hoping to reactivate the account, which reportedly works for some people and not for others. If it doesn’t work, you could try to open another account on the site, but all of the data associated with the original account — email addresses, home town, educational background, and the like — might not be permitted under the new account. You become an unperson in the social media world.

Aside from the potential HR issues, another question is whether such an incident affects a person’s credit score.

Why social media doesn’t score at FICO

FICO looks at thousands of variables, but it tends to use fewer than a hundred when calculating a person’s credit score. Apparently, using more variables leads to diminishing returns.

“Social media does not play into FICO scores in the U.S.,” said Sally Taylor-Shoff, Scores Vice President at FICO. “In the U.S., lenders use FICO scores to make lending decisions. Lending decisions are regulated, so the use of social media data will not meet the compliance requirements most lenders have to deal with.”

Past payment history is the most predictive indicator of whether a person will repay a loan. If the person doesn’t have a loan history, then FICO uses that person’s payment history of rent and cell phone bills, for example.

“We use a six-point test to evaluate whether that data should be used: whether it meets regulatory requirements, whether it has enough depth and breadth, enough scope and consistency in the data, and whether it’s predictive,” said Taylor-Shoff.

Accuracy also matters.

“It can’t be something consumers can just use or manipulate,” said Taylor-Shoff. “Credit data comes from creditors.”

Even though credit data may contain some inaccuracies, lenders are legally required to have a grievance process consumers can pursue, and there’s no shortage of consumer-protection information about what an aggrieved consumer can do.

In the social media world, bot decisions may be final, and there’s not necessarily a lot of transparency.

Why Advanced Analytics Is in Your Future

Basic reporting and analytics are now competitive table stakes across industries. As 2020 approaches, more companies are using sophisticated algorithms to drive higher levels of efficiency, reduce costs and risks, drive additional revenue, improve customer experience and more. If organizations want to become truly agile in today’s dynamic business environment, they have to continually improve their operations and evolve the ways they’re using analytics.

“If you’re not using advanced analytics yet, you’re in trouble,” said Bill Franks, chief analytics officer at the International Institute for Analytics (IIA). “Twenty years ago, if you were doing some type of analytics you had competitive advantage. Now if you’re not doing analytics, you’re falling behind. If companies don’t push to adopt the new stuff, it’s going to become a problem over time.”

What Is Advanced Analytics?

Advanced analytics, like data science, lacks a standard definition, but it characteristically involves prediction. Deep learning, neural networks, cognitive computing, and AI come to mind because those algorithms have capabilities traditional input/output systems just can’t provide.

“What’s commercially possible to do has expanded significantly,” said Chris Mazzei, chief analytics officer at professional services company EY. “Decreasing technology costs and the explosion of data changes what’s possible to do with analytics, and [the possibilities] are growing every year. That, combined with competitive pressures means if you’re not looking for ways to reduce costs, enhance customer experience, create new products and services, if you don’t want to manage risks radically different and better, you’re in trouble.”

Most companies start with basic analytics and then increase the level of sophistication as they begin to realize the limitations of their existing systems. Disruptors are an exception because they use advanced analytics early on in an attempt to outthink and outmaneuver the existing players.

Whether your company is trying to compete more effectively or just stay relevant, advanced analytics is in your future, sooner or later. The question is whether your company will lead or follow. Either way, now is the time to learn all you can about advanced analytics so you understand what benefits it can drive for your company.

Even Small Businesses Should Care

Not so long ago, only large companies could afford the tools and specialists necessary to take advantage of advanced analytics. However, as more capabilities are made available through cloud-based services and more of the complexity is abstracted away, more businesses are able to take advantage of advanced analytics without spending millions of dollars and hiring data scientists.

For example, lawn care aggregator site LawnStarter started using prescriptive analytics about two years after the founders defined the business concept. The initial goal was to decrease customer churn.

“We have a customer risk model and a provider risk [model],” said Ryan Farley, co-founder of LawnStarter. “We have thousands of lawn care providers in our system and the number of jobs they have ranges from tens to hundreds. Sometimes they take on too much. Before we had predictive analytics, we had to wait for the problem to become obvious.” Now LawnStarter is able to operate in a proactive way rather than a reactive way.

In all fairness, Farley wasn’t a typical entrepreneur. Previously, he worked for Capital One, which has been using predictive analytics since the 1990s to improve the ROI of its direct mail campaigns. When LawnStarter was founded, the founders wanted to do “cool stuff” rather than follow the traditional method of starting a company, building a product, and writing code. Fortunately, LawnStarter and machine learning platform provider DataRobot were part of the same Techstars accelerator program, so LawnStarter became one of DataRobot’s beta customers.

“We were like, ‘This is so cool! There’s predictive capabilities in our data sets!’” said Farley. “We started out doing it for fun, but then we realized there was actually business to be had there. Shortly thereafter, we started investing in the data infrastructure to where we can compile our different data together and make sure everything we’re collecting is consistent and accurate.”

Analytics Ensure Safety in LA and White Plains

Security is top of mind when city CIOs think about the types of analytics they need. However, analytics is also enabling them to improve internal processes and the experience citizens and businesses have.

The City of White Plains, New York, stores its data in a data center to ensure security. The City of Los Angeles has a hybrid implementation because it requires cloud-level scalability. In LA, 240 million records from 37 different departments are ingested every 24 hours just for cybersecurity purposes, according to the city’s CIO, Ted Ross.

“We didn’t start off at that scale but [using the cloud] we’re able to perform large amounts of data analysis whether it’s cybersecurity or otherwise,” Ross said.

He thinks it’s extremely important that organizations understand their architecture, where the data is, and how data gets there and then put the appropriate security measures in place so they can leverage the benefits of the cloud without being susceptible to security risks.

“If you’re not doing analytics and you’re moving [to the cloud], it’s easy to think it will change your world and in certain [regards] it may. The reality is, you have to go into it with both eyes open and understand what you’re trying to accomplish and have realistic expectations about what you can pursue,” said Ross.

White Plains is on a multi-year journey with its analytics, as are its peers, because connecting the dots is a non-trivial undertaking.

“Municipalities have a lot of data, but they move slowly,” said White Plains CIO Michael Coakley. “We have a lot of data and we are trying to get to some of the analytics [that make sense for a city].”

Departments within municipalities still tend to operate in silos. The challenge is eliminating those barriers so data can be used more effectively.

“It’s getting better. It’s something we’ve been working on for the last few years, which is knocking down the walls, breaking down the silos and being able to leverage the data,” said Coakley. “It’s for the betterment of citizens and businesses.”

Connecting data from individual departments improves business process efficiencies and alleviates some of the frustrations citizens and businesses have had in the past.

“If you’re a small business owner who bought a plot of land in White Plains and wants to [erect] a building, you could go to the Department of Public Works to get a permit, the Building Department to get a permit and the Planning Department to get a permit and none of those departments know what you’re talking about,” said Coakley. “With the walls being broken down and each department being able to use the data, it makes the experience better for the business or home owner.”

The city is also connecting some of its data sets with data sets of an authority that operates within the city, but is not actually part of the city.

“There’s a reason for their autonomy, but it’s important to start the dialog and show them [how connecting the data sets] will benefit them,” said Coakley. “Once you show the department what they can provide for you, and ensure it’s not going to compromise the integrity of their data, they usually come along. They see the efficiencies it creates and the opportunities it creates.”

In those discussions, it becomes more obvious what kind of data can be generated when the data sets are used and shared and what kind of analytics can be done. The interconnection of the data sets creates the opportunity to get insights that were not previously possible or practical when the data generated in a department stayed in that department.

White Plains is trying to connect data from all of its departments so it can facilitate more types of analytics and further improve the services it provides citizens and businesses. However, cybersecurity analytics remain at the top of the list.

“Cybersecurity is number one,” said Coakley. “We have to worry about things like public safety, which is not just police, fire, and emergency, but also public works, facilities, water, electrical, and engineering. There’s a lot of data and the potential for a lot of threats.”

Sports Teams Embrace Analytics and the IoT

In my last sports-related post, I explained how the National Hockey League (NHL) is using IoT devices to provide the league with deeper insights about the players and the game while immersing fans in new stats and spectator experiences. The NHL is not alone. In fact, IoT devices are finding their way into all kinds of sports for the benefit of leagues, players, and fans.

For example, the National Football League has been placing sensors in players’ shoulder pads to track their location, speed, and distance traveled. Last year, it experimented with sensors in footballs to track their motion, including acceleration, distance, and velocity. That data is being sold to broadcast partners.
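To make the mechanics concrete, here is a minimal Python sketch of how timestamped position samples from a wearable sensor could be turned into distance and top-speed stats. The sample values are invented for illustration; this is not the NFL’s actual pipeline.

```python
# Minimal sketch (invented data; not the NFL's actual pipeline): deriving
# distance traveled and top speed from timestamped position samples.
from math import hypot

# (timestamp_seconds, x_meters, y_meters) samples for one player
samples = [
    (0.0, 10.0, 5.0),
    (0.5, 12.0, 5.5),
    (1.0, 14.5, 6.5),
    (1.5, 17.5, 8.0),
]

total_distance = 0.0
top_speed = 0.0
for (t0, x0, y0), (t1, x1, y1) in zip(samples, samples[1:]):
    step = hypot(x1 - x0, y1 - y0)                 # meters covered this interval
    total_distance += step
    top_speed = max(top_speed, step / (t1 - t0))   # meters per second

print(f"distance: {total_distance:.1f} m, top speed: {top_speed:.1f} m/s")
```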

Meanwhile, young football players who hope to play the game professionally are tracking themselves, hoping to become more attractive recruiting targets.

NBA Teams Score with Insights

The Golden State Warriors and Miami Heat are getting some interesting data from wearables and other sensors that track the movement of players and the basketball used in a game. Now it’s possible to analyze how players shoot, how high they jump, and the speed at which the ball travels, among other things. One thing that trips me up is how some of that data is visualized for the coach.

Picture this: a player clips a device to his shorts or wears it on his wrist so his coach can see the ball’s trajectory and statistics about the player’s movements on a cell phone. The new insights help coaches and their teams better understand the dynamics of the sport, but I wonder how practical basketball-by-smartphone-app is, given the speed at which the game is played.

Sensors placed on the players and in the basketball also provide information about players’ movements on the court over time. The visualization looks like a plate of spaghetti, but within it are patterns that reveal players’ habits, such as the area of the court a player tends to favor.

Beyond Moneyball

Former Oakland A’s general manager Billy Beane is often considered the father of sports analytics because he changed the makeup of the team, and how it played the sport, based on what the numbers said. His approach is commonly known as “Moneyball” (thanks to the book and movie); the earlier Oakland precedent, manager Billy Martin’s 1981 club, was dubbed “Billyball.”

One interesting insight involved on-base percentage: a player who consistently gets on base, often by drawing a walk rather than striking out, contributes more to winning than traditional statistics suggested.

However, Oakland’s early experimentation also demonstrated that numbers aren’t everything. Martin was fired after the 1982 season amid criticism that he had overworked his pitchers. Stated another way, the stellar turnaround year was not followed by a similarly strong one.

These days, sensors are enabling Major League Baseball (MLB) statistics 2.0. For example, sensors in baseball bats provide insights about the speed and motion of a swing and the point of impact when a ball hits the bat. In the dugouts, coaches and players can get access to all kinds of insights via an iPad during the game. The insights enable them to fine-tune the way they play against the opposing team. It’s also possible to track the movements of a specific player.

NHL and Fans Score with Predictive Analytics

If you’re a hockey fan, you’ve probably noticed that the statistics are more comprehensive than they once were. That’s not happening by accident.

The National Hockey League (NHL) uses predictive analytics to learn more about fans, improve its direct marketing efforts, track players’ performance on the ice, and improve fan engagement.

Making an IoT Play

During the 2015 All-Star Game, sensors were embedded inside pucks and in players’ jersey collars, providing insight into where the puck and players were, how fast they were moving, the puck’s trajectory, players’ time on ice, and more.

The information was used during replays to better explain how a particular outcome came about. Fans were able to visualize the paths players and pucks had taken, giving them more insight into players’ performance. Experimentation continued at the World Cup of Hockey 2016 with substantially the same approach: tracking pucks and players.

The key to winning a hockey game is puck possession: if Team A possesses the puck longer than Team B, Team A will tend to score more goals over time.

The information derived from the devices, particularly the jerseys, can be used for training purposes and to minimize injuries.

A Data Scientist Predicted Winners and Losers

A couple of years ago, the NHL worked with a data scientist who reviewed historical data including player statistics and team statistics over several seasons. When he crunched the data, he found that there are certain statistics and factors that, over time, can help predict team performance on the ice, especially in the playoffs.

Thirty-seven different factors were weighted and applied to the 16 teams that started the playoffs in April 2015. The goal was to predict how the playoff teams would fare against each other and, as the rounds progressed, how they would perform in new matchups.
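The NHL hasn’t published the model, but the general shape of a weighted factor score can be sketched in a few lines of Python. The factor names, weights, and team values below are invented for illustration only.

```python
# Illustrative weighted factor score (invented factors, weights, and values;
# the actual model's 37 factors and weights are not public).
factors = {"goal_differential": 0.40, "save_pct": 0.35, "possession_pct": 0.25}

teams = {
    "Team A": {"goal_differential": 0.8, "save_pct": 0.6, "possession_pct": 0.7},
    "Team B": {"goal_differential": 0.5, "save_pct": 0.9, "possession_pct": 0.6},
}

def score(stats):
    # Weighted sum of already-normalized factor values (0 to 1).
    return sum(weight * stats[name] for name, weight in factors.items())

# Rank the hypothetical field by composite score, strongest first.
for team, stats in sorted(teams.items(), key=lambda kv: score(kv[1]), reverse=True):
    print(team, round(score(stats), 3))
```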

The results were very interesting. The data scientist was able to predict at the start of the season that the Chicago Blackhawks would win the Stanley Cup. He also was able to predict which team would win each playoff game, most of the time.

“What’s interesting about that is our sport is a pretty unpredictable sport,” said Chris Foster, director of Digital Business Development at the National Hockey League, in an interview. “The action is so fast, goals happen rather infrequently and a lot of it has to do with a puck bouncing here or a save there. It’s very fast action that is sometimes hard to predict, but it just shows that when data is properly analyzed and really smart models are put around it, predictive analytics can tell you a lot about how a team is going to perform.”

How Cybersecurity Analytics Are Evolving

As the war between the black hats and white hats continues to escalate, cybersecurity necessarily evolves. In the past, black hats were rogue individuals. Now they’re hacktivists, crime groups, and hackers backed by nation-states.

“Hackers have gotten a lot more sophisticated,” said Sanjay Goel, a professor in the School of Business at the University at Albany. “It used to be they’d break into networks, do some damage, and get out. Now they have persistent attacks and targeted execution.”

Hackers are automating attacks to constantly search for vulnerabilities in networks. Meanwhile, fraudulent communications are getting so sophisticated, they’re fooling even security-aware individuals. Analytics can help, but nothing is a silver bullet.

Moats Are Outdated

Organizations used to set up perimeter security to keep hackers from breaching their networks. Since that didn’t work, firewalls were supplemented with other mechanisms, such as intrusion detection systems that alert security professionals to a breach and honeypots that lure hackers into a place where they can be monitored and prevented from causing damage.

Those tools are still useful, but they have necessarily been supplemented with other methods and tools to counter new and more frequent attacks. Collectively, these systems monitor networks, traffic, user behavior, access rights, and data assets, albeit at a grander scale than before, which has necessitated considerable automation. When a matter needs to be escalated to a human, analytical results are sent in the form of alerts, dashboards, and visualization capabilities.

“We really need to get away from depending on a security analyst that’s supposed to be watching a dashboard and get more into having fully-automated systems that take you right to remediation. You want to put your human resources at the end of the trail,” said Dave Trader, chief security officer at IT services company GalaxE.Solutions.

Predictive analytics analyzes behavior that indicates threats, vulnerabilities, and fraud. Slowly but surely, cybersecurity budgets, analytics, and mindsets are shifting from prevention to detection and remediation because enterprises need to assume that their networks have already been breached.

“All the hackers I know are counting on you not taking that remedial step, so when there’s a vulnerability and it’s a zero-day attack, the aggregators or correlators will catch it and then it will go into a ticket system, so it’s three to four days before the issue is addressed,” said Trader. “In the three to four days, the hackers have everything they need.”

Why Break In When You Can Walk In?

Fraudsters are bypassing traditional hacking by convincing someone to turn over their user ID and password or other sensitive information. Phishing has become commonplace because it’s effective. The emails are better crafted now, so they’re more believable and therefore more dangerous. Even more insidious is spear phishing, which targets a particular person and appears to come from a person or organization the target knows.

Social engineering also targets a specific person, often on a social network or in a real-world environment. Its purpose is to gain the target’s trust, and walk away with the virtual keys to a company’s network or specific data assets. Some wrongdoers are littering parking lots with thumb drives that contain malware.

Behavioral analytics can help identify and mitigate the damage caused by phishing and social engineering by comparing an authorized user’s established behavior on the network with the behavior of whoever is currently using the account.
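As a minimal sketch of the idea (the login-hour data and threshold are made up, and real products use far richer signals), behavioral analytics boils down to flagging activity that deviates sharply from a user’s established baseline:

```python
# Minimal sketch of behavioral anomaly detection (made-up data): flag
# activity that deviates sharply from a user's historical baseline.
from statistics import mean, stdev

# Hours of the day at which this user has historically logged in.
baseline_login_hours = [8, 9, 8, 10, 9, 8, 9, 10, 9, 8]

def is_anomalous(login_hour, history, threshold=3.0):
    """True if the login hour is more than `threshold` standard deviations
    from the user's historical mean."""
    mu, sigma = mean(history), stdev(history)
    return abs(login_hour - mu) > threshold * sigma

print(is_anomalous(9, baseline_login_hours))   # typical working-hours login -> False
print(is_anomalous(3, baseline_login_hours))   # 3 a.m. login -> True, worth a look
```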

Bottom Line

Breaches are bound to happen. The question is whether companies are prepared for them, which means keeping security systems up to date and training employees.

Far too many companies think that hacking is something that happens to other organizations so they don’t allocate the budget and resources they need to effectively manage risks. Hackers love that.

Self-Service Analytics Are A Necessity

Lines of business are buying their own analytics solutions because IT is unable to deliver what they need fast enough. If the company has a data team, lines of business can ask for help, but like IT, the data team faces more problems than there are people to solve them.

Smart IT organizations are building a foundation with governance built in. In that way, business users can get access to the data and analytics they need while the company’s assets are protected.

“IT has become more of a facilitator,” said Bob Laurent, VP of product marketing for self-service analytics platform provider Alteryx.  “If they’re able to give people access to data with the proper guardrails, then they’re out of the business of having to do mundane reports week in and week out.”

The shift to self-service analytics is happening across industries because organizations are under pressure to do more with their data and do it faster.

Meanwhile, average consumers have come to expect basic self-service analytics from their banks, insurance companies, brokerage firms, credit card companies, apps, and IoT devices. For an increasing number of businesses, self-service analytics is a necessity.

Higher Education Improves Performance

Colleges and universities are using self-service analytics to improve admission rates, enrollment rates, and more.

As an example, the Association of Schools and Programs of Public Health (ASPPH) built a system that allows its members to upload admissions data, graduate data, salary data, and financial data, as well as information about their grants and research contracts. ASPPH verifies and validates the information and then makes the data available via dashboards that can be used for analysis.

“We needed to give them a place to enter their data so they weren’t burdened with reporting which they have to do every year,” said Emily Burke, manager, data analytics at ASPPH.

More than 100 schools and programs of public health are using the system to analyze their data, monitor trends, and compare themselves to peers. They’re also using the system for strategic planning purposes.

“A university will log in and see [their] university’s information and create a peer group that’s just above them in rankings. That way, they can see what marks they need to hit,” said Burke.  “A lot of them are doing that geographically, such as what the application numbers look like in Georgia.”
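A rough Python sketch of that kind of peer-group comparison (the school names, rankings, and application counts are invented, not ASPPH’s actual data model):

```python
# Rough sketch of a peer-group comparison (invented data, not ASPPH's schema):
# benchmark one school against the schools ranked just above it.
import pandas as pd

schools = pd.DataFrame({
    "school": ["A", "B", "C", "D", "E"],
    "rank": [12, 9, 7, 5, 2],          # lower rank number = higher ranking
    "applications": [850, 1020, 1340, 1600, 2400],
})

me = schools[schools["school"] == "C"].iloc[0]

# Peer group: the two schools ranked immediately above mine.
peers = schools[schools["rank"] < me["rank"]].nlargest(2, "rank")

print("My applications:", me["applications"])
print("Peer benchmark (mean):", peers["applications"].mean())
```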

Drive Value from Self-Service Analytics

The value of self-service analytics is measured by two things: the number of active users and the business value it provides an organization. Knowing that, a number of vendors now offer SaaS products that are easy to use and don’t require a lot of training.

ASPPH built its own system in 2012. At the time, Burke and her team were primarily focused on the system’s functionality, but it soon became obvious that usability mattered greatly.

“We built this wonderful tool, we purchased the software we needed, we purchased a Tableau server, and then realized that our members really didn’t know how to use it,” said Burke.

Deriving the most value from the system has been a journey for ASPPH, which Burke will explain in detail at Interop during her session, Data-Driven Decision Making: Empowering Users and Building a Culture of Data, in Las Vegas on Thursday, May 18.

If you’re implementing self-service analytics or thinking about it, you’ll be able to see a demonstration of the ASPPH system, hear Burke’s first-hand experiences, and walk away with practical ideas for empowering your users.

Why Embedded Analytics Will Change Everything

Analytics are being embedded in all kinds of software. As a result, the ecosystem is changing, and with it so is our relationship to analytics. Historically, analytics and BI have been treated as something separate — we “do” analytics, we’re “doing” ad hoc reporting — but increasingly, analytics are becoming an integral part of software experiences, from online shopping to smart watches to enterprise applications.

“We’re creating whole industries that are centered around data and analytics that are going to challenge the status quo of every industry,” said Goutham Belliappa, Big Data and Analytics practice leader for Capgemini North America. “Analytics will become so ubiquitous, we won’t even notice it. From a business perspective, it’s going to transform entire industries.”

Three drivers are collectively changing how we experience and think about analytics. The first, as previously mentioned, is embedding analytics into all kinds of software. The second is automation, and the third is a shift in the way software is built.

Automation is Fuel

Modern software generates and analyzes more data than ever, and the trend is going to accelerate. The resulting glut of data is outpacing humans’ ability to manage and analyze it, so some analytics necessarily have to be automated, as do some decisions. As a result, analytics has become invisible in some contexts, and it’s going to become invisible in still more contexts soon.

“‘Frictionless’ is a good way to describe what people are striving for in effective user experiences. Certainly, with more automation and more behind-the-scenes analytics, how we think of analytics will change,” said Gene Leganza, VP and research director at Forrester Research. “We’ll be thinking about the results — do we like the recommendations of this site’s or this app’s recommendation engine or is that one better? We’ll gravitate toward the services that just work better for us without knowing how they do it.”

That’s not to say that automated analytics should be implemented as black boxes. While humans will apply less conscious thought to analytics because they are embedded, they will still want to understand how decisions were made, especially as those decisions increase in importance, Leganza said. Successful software will not just automate data management and analytics and choose the right combination of microservices to achieve a particular result; it will also be able to explain its path on demand.

Microservices Will Have an Impact

Software development practices are evolving, and so is the software that’s being built. In the last decade, monolithic enterprise applications have been broken down into smaller pieces that are now offered as SaaS solutions. Functionality is continuing to become more modular with microservices, which are specific pieces of functionality designed to achieve a particular goal. Because microservices are essentially building blocks, they can be combined in different ways, which affects analytics and vice versa.
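As a rough illustration of that building-block idea (the service names are invented, and standalone services are modeled here as plain Python functions), small single-purpose pieces can be composed into different pipelines, with an analytics hook observing each step:

```python
# Illustration only: single-purpose "services" (modeled as plain functions)
# composed into a pipeline, with an analytics hook observing each step.
from typing import Any, Callable

def clean(order: dict) -> dict:
    return {**order, "sku": order["sku"].strip().upper()}

def price(order: dict) -> dict:
    return {**order, "total": order["qty"] * 9.99}

def log_metric(step: str, payload: Any) -> None:
    print(f"analytics: {step} -> {payload}")   # stand-in for an event stream

def compose(*steps: Callable[[dict], dict]) -> Callable[[dict], dict]:
    def pipeline(order: dict) -> dict:
        for step in steps:
            order = step(order)
            log_metric(step.__name__, order)
        return order
    return pipeline

checkout = compose(clean, price)   # one combination of the building blocks
print(checkout({"sku": " ab-1 ", "qty": 3}))
```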

Tableau has embraced microservices so its customers can combine B2B tools in a seamless way.  For example, Tableau is now embedded in Salesforce.com, so a sales rep can get insights about a customer as well as the customer details that were already stored in Salesforce.com.

“The more embedded you get, APIs and developer extensions become more relevant because you need more programmability to make [analytics] more invisible, to be seamless, to be part of a core application even though it comes from somewhere else,” said Francois Ajenstat, chief product officer at Tableau.

Software continues to become more modular because modularity provides flexibility. As the pace of business accelerates, software has to be able to adapt to changing circumstances quickly and without unnecessary overhead.

“In order to automate more and more actions and to enable adapting to a myriad of conditions, we’ll be having software dynamically cobble together microservices as needed.  The granularity of the services will have to be synced to the patterns in the data.  For the near future, the task will be to make the software flexible enough to adapt to the major patterns we’re seeing,” said Forrester’s Leganza.

What Healthcare Analytics Can Teach The Rest of Us

Healthcare analytics is evolving rapidly. In addition to the data in traditional business intelligence solutions, there is data flowing from hospital equipment, medical-grade wearables, and Fitbits.

The business-related data and patient-related data, sometimes combined with outside data, enable hospitals to triage emergency care patients and treat patients more effectively, which is important. For example, in the U.S., Medicare and Medicaid are embracing “value-based care,” which means hospitals are now compensated for positive outcomes rather than for the number of services they provide, and they’re docked for “excessive” readmissions. Similarly, doctors are increasingly being compensated for positive outcomes rather than the number of patients they see. In other words, analytics is more necessary than ever.

There are a couple of things the rest of us can learn from what’s happening in the healthcare space, and there are some surprises that may interest you. The main message is to learn how to adapt to change, because change is inevitable. So is the rise of machine intelligence.

The Effect of the IoT

Medical devices are becoming connected devices in operating rooms and hospital rooms. Meanwhile, pharmaceutical companies are beginning to connect products such as inhalers to get better insight into a drug’s actual use and effects, and they’re experimenting with medical-grade (and typically application-specific) devices in clinical trials to reduce costs and hassles while getting better insights into a patient’s physical status. Surgeons are combining analytics, sometimes with telemedicine, to improve the results of surgical procedures. Slowly but surely, analytics are seeping into just about every aspect of healthcare to lower costs, increase efficiencies, and reduce patient risks in a more systematic way.

One might think devices such as FitBits are an important part of the ecosystem, and from a consumer perspective they are. Doctors are less interested in that data because it’s unreliable, however. After all, it’s one thing for a smartwatch to err in monitoring a person’s heart rate. For a medical-grade device, faulty monitoring could lead to a heart attack and litigation. At this point, doctors are more interested in the fact that someone wears a FitBit because it indicates health consciousness.

Not surprisingly, predictive analytics is important because mitigating or avoiding healthcare-related episodes is preferable to dealing with an episode after the fact. From an IoT perspective, there is a parallel here with equipment and capital asset management. One way to reduce the risk of equipment failure is to compare the performance of a piece of equipment operating in real-world conditions against a virtual representation that is operating normally under the same conditions. Similarly, patient “signatures” will make it easier to spot complications earlier, such as weight gain, which is an indicator of congestive heart disease, or fluid retention, which may indicate the likelihood of a heart attack. Imagine if the same predictive concept were applied in your industry or business.
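In Python, the “compare actuals to an expected baseline” idea is only a few lines; the readings and the alert threshold below are invented, not drawn from any device or clinical guideline.

```python
# Minimal sketch of comparing observed readings to an expected baseline
# (invented numbers; not from any real device or clinical guideline).
expected_daily_weight_kg = [82.0, 82.1, 82.0, 82.2, 82.1]   # patient baseline
observed_daily_weight_kg = [82.1, 82.4, 83.0, 83.8, 84.5]   # recent readings

ALERT_THRESHOLD_KG = 1.5   # cumulative deviation that should trigger review

deviation = sum(obs - exp for obs, exp in zip(observed_daily_weight_kg,
                                              expected_daily_weight_kg))
if deviation > ALERT_THRESHOLD_KG:
    print(f"flag for review: {deviation:.1f} kg above baseline this week")
```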

Machine Intelligence Reveals Insights

“Insights” is an oft-misused term. It is related to analytics but not synonymous with it. Insight means new knowledge, and that is precisely why machine learning is gaining traction in the healthcare space. Traditional medical research has been hypothesis-driven. Machine learning doesn’t necessarily start with a theory or carry the same veil of human biases.

Take diabetes research, for example: a machine learning-based research project found that a person who has been hospitalized is at greater risk of subsequent hospitalization, which comes as no surprise to doctors. However, several more interesting factors were unearthed, the biggest surprise of which was flu shots. It turns out that diabetics who did not get flu shots were more likely to be hospitalized and that, following hospitalization, their health became unstable or unmanageable.
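A minimal sketch of how that kind of unexpected factor can surface: fit a model on (here, entirely synthetic) patient features and inspect which ones drive its predictions. The data and feature names are fabricated; this is not the cited study’s methodology.

```python
# Fabricated data and features; not the cited study's methodology.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n = 500
prior_hospitalization = rng.integers(0, 2, n)
flu_shot = rng.integers(0, 2, n)
age_over_65 = rng.integers(0, 2, n)

# Synthetic outcome: prior hospitalization and skipping the flu shot
# both raise the simulated readmission risk.
risk = 0.15 + 0.30 * prior_hospitalization + 0.20 * (1 - flu_shot)
readmitted = rng.random(n) < risk

X = np.column_stack([prior_hospitalization, flu_shot, age_over_65])
model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, readmitted)

for name, importance in zip(["prior_hospitalization", "flu_shot", "age_over_65"],
                            model.feature_importances_):
    print(f"{name}: {importance:.2f}")
```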

The lesson here is one of adaptation: as machine learning becomes mainstream, more of us will have to get comfortable with insights we hadn’t anticipated. In a business context, machine learning may reveal opportunities, risks, or competitive forces you hadn’t considered. In short, more of us will have to embrace an ethos of exploration and learning.

Common Biases That Skew Analytics

How do you know if you can trust analytical outcomes? Do you know where the data came from? Is the quality appropriate for the use case? Was the right data used? Have you considered the potential sources and effects of bias?

All of these issues matter, and one of the most insidious of them is bias, because the source and effects of bias aren’t always obvious. Sadly, there are more types of bias than I can cover in this blog, but the following are a few common ones.

Selection bias

Vendor research studies are a good example of selection bias because several types of bias may be involved.

Think about it: Whom do they survey? Their customers. What are the questions? The questions are crafted and selected based on their ability to prove a point. If the survey reveals a data point or trend that does not advance the company agenda, that data point or trend will likely be removed.

Data can similarly be cherry-picked for an analysis. Different algorithms and different models can be applied to data, so selection bias can happen there. Finally, when the results are presented to business leaders, some information may be supplemented or withheld, depending on the objective.
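A toy Python simulation (entirely made-up numbers) shows the mechanism: surveying only your own customers inflates the result relative to the full market.

```python
# Toy simulation of selection bias (made-up numbers): surveying only
# existing customers inflates satisfaction relative to the full market.
import random

random.seed(1)
market = [random.gauss(6.0, 2.0) for _ in range(10_000)]   # satisfaction scores

# People who became customers skew toward those already happy with the product.
customers = [score for score in market if score > 5.5]

print(f"full market average:   {sum(market) / len(market):.2f}")
print(f"customer-only average: {sum(customers) / len(customers):.2f}")
```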

This type of bias, when intentional, is commonly used to persuade or deceive. Not surprisingly, it can also undermine trust. What’s less obvious is that selection bias sometimes occurs unintentionally.

Confirmation bias

A sound analysis starts with a hypothesis, but never mind that. I want the data to prove I’m right.

Let’s say I’m convinced that bots are going to replace doctors in the next 10 years. I’ve gathered lots of research that demonstrates the inefficiencies of doctors and the healthcare system. I have testimonials from several futurists and technology leaders. Not enough? Fine. I’ll torture as much data as necessary until I can prove my point.

As you can see, selection bias and confirmation bias go hand-in-hand.

Outliers

Outliers are values that deviate significantly from the norm. When they’re included in an analysis, the analysis tends to be skewed.

People who don’t understand statistics are probably more likely to include outliers in their analysis because they don’t understand their effect. For example, to get an average value, just add up all the values and divide by the number of individuals being analyzed (whether that’s people, products sold, or whatever). And voila! End of story. Except it isn’t…

What if 9 people spent $100 at your store in a year, and the 10th spent $10,000? You could say that your average customer spends $1,090 per year. According to simple math, the calculation is correct. However, it would likely be unwise to use that number for financial forecasting purposes.
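Here is that arithmetic in a few lines of Python, along with two common ways to blunt the outlier’s pull (the median and a trimmed mean):

```python
# The $1,090 "average customer" from above, plus two more robust alternatives.
from statistics import mean, median

spend = [100] * 9 + [10_000]        # nine $100 customers and one big spender

print(mean(spend))                  # 1090.0 -- pulled up by the outlier
print(median(spend))                # 100.0  -- what the typical customer spends
print(mean(sorted(spend)[1:-1]))    # 100.0  -- trimmed mean (drop min and max)
```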

Outliers aren’t “bad” per se; they are critical for use cases such as cybersecurity and fraud prevention. You just have to be careful about the effect outliers may have on your analysis. If you blindly remove outliers from a dataset without understanding them, you may miss an important indicator or the beginning of an important trend, such as an equipment failure or a disease outbreak.

Simpson’s Paradox

Simpson’s Paradox drives another important point home: validate your analysis. When Simpson’s Paradox occurs, trends at one level of aggregation may reverse themselves at different levels of aggregation. Stated another way, datasets may tell one story, but when you combine them, they may tell the opposite story.

A famous example is a lawsuit filed against the University of California, Berkeley. At the aggregate level, the data appeared to “prove” that men were admitted at a higher rate than women. At the departmental level, the reverse proved true in many cases.
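The numbers below are invented to show the mechanism rather than the actual Berkeley figures: each department admits women at a higher rate, yet the aggregate rate for women is lower because more women applied to the harder-to-enter department.

```python
# Invented numbers illustrating Simpson's Paradox (not the actual Berkeley data).
# Format: department -> group -> (applicants, admitted).
admissions = {
    "Dept A": {"men": (800, 480), "women": (100, 70)},    # easier to get into
    "Dept B": {"men": (200, 20),  "women": (900, 180)},   # highly competitive
}

def rate(applicants, admitted):
    return admitted / applicants

# Per-department rates: women are admitted at a higher rate in both.
for dept, groups in admissions.items():
    print(dept, {g: f"{rate(*counts):.0%}" for g, counts in groups.items()})

# Aggregate rates: the trend reverses once the departments are combined.
overall = {}
for group in ("men", "women"):
    applicants = sum(d[group][0] for d in admissions.values())
    admitted = sum(d[group][1] for d in admissions.values())
    overall[group] = f"{rate(applicants, admitted):.0%}"
print("Overall:", overall)
```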
