Last week, Pivotal’s COO Bill Cook spent some time talking with the Wall Street Journal’s Clint Boulton about why big data is not producing value for most businesses. The conversation was spurred by a recent report from Bain & Co. finding that only 16 out of 400 businesses are seeing business value from big data today. Those 16 companies represent just 4% of the total surveyed, which casts doubt on predictions that big data is the biggest growth area in IT today.
While the number of companies extracting benefits today may be small, advancements in big data technologies, from speed to scale to cost, are about to swing those numbers higher and live up to forecasts that the big data technology and services market is growing at an incredible 31.7%, or about 7x faster than all of IT.
In the interview, Cook gives us the first clue on what has to change, zeroing in on how companies need to start thinking of big data differently to gain strategic advantages:
“By definition, most enterprises are not there,” Mr. Cook said in an interview at CIO Journal’s offices Wednesday. The traditional software and processes most businesses operate today are not built to accrue value from analytics, he explained. Instead IT staffs must extract data from business applications and analyze it in a separate analytics application. This is a time-consuming process in which data sets become outdated as new data is created.
Mr. Cook said companies must organize all of their business data, figure out what information to retain based on the business challenges they wish to solve, and build software that marries data and analytics. Essentially, companies must make analytics a feature — not a separate function — of the business software, Mr. Cook said. And it should be self-service, meaning business analysts can access the analytics data without asking IT to build them a report. Then businesses can begin thinking about how to use the data to gain competitive advantage. “Analytics is the algorithm by how you interact with your customers,” he said.
Cook suggests companies need to start building analytics into the DNA of business processes, and he is absolutely right.
Still, for many companies, big data projects are too complex or too costly to justify. I submit that this is only because companies first need to think differently about big data analytics in order to justify new IT projects.
To start, let’s look closer at Cook’s comment about how quickly data sets become outdated. If data is allowed to get stale, or you need to wait for another group to run your report, the entire process of using the data stalls and the window for timely action closes. This has likely happened in your business already: the data is not readily accessible or relevant when it is needed, so it hardly gets used.
By placing big data within reach of the entire organization in real time, you open up your business processes to change. To illustrate, think about old-school product development processes. Back when I was doing product management work at Siebel Systems, we went through a formal Marketing Requirement Document (MRD) process once every 9-12 months. This process lasted months and was planned well in advance. We would outline our marketing research requirements and take months to get the analysis compiled, often hiring it out to market research firms, so we could rationalize customer requirements against engineering budgets. Understanding how users used features required focus groups and extensive UE testing. The process was long and required a lot of work to do properly.
Today, 3 months of planning what you are going to do product-wise will sink you. It defies all the tenets of agile development, where processes demand you deliver results in less than half that time, not a plan to have results.
To keep pace in today’s market, we simply need to compress decision time and focus on action. Big data analytics unlocks insights previously hidden in data that was too costly to process. With big data on their side, organizations are able to identify correlations and causal relationships, classify and predict events, identify patterns and anomalies, and infer probabilities, interest, and sentiment in order to establish priorities more quickly. Having data available in a reasonable timeframe compresses the time and improves the accuracy of market research, promoting an investigative approach to data in which follow-on questions can be answered just as quickly as the initial questions, and decisions can be made faster. Imagine these scenarios:
- Maintenance workers. With data connected to an Industrial Internet, as GE is setting out to do, maintenance workers servicing jet engines, wind turbines, trains, or other manufacturing equipment can shift from a schedule of proactive maintenance and reactive care to predictive maintenance. Instead of doing regular tuning and inspections and hoping to catch problems, they can rely on the system to identify the actual work to be done that day, eliminating time wasted on redundant maintenance activities and catching the important issues before a failure happens.
- Product management. Look to big data pioneers like Facebook to see how they’ve created a system within their own product to provide direct user feedback. The entire process of focus groups and market research is negated. Instead, they’ve instrumented the Facebook application to track a large number of signals generated from a user’s actions and those of their friends. Armed with this information, Facebook uses its entire install base as a focus group, which has allowed it to quickly evolve a highly personalized user experience for over a billion users while also creating a new kind of advertising business. This information is baked into every product decision and has helped Facebook intelligently build a product roadmap that satisfies arguably the biggest install base on the planet.
- Customer care. As Pivotal’s Annika Jimenez described at the Strata Conference last week, big data can provide critical insights that help you retain your install base. In her example, big data can be used to “enable systematic response to user-level likelihood to churn. If your call topic triggers an increase in likelihood to churn, let’s send an email, let’s offer a discount, let’s remove some charges—whatever, but let’s try to lower the likelihood to churn.”
There are more examples, but the common theme is that each of these scenarios requires big data to be built into real-time processes. It cannot be a separate silo in the business; it needs to be accessible to all.
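The customer care scenario above can be sketched as a small rule that runs inside the call-handling process itself rather than in a separate report. This is only an illustrative sketch: the topic names, the score adjustments, the threshold, and the action names are all hypothetical, and a real system would take its churn scores from a trained model rather than a lookup table.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical per-topic adjustments to a customer's churn likelihood.
# These values are made up for illustration, not taken from any real model.
TOPIC_CHURN_DELTA = {
    "billing_dispute": 0.25,
    "service_outage": 0.15,
    "general_question": 0.0,
}

# Assumed cutoff at or above which a retention response is triggered.
CHURN_THRESHOLD = 0.5


@dataclass
class Customer:
    customer_id: str
    churn_likelihood: float  # current score in [0, 1] from some upstream model


def respond_to_call(customer: Customer, call_topic: str) -> list[str]:
    """Update the churn score for a call and return retention actions to take."""
    # Unknown topics are treated as having no effect on the score.
    delta = TOPIC_CHURN_DELTA.get(call_topic, 0.0)
    customer.churn_likelihood = min(1.0, customer.churn_likelihood + delta)

    actions: list[str] = []
    # Respond only when this call raised the score and it now meets the cutoff.
    if delta > 0 and customer.churn_likelihood >= CHURN_THRESHOLD:
        actions.append("send_email")
        actions.append("offer_discount")
    return actions


if __name__ == "__main__":
    caller = Customer(customer_id="c-001", churn_likelihood=0.4)
    print(respond_to_call(caller, "billing_dispute"))
```

The point of the sketch is where the logic lives: the score is updated and acted on at the moment of the call, which is what building analytics into the process looks like, as opposed to discovering the churn risk in a batch report days later.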
Previously, such as in the days when Facebook built out its application, only serious engineering outfits had the skills or the funds to work on this. Most data was analyzed by ETL or batch processes. Near real-time was usually the best anyone could do.
Today, we are in a different world. Software has evolved to deal with the scale of big data, and at the same time the cloud era has paved the way for commoditized compute power, dramatically reducing costs. Large software companies (like Pivotal) have emerged that are further packaging data solutions to make it easier for companies to deploy big data solutions and re-architect their data infrastructures to be real-time.
This has helped engineering and data science teams deploy big data solutions more easily, but the next hurdle is getting everyone in the enterprise to use big data as a strategic part of their real-time processes. Part of this means harnessing the data within automation programs, as in the customer churn example. Another part is less scripted, and calls for a sandbox flexible enough to surface new data insights that can be acted on immediately or nominated for automation projects.
To do this, forward-thinking organizations like the New York Stock Exchange have created a portal that pools data sources, provides access control and security logging, and opens up ad-hoc data investigation for all users. For NYSE, this means they can extend market surveillance to an open-ended number of data investigations. Using their platform, now available as Data Dispatch, NYSE can identify cyber threats, research customer spending patterns, and service regulatory needs without making IT or a separate data warehousing team the bottleneck. Data consumers are empowered to do their own data investigation and take action within their own domains.
In short, big data may be deployed as part of the process at only 4% of the 400 enterprises in Bain & Co.’s study today, but that is because big data has only recently come within reach of the enterprise. Now that it is real-time and affordable to open up to the entire organization, we are very much on the cusp of seeing big data adoption accelerate.
Pivotal’s Big Data Products:
- See our post on the release of fast data ingest system GemFire XD and Data Dispatch.
- Find out how Pivotal opened the first-of-its-kind big data Innovation Centre in Singapore.
- Read more about Pivotal HD, Pivotal’s commercially supported distribution of Apache Hadoop that includes the SQL engine HAWQ.
- Check out the product page on Data Dispatch, or watch the video below: