How to process a client’s big data

In a proof of concept (PoC) applying natural language processing and statistical modeling to PC client event logs and IT Help Desk incident reports, Intel IT predicted 20 percent of the incidents that occurred in the following 28 days. Our new ability to identify and solve potential client issues proactively, rather than reactively, promises to deliver significant cost avoidance to the enterprise.

In 2013, Intel IT set a target to reduce all reported IT incidents (on clients, servers, and other devices) requiring our attention by 40 percent by the end of the year. Recognizing clients as the primary contributors to overall incidents, we devised a client incident prediction PoC using Intel® Distribution for Apache Hadoop* software (based on Hadoop version 2.2). Applying text analytics to millions of client event logs and thousands of client incident reports, we identified correlations that enable us to anticipate and solve client problems before they become widespread. In performing the PoC, we realized a number of accomplishments:

• Developed a big data predictive analytics solution capable of deriving value from the millions of Windows* event records that 95,000+ client systems generate daily and that were previously rarely used

• Applied advanced natural language processing and information retrieval techniques that enabled correlation of machine information (event data) with internal customer information (incident reports)

• Sorted through millions of events and thousands of incidents, achieving 78-percent accuracy in predicting the occurrence of incidents in additional clients

• Created data visualizations that helped IT support staff quickly determine the likelihood, severity, and distribution of a problem and more accurately target fixes and other proactive support

Combining data mining and predictive analytics, our client incident prediction solution makes it possible for us to find value in data that was once largely ignored. This new capability will enable us to solve many client issues before they have an impact on user productivity.

Elements of this solution may prove promising for finding new value in other data logs, such as those collected in Intel’s manufacturing, supply chain, marketing, market research, and other operations.
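To make the idea of correlating machine information (event data) with incident reports more concrete, the following is a minimal sketch of one common information retrieval approach: TF-IDF weighting with cosine similarity. It is an illustration only, not the method used in the PoC, which this summary does not specify. The function names and the sample incident and event strings are hypothetical, and the sketch runs in memory on a handful of strings rather than on Hadoop.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict of term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))  # count each term once per document
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Smoothed IDF: terms found in every document keep a small positive weight.
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_events_for_incident(incident, events):
    """Rank event messages by textual similarity to one incident report."""
    vectors = tfidf_vectors([incident] + events)
    incident_vec, event_vecs = vectors[0], vectors[1:]
    scored = [(cosine(incident_vec, vec), event)
              for vec, event in zip(event_vecs, events)]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical sample data, invented for illustration only.
incident = "Laptop cannot connect to wireless network after driver update"
events = [
    "Wireless network adapter driver failed to start",
    "Disk defragmentation completed successfully",
    "Application error in spreadsheet program",
]
ranked = rank_events_for_incident(incident, events)
for score, event in ranked:
    print(f"{score:.3f}  {event}")
```

In this toy example the wireless driver event ranks first because it shares terms with the incident text, while the unrelated events score zero. A production solution would likely need much more, such as handling of event codes, time windows, and statistical modeling of which correlated events actually precede incidents.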

