Big Data: “60M 5-Drawer File Cabinets per Hour”

Brian Wood Blog

The analogy used in the article below really drives home just how much data we’re talking about when we talk about big data.

Wal-Mart — granted, the largest retailer on the planet — generates the equivalent of 60 million 5-drawer file cabinets’ worth of data EVERY HOUR.

[ And using 3rd-grade math, that’s the equivalent of 1 million 5-drawer file cabinets’ worth of data EVERY MINUTE. Wow. ]
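For anyone who wants to check that arithmetic or scale it to other intervals, here is a minimal sketch. The only input taken from the article is the 60 million per hour figure; the function name and the per-second and per-day conversions are my own additions for illustration.

```python
# Figure quoted in the article: 60 million five-drawer cabinets' worth per hour.
CABINETS_PER_HOUR = 60_000_000

def cabinets_per(interval_seconds: float, per_hour: int = CABINETS_PER_HOUR) -> float:
    """Scale the hourly figure to an interval of `interval_seconds` seconds."""
    return per_hour * interval_seconds / 3600

per_minute = cabinets_per(60)          # one minute
per_second = cabinets_per(1)           # one second
per_day = cabinets_per(24 * 3600)      # one full day

print(f"{per_minute:,.0f} cabinets per minute")
print(f"{per_second:,.0f} cabinets per second (rounded)")
print(f"{per_day:,.0f} cabinets per day")
```

The per-minute result matches the 1 million figure above; the other two are simply the same hourly rate expressed over different intervals.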

What are retailers like Wal-Mart doing with this data? Read on (and note: any article with “creepy” in it is bound to be somewhat entertaining).

Summary article by Tim McElligott in FierceBigData and original article by Daniel Burrus in The Huffington Post.

Emphasis in red added by me.

Brian Wood, VP Marketing


Wal-Mart, others seeing results from big data initiatives

Wal-Mart was an early poster child for people looking for big data in action and for results. It was not only a high-profile early adopter but also fully embraced the research and development side of the business in its Wal-Mart Labs. Target, too, got a lot of attention, but mostly for having outed a pregnant teenager using its new analytic skills. But Wal-Mart and other retailers have stayed the course and are now seeing some rewards.

The Huffington Post profiled companies this week that are seeing results from their big data initiatives. Some of their methods are a little creepy, such as putting cameras in the eyes of mannequins to record who looks at them and for how long. Others look at which people pick up a product and which walk away, whether they are tall, short, fat, skinny, young, old, etc. Creepy, but effective. The Post says the payoff is starting to happen already.

It identified two electronics chains, The Source and Charlie Brown, which used analytics to identify a purchasing shift toward more upscale electronics items. They adjusted their inventory and increased sales by 40 percent. What made this a big data solution rather than good old-fashioned inventory control was that the businesses saw the shift and adjusted in real time, according to the article.

Hotels also maintain a lot of data about their customers and have begun putting it to better use. The InterContinental Hotel Group moved to a big data solution that analyzes both unstructured and structured data in real time and can now correlate up to 650 variables from different sources, even competitors. It rebuilt its reservation system around this new capability and says it can now personalize each web experience for customers and ensure a high conversion rate, which drives growth in its booking channel.

In Australia, a CIO survey showed that 84 percent of mid-market businesses had either deployed big data or expected to do so within the next year. The majority of them cited faster decision making as their top driver.

And on the social side of things, Alabama’s Mobile County Public School System had a 48 percent dropout rate in 2008 and turned to IBM’s (NASDAQ: IBM) big data analytics to identify at-risk indicators. After a couple of fits and starts, the district was able to meld data from its 95 schools to get a better sense of both individual students’ “back stories” and the indicators for potential dropouts. The district has since improved its graduation rate to 70 percent and improved test scores across the board.


Big Data Is Already Producing Big Results

When we hear the phrase “big data,” we have to ask ourselves, “How big is big? What are we really talking about?”

Let’s take one of the largest retailers: Walmart. Start by visualizing one five-drawer filing cabinet. Now, think of a room filled with 60 million five-drawer file cabinets. That’s how much data comes from all of the Walmart stores every hour. And as retailers install more sensors to add advanced predictive analytics to real-time sales and customer behavior, that figure of 60 million filing cabinets’ worth of data every hour is going to increase. For example, retailers are beginning to use mannequins with cameras in their eyes so they can see who’s looking at them and whether they’re male or female, pregnant or not, thin or heavy, etc. And that’s just one little data point.

In the past, I’ve written about how we’re using cameras in the stores, not just for security but also to create actionable data on where people go, when they leave, where they stay, what they buy, and what they just look at and move on. All of this is creating an ever-increasing tsunami of data–so much so that we have to realize it is, indeed, big data, and getting bigger.

What’s even more amazing is that if we look back 20 years, to 1993, the total amount of data that went over the Internet in a year is now the amount that goes over the Internet in one second. It’s important to note that most of that increase has come in the last few years, due to the exponential, and predictable, advances in processing power, storage, and bandwidth. And, by the way, a hard trend certainty is that the amount of data is going to keep increasing as our desire for real-time, high-speed analytics grows.

So we have to ask ourselves, “What are companies doing with all this data? Is it paying off already, or do we have to wait for the payoff?” The answer is, “The payoff is starting to happen already.”

For example, I was in Canada recently and visited a couple of electronics chains, one called The Source and the other called Charlie Brown. These stores are using real-time analytics of sales to make decisions.

They noticed that in all of their stores a purchasing shift had taken place; several specific upscale electronics items that started at about $650 were selling a lot more than the lower-priced models, which were in the $150 price range. So they started filling the shelves with more of the higher-priced merchandise and greatly reduced the number of lower-priced models. Sales in the categories where they made those changes surged 40% in a very short amount of time.

A 40% surge is not bad. And thanks to the real-time data, they were able to know exactly which products they needed more of. There was no guessing involved. They could zero right in on a shift in purchasing and make the changes pay off immediately.

They also looked at their lower-end items and noticed which specific items were decreasing in volume, so they discontinued them completely. Again, overall sales and profitability rose, because profitability is based not only on the items that are selling, but also on the merchandise that isn’t selling and is taking up space in inventory.

Of course, retailers have been doing this for a long time: deciding which products to remove from inventory and which to increase. But it wasn’t done in real time. It wasn’t done with the pinpoint accuracy that we have today. Thanks to the data that we’re getting in from various sources, retailers can make better decisions faster and increase their bottom line.
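To make the restock-or-discontinue decision described above a little more concrete, here is a minimal sketch of how such a check might look. It is not the method used by the stores in the article. The SKU names, unit counts, and the 25 percent thresholds are all hypothetical, chosen only for illustration; the $650 and $150 price points echo the figures mentioned above.

```python
from dataclasses import dataclass

@dataclass
class SkuSales:
    sku: str          # hypothetical product identifier
    price: float      # shelf price, informational only
    units_prev: int   # units sold in the earlier window
    units_curr: int   # units sold in the most recent window

def classify(items, grow=0.25, shrink=-0.25):
    """Split SKUs into restock and discontinue candidates.

    A SKU is a restock candidate if its unit sales rose by at least `grow`
    (as a fraction of the earlier window) and a discontinue candidate if
    they fell by at least `-shrink`. Both thresholds are assumptions.
    """
    restock, discontinue = [], []
    for item in items:
        if item.units_prev == 0:
            continue  # no baseline to compare against
        change = (item.units_curr - item.units_prev) / item.units_prev
        if change >= grow:
            restock.append(item.sku)
        elif change <= shrink:
            discontinue.append(item.sku)
    return restock, discontinue

# Made-up sample data for three products.
sample = [
    SkuSales("upscale-A", 650.0, 40, 70),   # sales up 75%
    SkuSales("budget-B", 150.0, 100, 55),   # sales down 45%
    SkuSales("mid-C", 300.0, 60, 63),       # sales up 5%, no action
]

restock, discontinue = classify(sample)
print("Restock candidates:", restock)
print("Discontinue candidates:", discontinue)
```

The real-time element the article stresses would come from rerunning a check like this continuously on fresh sales data, so that shorter comparison windows surface a shift sooner.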

Ask yourself: How could we use big data and high-speed analytics to make better decisions faster so that we could gain competitive advantage and drive increased profitability?