Big data and dark data: Balancing the costs and benefits

Shane Dave D. Tanguin

Big data is starting to become a cliché among business executives, given that almost everyone is now leveraging big data in decision making. “Big data” was defined in 2012 by Gartner (a global research and advisory firm) as “high-volume, high-velocity and/or high-variety information assets that demand cost-effective, innovative forms of information processing that enable enhanced insight, decision making, and process automation.”

The term is often used to refer to predictive analytics or other methods of extracting value from data and information. What is often left out is its overlooked counterpart: dark data. Gartner coined the term and defines “dark data” as “the information assets organizations collect, process and store during regular business activities, but generally fail to use for other purposes.”

The digital world produces information in unprecedented proportions. Based on a study by Statista in May 2018, about 47 zettabytes (1 zettabyte is about 1 trillion gigabytes) of data are expected to be generated by 2020. This number grows to 163 zettabytes in 2025, almost 3.5 times as much in a span of five years! To put the pace of worldwide data growth in perspective, only 2 zettabytes were generated in 2010. While structured information can be drawn out of the ocean of big data for analysis, portions of unstructured information, the dark data, will remain untapped.
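For readers who want to check the arithmetic behind these figures, here is a minimal Python sketch. It uses only the zettabyte numbers quoted above; the function names are illustrative, and the annual rate it derives is simply what those three data points imply if growth were smooth from year to year, which is an assumption and not a figure from the Statista study.

```python
# Data volumes quoted in the article (Statista, May 2018), in zettabytes.
DATA_2010 = 2
DATA_2020 = 47
DATA_2025 = 163


def growth_multiple(start, end):
    """How many times larger the end value is than the start value."""
    return end / start


def implied_annual_rate(start, end, years):
    """Constant yearly growth rate that would take start to end.

    Assumes smooth compounding, which real data growth need not follow.
    """
    return (end / start) ** (1 / years) - 1


multiple_2020_2025 = growth_multiple(DATA_2020, DATA_2025)
rate_2020_2025 = implied_annual_rate(DATA_2020, DATA_2025, 5)
rate_2010_2020 = implied_annual_rate(DATA_2010, DATA_2020, 10)

print(f"2020 to 2025 multiple: {multiple_2020_2025:.2f}x")
print(f"Implied yearly growth, 2020 to 2025: {rate_2020_2025:.1%}")
print(f"Implied yearly growth, 2010 to 2020: {rate_2010_2020:.1%}")
```

The multiple works out to roughly 3.47, which is where the "almost 3.5 times" in the text comes from.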

The growing breadth of available data and the use of big data in business decisions and applications mean commensurate growth in the investment needed to make sense of the ocean of information. Revenue from big data and business analytics worldwide, according to a study conducted by Statista in August 2018, amounted to $149 billion in 2017 and is expected to reach $186 billion in 2019. Revenue from these businesses is expected to grow steadily at 12% year on year to about $260 billion in 2022. Clearly, more and more investment is going into leveraging the power of big data and harnessing the benefits it brings to decision making. Investing in the right places also helps maximize yields.
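The revenue projection can be reproduced with simple compounding. The sketch below is only an illustration under that assumption: it starts from the 2019 figure quoted above and applies a flat 12% each year, and the function and variable names are made up for this example.

```python
# Revenue figures quoted in the article (Statista, August 2018), in USD billions.
REVENUE_2017 = 149.0
REVENUE_2019 = 186.0
ANNUAL_GROWTH = 0.12  # the 12% year-on-year rate cited in the text


def project(value, rate, years):
    """Compound a starting value at a constant yearly rate."""
    return value * (1 + rate) ** years


# Projected revenue for each year from 2019 through 2022.
projection = {
    year: project(REVENUE_2019, ANNUAL_GROWTH, year - 2019)
    for year in range(2019, 2023)
}

# Yearly rate implied by the 2017 and 2019 figures, for comparison.
implied_rate_2017_2019 = (REVENUE_2019 / REVENUE_2017) ** (1 / 2) - 1

for year, revenue in projection.items():
    print(f"{year}: ${revenue:.0f} billion")
print(f"Implied yearly growth, 2017 to 2019: {implied_rate_2017_2019:.1%}")
```

Three years of 12% growth on $186 billion lands at about $261 billion, consistent with the "about $260 billion" cited for 2022.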

Let us look at an industry where big data and data analytics have made a massive impact: the restaurant business. By gathering information ranging from customer demographics to behavioral data and shared customer interests, restaurant owners can develop smart and specific marketing activities for targeted customers. Customer profiles and point-of-sale information also help in developing best practices for on-time delivery, menu enhancement, customer segmentation, streamlined operations and improved customer experience.

Much has changed in this industry, and big data has had a significant influence on those changes. But where does dark data fit in?

In practice, using big data starts with determining what objective needs to be met, followed almost immediately by determining the what, why, how, where and when. This is where it gets tricky. One can start by defining what is needed and then look for it in the big data, or start from the big data to see what it offers and then decide which benefits to explore. In either approach, handling volumes of big data may prove costly in both technology and people, leaving no room for investment in harnessing dark data (e.g., emails, printed reports and statistics, hard-copy files and CCTV footage, among others).

Let’s take as an example a small restaurateur who aims to solve the single biggest issue identified in customer survey feedback: long waits before waiters are available to take orders. Structured data were gathered to profile customers from the moment they entered the restaurant until an order was taken, including demographics, time of day, volume of customers, menu listing, number of waiters and ordering time. The restaurateur analyzed all this information, developed a streamlined menu and added waiters to the shifts when customer volume was expected to peak. The expectation was that ordering time would drop significantly and that waiters would have a quicker turnaround in taking orders.

However, while the changes all made sense, there was no noticeable drop in ordering time. This sent the restaurateur back to the drawing board and prompted a check on how ordering had been done in the past. The restaurant’s CCTV footage was reviewed, and customer behavior was observed by comparing the order-taking sequence in the past and the present. The restaurateur noted that in recent footage, waiters made an average of three visits before an order was placed: the first almost immediately after customers were seated, followed by two more visits at longer intervals. In older footage, there were only two visits on average, with shorter intervals, before an order was placed.

When the restaurateur investigated the interactions on the first visit and what was driving the longer intervals in recent footage, the reason turned out to be the restaurant’s free Wi-Fi service. Customers would ask for the Wi-Fi password on the waiter’s first visit and set up their phones before they turned their attention to the menu and actually started deciding what to order. The longer ordering time had less to do with the number of waiters, the volume of customers or the menu. The restaurateur could have saved time by analyzing the dark data in the form of CCTV footage first, rather than going straight to the big data that was easy to analyze.

Identifying the root cause of the customer behavior made the problem easier to address. The restaurant now has the Wi-Fi password readily available on all its tables.

Investing in big data provides an edge, and balancing it with investments in converting dark data will make it more effective. Breaking the constraints on analyzing dark data may require more investment, but it also provides the power of comparison: seeing clearly what was different in the past can lead to better and more informed decisions.

The comfort of having masses of information and the capacity to analyze it may cause dark data and its potential to be neglected. Swimming into deep open waters just because you can may not be the wisest choice. But navigating these waters with the knowledge of the past that dark data brings could be your true edge in the digital world.

This article is for general information only and is not a substitute for professional advice where the facts and circumstances warrant. The views and opinions expressed above are those of the author and do not necessarily represent the views of SGV & Co.

Shane Dave D. Tanguin is a Partner of SGV & Co.
