I don’t pretend to be an expert on big data. Like many, I am still learning. But I do know that there is confusion: many don’t know whether big data is a platform, a process, a methodology or simply lots of data. What the BIWisdom Tribe seemed to focus on was that Big Data is just that… lots of data.
Big data is important, but the most important thing to understand is how it affects an organization. Once that is understood, it can open the door to a Data Discovery plan to capture all available data. But there are ultimately challenges.
Collection of historical data
Organizations have collected all sorts of data throughout their existence. Generally it is the responsibility of an individual or small group of individuals to store the data. Hopefully when it is time for your company to implement Big Data there is at least one person who knows where the data is located (employees can come and go!). It’s also important to incorporate a data cleansing process; data quality is more important than data quantity.
Big Data requires new thinking about architecture. With the Data Scientist craze, do you think it would be beneficial to hire one? Data scientists are analytically minded, statistically and mathematically sophisticated data engineers who have insights into business and other complex systems based on large quantities of data. These computer scientists can program, build software, and combine and manage data from a variety of sources. They are great statisticians who know how to derive insights from large data sets. These individuals are hard to come by, so they are often scooped up by Fortune 500 companies.
But that is okay because tools today are accessible and can handle loads of data. Organizations that train the right person using the right tools can make a difference when given the freedom to explore. You will be surprised where you find this talent!
Access to a ton of data is great but Data Discovery is not going to take care of itself. If you want to get the most out of your information, it may be wise to take a corporate survey to find out what management wants to know. Without a plan Data Discovery is a waste of time and money. It is impossible to know how to structure the new data if management doesn’t understand what it consists of, how it may help them or even what they hope to gain. Pre-planning before diving into the data gives direction and purpose.
Unstructured data consists of emails, open-ended survey questions, web forms, call logs, discussion boards, SharePoint and Wikis to name just a few. What all seem to agree on is that all of these sources contain important information. Organizations have struggled to find ways to analyze and leverage this data. Platforms such as the Attivio Active Intelligence Engine® (AIE) address the need to draw in-depth understanding from unstructured content and integrate it with Big Data.
I understand that the implementation of any Big Data project is quite costly. And I also understand that there is only so much money in a budget in a given year. Convincing the individuals who hold the purse strings can be difficult without compelling evidence. Until executives realize that their issues may be caused by limited data availability and limited data assets, they will not support a Big Data project.
Big Data is not about the technology but about the way it helps organizations with their biggest analytical and data challenges.
I invite all of you to participate in the weekly BIWisdom chat on Twitter hosted by Howard Dresner, every Friday at 1:00pm, EST. Search #BIWisdom and follow the feed. I look forward to seeing you there. You can also follow me on Twitter @CindyBHarder.