The rapid growth of unstructured and semi-structured data has implications for data warehouse, business intelligence, and data analytics architecture, and for database design. The way we capture, store, analyze, and distribute data is changing. New "information taming" technologies such as deduplication, compression, and analysis tools had, by 2011, driven the cost of creating, capturing, managing, and storing information down to one-sixth of what it was in 2005.
Like our physical universe, the digital universe is something to behold: 1.8 trillion gigabytes in 500 quadrillion "files", and more than doubling every two years. That's nearly as many bits of information in the digital universe as stars in our physical universe.
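To make those figures concrete, here is a rough back-of-the-envelope sketch. It assumes decimal gigabytes (10^9 bytes) and treats "more than doubling every two years" as exactly doubling, so the projections are a lower bound on the stated trend rather than a forecast. The function names are illustrative only.

```python
# Back-of-the-envelope arithmetic for the figures quoted above.
# Assumption: a gigabyte is 10**9 bytes; the source does not say which convention it uses.
GIGABYTE_BYTES = 10**9


def universe_bits(gigabytes: float) -> float:
    """Convert a size in gigabytes to bits."""
    return gigabytes * GIGABYTE_BYTES * 8


def projected_gigabytes(start_gb: float, years: float,
                        doubling_period_years: float = 2.0) -> float:
    """Project size assuming it exactly doubles every `doubling_period_years`.

    The text says "more than doubling", so this is a lower bound on that trend.
    """
    return start_gb * 2 ** (years / doubling_period_years)


def average_file_bytes(total_gb: float, file_count: float) -> float:
    """Average size of one "file" in bytes."""
    return total_gb * GIGABYTE_BYTES / file_count


if __name__ == "__main__":
    size_gb = 1.8e12   # 1.8 trillion gigabytes
    files = 500e15     # 500 quadrillion "files"

    print(f"Bits in the digital universe: {universe_bits(size_gb):.2e}")
    print(f"Average 'file' size: {average_file_bytes(size_gb, files):.0f} bytes")
    for years in (2, 4, 6):
        print(f"After {years} years: {projected_gigabytes(size_gb, years):.2e} GB")
```

Under these assumptions the digital universe holds on the order of 10^22 bits, and the average "file" is only a few kilobytes, which suggests that most of those 500 quadrillion items are very small.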
What are the forces behind the explosive growth of the digital universe? One is cost: creating, capturing, managing, and storing information now costs one-sixth of what it did in 2005. Another is investment: since 2005, enterprise investment in the digital universe has increased 50%, to USD 4 trillion. That's money spent on hardware, software, services, and staff to create, manage, and store digital data, and to derive revenue from it.
In an information society, information is money. The trick is to generate value by extracting the right information from data. Given new data analytics tools and technologies, and new business and knowledge processes and organizational practices, we may be on the threshold of a major period of exploration of the digital universe. The convergence of technologies and data science now makes it possible not only to transform the way business is conducted and managed but also to alter the way we work and live.
New capture, search, discovery, and analysis tools can help organizations gain insights from their unstructured data, which accounts for more than 90% of the digital universe. These tools can create data about data automatically, much like facial recognition routines that help tag Facebook photos.
Data about data, or metadata, is growing twice as fast as the digital universe as a whole. IDC defines "big data" as follows:
"Big data technologies describe a new generation of technologies and architectures, designed to economically extract value from very large volumes of a wide variety of data, by enabling high-velocity capture, discovery, and / or analysis."