In a traditional BI architecture, analytical processing first passes through a data warehouse.
In the modern BI architecture, data reaches users through multiple organizational data structures, each tailored to the type of content it contains and the type of user who wants to consume it.
The data revolution (big and small data sets) provides significant improvements. New tools like Hadoop allow organizations to cost-effectively consume and analyze large volumes of semi-structured data. In addition, the revolution complements traditional top-down data delivery methods with more flexible, bottom-up approaches that promote predictive or exploratory analytics and rapid application development.
In the above diagram, the objects in blue represent traditional data architecture. Objects in pink represent the new modern BI architecture, which includes Hadoop, NoSQL databases, high-performance analytical engines (e.g. analytical appliances, MPP databases, in-memory databases), and interactive, in-memory visualization tools.
Most source data now flows through Hadoop, which primarily acts as a staging area and online archive. This is especially true for semi-structured data, such as log files and machine-generated data, but also for some structured data that cannot be cost-effectively stored and processed in SQL engines (e.g. call center records).
From Hadoop, data is fed into a data warehousing hub, which often distributes data to downstream systems, such as data marts, operational data stores, and analytical sandboxes of various types, where users can query the data using familiar SQL-based reporting and analysis tools.
Today, data scientists analyze raw data inside Hadoop by writing MapReduce programs in Java and other languages. In the future, users will be able to query and process Hadoop data using familiar SQL-based data integration and query tools.
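To make the MapReduce style of analysis more concrete, the following is a minimal sketch in Python that simulates the map, shuffle, and reduce phases in a single process. It does not use Hadoop or any Hadoop API; the sample log lines, the four-field log layout, and all function names are assumptions made only for illustration.

```python
from collections import defaultdict

# Hypothetical semi-structured web log lines (illustrative data, not from any real system).
LOG_LINES = [
    "2024-01-01T00:00:01 GET /home 200",
    "2024-01-01T00:00:02 GET /cart 500",
    "2024-01-01T00:00:03 POST /checkout 200",
    "malformed line",
]


def map_phase(line):
    """Emit (status_code, 1) for each well-formed line; skip anything else."""
    parts = line.split()
    if len(parts) == 4:  # assumed layout: timestamp, method, path, status
        yield parts[3], 1


def shuffle(pairs):
    """Group emitted values by key, as the framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def reduce_phase(key, values):
    """Collapse all values for one key into a single count."""
    return key, sum(values)


def run_job(lines):
    """Run the three phases in order over an iterable of raw lines."""
    mapped = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


status_counts = run_job(LOG_LINES)
print(status_counts)
```

In a real cluster the map and reduce functions would run in parallel across many machines and the framework would handle the shuffle; the sketch only shows the shape of the logic a data scientist writes.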
The modern BI architecture can analyze large volumes and new sources of data and is a significantly better platform for data alignment, consistency and flexible predictive analytics.
Thus, the new BI architecture provides a modern analytical ecosystem featuring both top-down and bottom-up data flows that meet all requirements for reporting and analysis.
In the top-down world, source data is processed, refined, and stamped with a predefined data structure--typically a dimensional model--and then consumed by casual users using SQL-based reporting and analysis tools. In this domain, IT developers create data and semantic models so business users can get answers to known questions and executives can track performance of predefined metrics. Here, design precedes access. The top-down world also takes great pains to align data along conformed dimensions and deliver clean, accurate data. The goal is to deliver a consistent view of the business entities so users can spend their time making decisions instead of arguing about the origins and validity of data artifacts.
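The idea that design precedes access can be sketched with a tiny dimensional model. The example below uses Python's built-in sqlite3 module as a stand-in for a warehouse; the table names (dim_region, fact_sales), columns, and values are hypothetical and chosen only to show a fact table joined to a conformed dimension to answer a known question.

```python
import sqlite3

# In-memory database standing in for a data warehouse (an assumption for illustration).
conn = sqlite3.connect(":memory:")

# The structure is defined up front, before any user queries it.
conn.executescript(
    """
    CREATE TABLE dim_region (
        region_id   INTEGER PRIMARY KEY,
        region_name TEXT NOT NULL
    );
    CREATE TABLE fact_sales (
        sale_id   INTEGER PRIMARY KEY,
        region_id INTEGER NOT NULL REFERENCES dim_region(region_id),
        amount    REAL NOT NULL
    );
    INSERT INTO dim_region VALUES (1, 'East'), (2, 'West');
    INSERT INTO fact_sales VALUES (1, 1, 100.0), (2, 1, 50.0), (3, 2, 75.0);
    """
)

# A predefined metric: total sales by region, answered with plain SQL.
rows = conn.execute(
    """
    SELECT d.region_name, SUM(f.amount) AS total_sales
    FROM fact_sales AS f
    JOIN dim_region AS d ON d.region_id = f.region_id
    GROUP BY d.region_name
    ORDER BY d.region_name
    """
).fetchall()

sales_by_region = dict(rows)
conn.close()
print(sales_by_region)
```

Because every report joins to the same dimension table, two users asking for sales by region get the same answer, which is the consistency the top-down world aims for.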
Creating a uniform view of the business from heterogeneous sets of data is not easy. It takes time, money, and patience, often more than most departmental heads and business analysts are willing to tolerate. They often abandon the top-down world for the underworld of spreadmarts and data shadow systems. Using whatever tools are readily available and cheap, these data-hungry users create their own views of the business. Eventually, they spend more time collecting and integrating data than analyzing it, undermining both their productivity and a consistent view of business information.
The bottom-up world works differently. Modern BI architecture creates an analytical ecosystem that brings prodigal data users back into the fold. It allows an organization to perform true ad hoc exploration (predictive or exploratory analytics) and promotes the rapid development of analytical applications using in-memory departmental tools. In a bottom-up environment, users can't anticipate the questions they will ask on a daily or weekly basis or the data they'll need to answer those questions. Often, the data they need doesn't yet exist in the data warehouse.
The modern BI architecture creates analytical sandboxes that let power users explore corporate and local data on their own terms. These sandboxes include Hadoop, virtual partitions inside a data warehouse, and specialized analytical databases that offload data or analytical processing from the data warehouse or handle new untapped sources of data, such as Web logs or machine data. The new environment also gives department heads the ability to create and consume dashboards built with in-memory visualization tools that point both to a corporate data warehouse and other independent sources.
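As a contrast with a predefined model, bottom-up exploration in a sandbox can be sketched as reading raw records first and deciding on structure afterward. The snippet below is an assumption-laden illustration in plain Python: the machine-data records, field names, and the two questions asked are all hypothetical.

```python
import json
from collections import Counter

# Hypothetical raw machine data; records do not share a fixed set of fields.
RAW_EVENTS = [
    '{"device": "sensor-a", "temp_c": 21.5}',
    '{"device": "sensor-b", "temp_c": 30.1, "alert": true}',
    '{"device": "sensor-a", "humidity": 40}',
]

# No schema is declared in advance; each record is parsed as it arrives.
events = [json.loads(line) for line in RAW_EVENTS]

# Ad hoc question 1: which fields occur in the data, and how often?
field_counts = Counter(field for event in events for field in event)

# Ad hoc question 2: average temperature per device, using only records that report one.
temps_by_device = {}
for event in events:
    if "temp_c" in event:
        temps_by_device.setdefault(event["device"], []).append(event["temp_c"])

avg_temp = {
    device: sum(values) / len(values)
    for device, values in temps_by_device.items()
}

print(dict(field_counts))
print(avg_temp)
```

Neither question had to be known when the data was collected, which is the flexibility the sandbox offers; the trade-off is that nothing here enforces the shared definitions a warehouse provides.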
Combining top-down and bottom-up worlds is challenging but doable with determined commitment.
BI professionals need to guard data semantics while opening access to data.
Business users need to commit to adhering to data standards.
Further, well-designed data governance programs are an absolute requirement.