Data Warehouse Architectures
I would like to talk about the two most important models of the Data Warehouse architect. These models are Bill Inmon and Kimballs models. I will not recommend which one to use in this article. We will compare 2 models
Inmon’s Enterprise Data Warehouse
These systems feed a process labeled ETL for “extract, transform, load.” This process consolidates information from the various operational systems, integrates it, and loads it into a single repository called the enterprise data warehouse. This processing step is nontrivial. It may require accessing information in a variety of different formats, resolving differing representations of similar things, and significant restructuring of data. Some organizations refer to this process as data integration. It may be a batch process that runs periodically or a transaction-based process that occurs in near real time. The final result is the same: the enterprise data warehouse.
The enterprise data warehouse is the hub of the corporate information factory. It is an integrated repository of atomic data. Integrated from the various operational systems, it contains a definitive and consistent representation of business activities in a single place. Atomic in nature, the data in this repository is captured at the lowest level of detail possible.
As can be seen in the picture above, it consists of multiple parts. Step by Step, we will explain them.
Operational Systems = Can be called data storage partition. It refers to the department that provides data to the data warehouse such as sales, marketing, material management and collection through transactional systems.
ETL(Extract, Transform and Load) = A process called ETL takes place to fetch data from the source. It takes the data and converts it into a format and is transferred to a pool called DWH or EDW.
Enterprise Data Warehouse = EDW is a central element in Inmon’s data warehouse architecture. EDW captures the lowest possible level of detail.
Data Marts = Provides the representation of subject-based information as a department. Data Marts gets the data from EDW. Data is moved from EDW with data mart summary. (Aggregations). Data marts uses dimentional design. In this section, data is ready for analysis.
Kimball’s Dimensional Data Warehouse
Operational applications and ETL part have the same working structure with Inmon’s architecture. Inmon Dimensional model only supports the structure that will be used in the data mart part. Kimball supports all data to be in dimensional structure. While Kimbal uses the star or snowflake model to edit dimensional DWH, Inmon enterprise uses the data warehose ER model.
Inmon, data mart enterprise is used to physically separate data warehouse. Kimball data mart is pointless to separate the data warehouse.
Kimball dimensional data warehouse can directly access data in analytical systems. In Inmon architecture, on the other hand, EDW reaches data only through Data marts.
Bill Inmon and Kimball Comparison
Characteristics | Kimball | Inmon |
Business decision support requirements | Tactical | Strategic |
Data integration requirements | Individual business requirements | Enterprise-wide integration |
The structure of data | KPI, business performance measures, scorecards… | Data that meet multiple and varied information needs and non-metric data |
Persistence of data in source systems | Source systems are quite stable | Source systems have high rate of change |
Skill sets | Small team of generalists | Bigger team of specialists |
Time constraint | Urgent needs for the first data warehouse | Longer time is allowed to meet business’ needs. |
Cost to build | Low start-up cost | High start-up costs |