What is a Data Warehouse?

A Data Warehouse is a central database fed from numerous internal and external data sources whose primary purpose is enterprise-wide decision support.
  • A Data Warehouse is usually fed from several disconnected data sources.
  • It usually contains large amount of current and historic data, to enable trend-analysis and data mining.
  • It can start as a single subject area (Data Mart) and evolve to a full blown enterprise-wide Data Warehouse over time.
  • Unlike a Data Mart, a Data Warehouse is not departmental, and is usually used by the entire organization.
  • Unlike an Operational Data Store, there are usually no on-line data maintenance processes that directly update the Data Warehouse.

 

Why build a Data Warehouse?

A Data Warehouse can turn your company's data into timely information, facilitate intelligent decision making and give your organization a competitive edge.

A good Data Warehouse can:

  • Allow companies to build connections between the various disconnected islands of information
  • Provide one-stop shopping for all decision support and ad-hoc reporting needs
  • Provide a business view that simplifies end-user data access (masks underlying complexity of operational database structures)
  • Offload reporting and decision support processing from operational systems and onto the Data Warehouse (improved performance and reduced contention with OLTP activities)
  • Make end-users more self-sufficient by distributing the reporting and decision support function throughout the organization
  • Free-up I.T. resources for other (more complicated) development activities including building the Data Warehouse

Information is power: providing your decision makers with access to complete, timely and reliable information will equip them to make better and more informed decisions.

 

Why do many Data Warehouses fail?

There are many reasons why Data Warehouse efforts fail. Here are some of the most common reasons:

Lack of Business Sponsorship (Buy-in)
The "If you build it they will come" scenario doesn't usually work.

Tackling Too Much Too Soon
By the time the Data Warehousing concept gains wide acceptance (and budget approval), ideas start flowing and everybody wants their data right away. This usually leads to over-committment, missed deadlines and cost over-runs.

Lack of DW expertise
Building a Data Warehouse requires an entirely different set of design and development expertise compared to building an OLTP application.

Poor DW design
Lack of DW expertise usually leads to poor initial DW database design - novice DW development teams tend to create inflexible DW models that are not condusive to change and/or the evolution of the Data Warehouse.

Lack of data integrity
Due to the complexity of managing and integrating multiple sets of source data, some DW efforts wind up with incorrect and/or duplicate data.

Missed deadlines and cost over-runs
Novice DW development teams tend to lack the expertise required to accurately estimate DW efforts, they also tend to lack the problem-solving bag of tricks required to keep the DW project on track. This usually results in missed deadlines, cost over-runs and erosion of end user confidence in the DW team.

Lack of timely delivery of new information
Once the initial phase of the Data Warehouse has been rolled out, and assuming that it was successful, demand for additional information (new subject areas) grows exponentially. This causes DW teams to delay and/or turn down requests, which results in disappointed end-users and the dreaded "too little, too late" scenario.

End-user analysis and reporting tools are too complex
Typical end-user Business Intelligence tools are overly complex for the infrequent and/or non-technically savy end-user. A large segment of the end-user community often reverts back to the operational reports and/or delegates reporting requests back to I.T. This unfortunately erodes one of the main benfits of building a Data Warehouse, which is empowering end-users and making them self-sufficient when it comes to decision support.

 

How do I ensure the success of my Data Warehouse?

1. Secure a strong Business Sponsor
A strong business sponsor will help secure funding and will help synchronize the data warehouse evolution with future business requirement changes.

2. Under-commit and over-deliver for your first phase
The Data Warehouse team's credibility is at stake (especially during the initial phases of the data warehouse). DW teams need to give themselves sufficient elbow room... in order to make sure that the initial phases are delivered on time, on budget, and that data integrity id paramount.

3. Supplement your team with seasoned DW experts
For the initial phases of the data warehouse, and until the development team is fully versed in the DW Design and Development activities, it is highly recommended that the internal design/development team is supplemented with 1 or 2 DW experts. These experts will bring proven DW design methodologies to the table, along with a substantial problem-solving bag of tricks, and will expedite the learning curve for the entire internal DW design/development team.

4. Design your DW for growth and constant change
The DW Database design is the foundation layer, on which future phases of the data warehouse are based - need to make sure that the foundation layer is flexible and robust enough to accomdate future DW changes and the addition of new subject areas without having to undo or redo previously built DW components.

5. Invest in a good ETL tool, it should pay for itself in the first phase of your DW project
Custom coding DW interfaces from scratch is both in-efficient and time consuming. ETL tools come pre-built with all the functionality one would need to develop even the more complex interfaces. Typical ETL tools will yield at a minimum 50% productivity improvement (i.e. at a minimum). you will be able to develop twice as many interfaces in the same timeframe. Not all ETL Tools are expensive, there are several full-featured second generation ETL tools on the market (like the SQL Power Loader) that sell for under $50,000.

6. Data integrity is paramount, since Data Integrity = End User Confidence = DW Credibility
Need to audit the integrity of the Data Warehouse on an on-going basis, to ensure that any data integrity issues are identified, communicated to the end-user and rectified in a timely fashion. Lack of end-user confidence in the underlying data will usually spell the end of their use of the Data Warehouse, as they will revert back to their operational reports and their source systems for more accurate and reliable data.

7. Promote the constant growth of the data warehouse: a static Data Warehouse is a dead Data Warehouse
A data warehouse that does not keep up with the evolution of business and data requirements will be obsolete very quickly. Plan on at least 2 or 3 data warehouse releases in the course of the year - this will keep the information fresh, should enable you to meet end-user demand for new information in a timely fashion, and should keep the DW users interest in the Data Warehouse high.

8. Involve the End-Users from start to finish
Business Users (especially power-users) are the Data Warehouse team's greatest allies: they will help direct the DW evolution, they will help you promote the use of the Data Warehouse, and they will be the first line support much of the data and reporting questions.