What is a Data Warehouse?
A Data Warehouse is a central database fed from numerous
internal and external data sources whose primary purpose is
enterprise-wide decision support.
- A Data Warehouse is usually fed from several disconnected
data sources.
- It usually contains large amount of current and historic
data, to enable trend-analysis and data mining.
- It can start as a single subject area (Data Mart) and
evolve to a full blown enterprise-wide Data Warehouse over
time.
- Unlike a Data Mart, a Data Warehouse is not departmental,
and is usually used by the entire organization.
- Unlike an Operational Data Store, there are usually no
on-line data maintenance processes that directly update the
Data Warehouse.
Why build a Data Warehouse?
A Data Warehouse can turn your company's data into timely
information, facilitate intelligent decision making and give
your organization a competitive edge.
A good Data Warehouse can:
- Allow companies to build connections between the various
disconnected islands of information
- Provide one-stop shopping for all decision support and
ad-hoc reporting needs
- Provide a business view that simplifies end-user data
access (masks underlying complexity of operational database
structures)
- Offload reporting and decision support processing from
operational systems and onto the Data Warehouse (improved
performance and reduced contention with OLTP activities)
- Make end-users more self-sufficient by distributing the
reporting and decision support function throughout the
organization
- Free-up I.T. resources for other (more complicated)
development activities including building the Data
Warehouse
Information is power: providing your decision makers
with access to complete, timely and reliable information will
equip them to make better and more informed decisions.
Why do many Data Warehouses fail?
There are many reasons why Data Warehouse efforts fail. Here
are some of the most common reasons:
Lack of Business Sponsorship (Buy-in)
The "If you build it they will come" scenario doesn't usually
work.
Tackling Too Much Too Soon
By the time the Data Warehousing concept gains wide acceptance
(and budget approval), ideas start flowing and everybody wants
their data right away. This usually leads to
over-committment, missed deadlines and cost
over-runs.
Lack of DW expertise
Building a Data Warehouse requires an entirely different set
of design and development expertise compared to building an
OLTP application.
Poor DW design
Lack of DW expertise usually leads to poor initial DW database
design - novice DW development teams tend to create inflexible
DW models that are not condusive to change and/or the evolution
of the Data Warehouse.
Lack of data integrity
Due to the complexity of managing and integrating multiple
sets of source data, some DW efforts wind up with incorrect
and/or duplicate data.
Missed deadlines and cost over-runs
Novice DW development teams tend to lack the expertise
required to accurately estimate DW efforts, they also tend to
lack the problem-solving bag of tricks required to keep the DW
project on track. This usually results in missed deadlines,
cost over-runs and erosion of end user confidence in the DW
team.
Lack of timely delivery of new information
Once the initial phase of the Data Warehouse has been rolled
out, and assuming that it was successful, demand for additional
information (new subject areas) grows exponentially. This
causes DW teams to delay and/or turn down requests, which
results in disappointed end-users and the dreaded "too little,
too late" scenario.
End-user analysis and reporting tools are too
complex
Typical end-user Business Intelligence tools are overly
complex for the infrequent and/or non-technically savy
end-user. A large segment of the end-user community often
reverts back to the operational reports and/or delegates
reporting requests back to I.T. This unfortunately erodes one
of the main benfits of building a Data Warehouse, which is
empowering end-users and making them self-sufficient when it
comes to decision support.
How do I ensure the success of my Data Warehouse?
1. Secure a strong Business Sponsor
A strong business sponsor will help secure funding and will
help synchronize the data warehouse evolution with future
business requirement changes.
2. Under-commit and over-deliver for your first
phase
The Data Warehouse team's credibility is at stake (especially
during the initial phases of the data warehouse). DW teams need
to give themselves sufficient elbow room... in order to make
sure that the initial phases are delivered on time, on budget,
and that data integrity id paramount.
3. Supplement your team with seasoned DW experts
For the initial phases of the data warehouse, and until the
development team is fully versed in the DW Design and
Development activities, it is highly recommended that the
internal design/development team is supplemented with 1 or 2 DW
experts. These experts will bring proven DW design
methodologies to the table, along with a substantial
problem-solving bag of tricks, and will expedite the learning
curve for the entire internal DW design/development team.
4. Design your DW for growth and constant change
The DW Database design is the foundation layer, on which
future phases of the data warehouse are based - need to make
sure that the foundation layer is flexible and robust enough to
accomdate future DW changes and the addition of new subject
areas without having to undo or redo previously built DW
components.
5. Invest in a good ETL tool, it should pay for itself in
the first phase of your DW project
Custom coding DW interfaces from scratch is both in-efficient
and time consuming. ETL tools come pre-built with all the
functionality one would need to develop even the more complex
interfaces. Typical ETL tools will yield at a minimum 50%
productivity improvement (i.e. at a minimum). you will be able
to develop twice as many interfaces in the same timeframe. Not
all ETL Tools are expensive, there are several full-featured
second generation ETL tools on the market (like the
SQL Power Loader) that sell for under $50,000.
6. Data integrity is paramount, since Data Integrity =
End User Confidence = DW Credibility
Need to audit the integrity of the Data Warehouse on an
on-going basis, to ensure that any data integrity issues are
identified, communicated to the end-user and rectified in a
timely fashion. Lack of end-user confidence in the underlying
data will usually spell the end of their use of the Data
Warehouse, as they will revert back to their operational
reports and their source systems for more accurate and reliable
data.
7. Promote the constant growth of the data warehouse: a
static Data Warehouse is a dead Data Warehouse
A data warehouse that does not keep up with the evolution of
business and data requirements will be obsolete very quickly.
Plan on at least 2 or 3 data warehouse releases in the course
of the year - this will keep the information fresh, should
enable you to meet end-user demand for new information in a
timely fashion, and should keep the DW users interest in the
Data Warehouse high.
8. Involve the End-Users from start to finish
Business Users (especially power-users) are the Data Warehouse
team's greatest allies: they will help direct the DW evolution,
they will help you promote the use of the Data Warehouse, and
they will be the first line support much of the data and
reporting questions.