Enterprise data warehouse



Submitted by                              Muhammad Bilal

Submitted to                                 Prof A.G

Topic                              Enterprise data warehouse

Reg no                                          v1f14asoc0014






What is enterprise data warehouse?
In computing, a data warehouse DW or DWH, also known as an enterprise data warehouse(EDW), is a system used for reporting and data analysis. DWs are central repositories of integrated data from one or more disparate sources.
What does a data warehouse do?
Integrate divergent information from various systems which enable users to quickly produce powerful ad enable users to quickly produce powerful ad- -hoc queries hoc queries and perform complex analysis and perform complex analysis
Create an infrastructure for reusing the data in numerous Create an infrastructure for reusing the data in numerous ways ways
Create an open systems environment to make useful Create an open systems environment to make useful information easily accessible to authorized users information easily accessible to authorized users
Help managers make informed decisions
 Data Warehouse Architecture
Different data warehousing systems have different structures. Some may have an ODS (operational data store), while some may have multiple data marts. Some may have a small number of data sources, while some may have dozens of data sources. In view of this, it is far more reasonable to present the different layers of a data warehouse architecture rather than discussing the specifics of any one system.
In general, all data warehouse systems have the following layers:
·  Data Source Layer
·  Data Extraction Layer
·  Staging Area
·  ETL Layer
·  Data Storage Layer
·  Data Logic Layer
·  Data Presentation Layer
·  Metadata Layer
·  System Operations Layer
The picture below shows the relationships among the different components of the data warehouse architecture:

Each component is discussed individually below:
Data Source Layer
This represents the different data sources that feed data into the data warehouse. The data source can be of any format -- plain text file, relational database, other types of database, Excel file, etc., can all act as a data source.
Many different types of data can be a data source:
·  Operations -- such as sales data, HR data, product data, inventory data, marketing data, systems data.
·  Web server logs with user browsing data.
·  Internal market research data.
·  Third-party data, such as census data, demographics data, or survey data.
All these data sources together form the Data Source Layer.
Data Extraction Layer
Data gets pulled from the data source into the data warehouse system. There is likely some minimal data cleansing, but there is unlikely any major data transformation.
Staging Area
This is where data sits prior to being scrubbed and transformed into a data warehouse / data mart. Having one common area makes it easier for subsequent data processing / integration.
ETL Layer
This is where data gains its "intelligence", as logic is applied to transform the data from a transactional nature to an analytical nature. This layer is also where data cleansing happens. The ETL design phase is often the most time-consuming phase in a data warehousing project, and an ETL tool is often used in this layer.
Data Storage Layer
This is where the transformed and cleansed data sit. Based on scope and functionality, 3 types of entities can be found here: data warehouse, data mart, and operational data store (ODS). In any given system, you may have just one of the three, two of the three, or all three types.
Data Logic Layer
This is where business rules are stored. Business rules stored here do not affect the underlying data transformation rules, but do affect what the report looks like.
Data Presentation Layer
This refers to the information that reaches the users. This can be in a form of a tabular / graphical report in a browser, an emailed report that gets automatically generated and sent everyday, or an alert that warns users of exceptions, among others. Usually anOLAP tool and/or a reporting tool is used in this layer.
Metadata Layer
This is where information about the data stored in the data warehouse system is stored. A logical data model would be an example of something that's in the metadata layer. A metadata tool is often used to manage metadata.
System Operations Layer
This layer includes information on how the data warehouse system operates, such as ETL job status, system performance, and user access history.

Data Marts, and Data Mining
Data mart: a subset of a data warehouse
Data mining: an information-analysis tool for automated discovery of patterns and relationships in a data warehouse or a data mart
Online Analytical Processing -Graphical software tools that provide complex analysis of data stored in a database

0 comments: