Friday, February 21, 2014

OBIEE: History, Architecture, Components with an Example

What is OBIEE?


OBIEE stands for Oracle Business Intelligence Enterprise Edition. It’s a platform for Oracle BI applications.


How did OBIEE originate?

Hyperion Interactive Reporting (Brio) → OBIEE


Interactive Reporting was first developed by Brio Software and was known by various names over time: Brio Query, Explorer, Insight, and finally Intelligence.

The earlier version of Hyperion Performance Suite was very basic. The Brio software added an intuitive user interface supporting query, reporting, and alerting capabilities.

 

The following figure explains how the Brio architecture was transformed into OBIEE.




What are the various functions of OBIEE?

 

1.      Dashboards

2.      BI Server

3.      OLAP Server

4.      Ad hoc query

5.      Alerting

6.      Pixel-Perfect reporting

7.      Office integration

8.      Balanced Scorecard

9.      Print-quality financial reporting


The OBIEE components can be seen in the following figure:




Each of the components seen above can be used to meet a specific requirement.

 

For Example:

                                        

BI Publisher: we will discuss this component in more detail below.

 



Question: What if we need to send a daily report in the same format to senior management, updated with the latest data?

 

Answer: We can use BI Publisher to create the report and Delivers (another component seen in the figure above) to send it by email.
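Outside OBIEE, the idea behind this flow (render the latest data into a fixed layout, then email it) can be sketched in plain Python. This is only an illustration, not OBIEE or BI Publisher code; the report layout, function names, and recipient address are all hypothetical.

```python
# Illustrative sketch of a "build report, deliver by email" flow.
# All names and data here are hypothetical, not OBIEE internals.
from email.message import EmailMessage

RECIPIENTS = ["senior.management@example.com"]  # hypothetical address

def build_daily_report(rows):
    """Render the latest data into a fixed plain-text layout."""
    lines = [f"{r['region']}: {r['sales']}" for r in rows]
    return "Daily Sales Report\n" + "\n".join(lines)

def build_email(report_text):
    """Wrap the rendered report in an email message."""
    msg = EmailMessage()
    msg["Subject"] = "Daily Sales Report"
    msg["To"] = ", ".join(RECIPIENTS)
    msg.set_content(report_text)
    return msg

# Actual delivery would use smtplib.SMTP(...).send_message(msg);
# a scheduler would rerun this daily, which is what Delivers automates.
```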

 

This is a simple example of one of the features of OBIEE. We can do a lot more with each of the components.

We will learn more about OBIEE and its usage in the next blog post. Stay tuned!


References:

http://www.ealliancebusinessintelligence.com/oraclebusinessintelligenceprimer.php
http://siebelbipblog.blogspot.com/2012/08/what-is-oracle-bi-publisher-brief.html

Thursday, February 13, 2014

More about Dimensional Hierarchy

After learning the basics of BI, ETL, and the concepts of facts and dimensions, we will now learn about "Dimension Hierarchies".

What is a Dimension Hierarchy?
A dimension hierarchy is the way all the attributes in a dimension table are connected to each other by position.

By analogy:
as dimensions are the key to navigating the data warehouse/business intelligence system,
hierarchies are the key to navigating dimensions.

Types of Dimension Hierarchies:

1.   Fixed Depth Positional Hierarchies



  • Have a fixed number of levels
  • Form a series of many-to-one relationships
  • Each hierarchy level appears as a separate positional attribute in the dimension table
  • Each set of attributes should have a meaningful name

Pros:
a.       Easiest to understand
b.      Predictable and fast query performance

Cons:
a.       Cannot be used when the hierarchy is not a series of many-to-one relationships
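To make the idea concrete, here is a minimal Python sketch of a fixed-depth hierarchy stored as positional attributes. The product data is made up for the example; the point is that each level is its own column, so rolling up to any level is a simple group-by.

```python
# Hypothetical product dimension: each level of the fixed-depth
# hierarchy (department > category > brand > product) is a separate
# positional attribute in the dimension row.
product_dim = [
    {"product": "Cola 1L", "brand": "FizzCo", "category": "Drinks", "department": "Grocery"},
    {"product": "Cola 2L", "brand": "FizzCo", "category": "Drinks", "department": "Grocery"},
    {"product": "Oat Bar", "brand": "Grainy", "category": "Snacks", "department": "Grocery"},
]

# Hypothetical fact data keyed by product.
sales = {"Cola 1L": 10, "Cola 2L": 20, "Oat Bar": 5}

def rollup(level):
    """Aggregate sales to any hierarchy level with a plain group-by."""
    totals = {}
    for row in product_dim:
        key = row[level]
        totals[key] = totals.get(key, 0) + sales[row["product"]]
    return totals

print(rollup("category"))  # {'Drinks': 30, 'Snacks': 5}
```

This is why query performance is predictable: every drill path walks up the same fixed set of columns.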

2.   Slightly Ragged / Variable Depth Hierarchies



These don't have a fixed number of levels, but the range in depth is small.

Pros:
a.    Consistent parent child relationship

Cons:
a.       Inconsistent at the logical level

3.   Ragged/Variable Depth Hierarchies


     This can best be explained by the example above. Here we have at least one member whose parent sits more than one level above the child in the hierarchy.

This can be divided in 2 categories:

1. Ragged/Variable Depth Hierarchies with Hierarchy Bridge Tables

This approach uses a specially constructed bridge table. The table contains a row for every possible path in the ragged hierarchy, enabling all forms of hierarchy traversal to be accomplished with standard SQL rather than special language extensions, which would not be possible otherwise.
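As an illustration, the following Python sketch builds such a bridge table from a hypothetical parent-to-children mapping: one row per (ancestor, descendant, depth) path, including depth-0 self-rows, so a subtree rollup becomes a plain filter or join.

```python
# Hypothetical ragged hierarchy as parent -> children edges.
children = {
    "All": ["EMEA", "APAC"],
    "EMEA": ["UK"],
    "UK": ["London"],
    "APAC": [],
}

def build_bridge(children):
    """One bridge row per (ancestor, descendant, depth) path,
    including depth-0 self-rows for each listed node."""
    bridge = []

    def walk(ancestor, node, depth):
        bridge.append({"ancestor": ancestor, "descendant": node, "depth": depth})
        for child in children.get(node, []):
            walk(ancestor, child, depth + 1)

    for node in children:
        walk(node, node, 0)
    return bridge

bridge = build_bridge(children)
# All descendants of EMEA, at any depth, via a simple filter:
print(sorted(r["descendant"] for r in bridge if r["ancestor"] == "EMEA"))
```

In a warehouse the same query is a standard join from the bridge table to the fact table, which is exactly what the bridge buys you.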

2. Ragged/Variable Depth Hierarchies with PathString Attributes

Bridge table creation can be avoided by using a PathString attribute in the dimension. It contains a specially encoded text string that holds the complete path from the root node down to the node described by that particular dimension row.
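A sketch of the same hypothetical hierarchy carrying a PathString per row; the separator and node names are assumptions for illustration.

```python
# Hypothetical child -> parent mapping (root has no parent).
parent = {"All": None, "EMEA": "All", "APAC": "All", "UK": "EMEA", "London": "UK"}

def path_string(node, sep="/"):
    """Encode the complete path from the root down to this node."""
    parts = []
    while node is not None:
        parts.append(node)
        node = parent[node]
    return sep.join(reversed(parts))

print(path_string("London"))  # All/EMEA/UK/London

# Subtree queries become prefix matches instead of bridge-table joins:
in_emea = [n for n in parent if path_string(n).startswith("All/EMEA")]
```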

One of the above two methods for variable depth hierarchies is selected based on technical and organisational needs.

We will discuss more interesting topics in the next blog post. Stay tuned!

References:
http://technet.microsoft.com/en-us/library/aa198068(v=sql.80).aspx
http://pic.dhe.ibm.com/infocenter/dataarch/v8r1/index.jsp?topic=%2Fcom.ibm.datatools.dimensional.ui.doc%2Ftopics%2Fc_dm_hierarchy_ragged.html
The Data Warehouse Toolkit - Ralph Kimball, Margy Ross

Thursday, February 6, 2014

Brief Overview of ETL

In my previous blog post, titled 'Business Intelligence for Beginners', we learned the basics of BI and the star schema, and defined a few related terms.

In this post we will learn what ETL is in more detail. ETL stands for Extract, Transform, Load: the process of extracting data from various sources, transforming it into a common format, and loading it into the data warehouse. This data is then used for decision making.


In general, the ETL process takes about 70% of the DW development time.




Extraction:
Extraction is the process of getting data from various sources and transferring it to the data warehouse. It is not as simple as copy and paste; it is complex, with various factors to it, which we will discuss further.

This is the most complex and time-consuming part of the entire data warehousing process, for the following reasons:

1.    The source systems may be poorly designed.
2.    Deciding which data needs to be extracted from each source.
3.    Data needs to be extracted from heterogeneous and disparate systems.
4.    Sources may have different operating systems, DBMSs, hardware, and communication protocols.

Some examples of sources are ERP, CRM, flat files, batch files, text files, and spreadsheets.
Also, this is not a one-time activity. As the source data gets updated, the data warehouse also needs to be updated periodically (hourly, monthly, or yearly) to derive up-to-date information from the raw data.

There are two extraction methods, namely the logical extraction method and the physical extraction method. We first need to decide how the data can be extracted logically before extracting it physically.
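As an illustration of a logical extraction decision, here is a minimal incremental-extraction sketch using Python's sqlite3 as a stand-in source system; the orders table and its columns are invented for the example.

```python
# Incremental extraction sketch: pull only rows changed since the last
# run. The source table and column names are hypothetical.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL, updated_at TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 10.0, "2014-02-01"), (2, 20.0, "2014-02-05"), (3, 30.0, "2014-02-06")],
)

def extract_since(conn, last_run):
    """Logical extraction: only rows updated after the previous load."""
    cur = conn.execute(
        "SELECT id, amount FROM orders WHERE updated_at > ?", (last_run,)
    )
    return cur.fetchall()

print(extract_since(conn, "2014-02-04"))  # [(2, 20.0), (3, 30.0)]
```

A full extraction would simply drop the WHERE clause; choosing between the two is the logical design decision, independent of how the rows are physically moved.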

Transformation:
Transformation can be defined as applying a set of rules/programs/functions to the data extracted from the various sources.

This step is responsible for data validation, accuracy, and conversion of source data types according to the business rules of the organization.

Also, once the data is transformed, data cleansing is done before loading the data into the target database.
Few forms of transformation:
1.    Simple one-to-one scalar transformation
-       e.g. 0/1 for M/F data
2.    One-to-many element transformation
-       e.g. a 4×20 address field split into Street Address/City/Zip/Country
3.    Many-to-many element transformation
-       e.g. restructuring the detailed information of every individual

Example: in each of the three examples shown below, the first column shows the extracted data and the second shows what the data looks like after transformation.
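The three forms listed above can be sketched in Python. The field names, the comma-separated address layout, and the M/F-to-0/1 code values are assumptions for illustration, not a standard mapping.

```python
def transform(row):
    """Apply one example of each transformation form to a source row."""
    out = {}
    # 1. One-to-one scalar: recode M/F as 0/1.
    out["gender_code"] = {"M": 0, "F": 1}[row["gender"]]
    # 2. One-to-many: split a single address field into its parts
    #    (assumes a comma-separated layout).
    street, city, zip_code, country = [p.strip() for p in row["address"].split(",")]
    out.update(street=street, city=city, zip=zip_code, country=country)
    # 3. Many-to-many: derive several target fields from several
    #    source fields describing an individual.
    out["full_name"] = f'{row["first_name"]} {row["last_name"]}'
    out["initials"] = row["first_name"][0] + row["last_name"][0]
    return out

row = {"gender": "F", "address": "12 High St, Springfield, 12345, USA",
       "first_name": "Jane", "last_name": "Doe"}
print(transform(row)["full_name"])  # Jane Doe
```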


Loading:
Loading is the process of loading the transformed data into the data warehouse in a particular format, depending on organisational needs. We need to load both facts and dimensions.

Data can be loaded using one of the following three strategies:
1.    Full data refresh
2.    Incremental data refresh
3.    Continuous data refresh, with constant data collection and loading using row-level inserts

We will go into more depth in the next blog posts. Stay tuned!

