OLAP - On-Line Analytical Processing



A data warehouse is of little use by itself. Because it holds raw data, its content requires further processing; there is no reason to copy or transform data into a new database if no one benefits from it. The data contained in a data warehouse can be exploited in two ways: OLAP analysis or data mining. OLAP is covered in this section, while data mining is briefly presented in the next section.
OLAP and data warehouses complement each other. The data warehouse stores and manages the data, while OLAP converts the stored data into useful information so that it reflects the real factors affecting or enhancing the line of business of the enterprise. Raw data is collected, reorganized, stored, and managed in a data warehouse that follows a special schema; OLAP then converts this data into information that can be put to good use.
The purpose of OLAP is to provide end users with summary aggregations, in order to gain a high-level understanding of their data.
3.1.2.1 Definition
E.F. Codd, the inventor of relational databases and one of the greatest database researchers, coined the term OLAP in a white paper published in 1993.[ ] The white paper defined 12 rules for OLAP applications. Nigel Pendse and Richard Creeth of the OLAP Report [ ] simplified the definition: OLAP applications are those that deliver fast analysis of shared multidimensional information (FASMI). The five terms mean the following:



 Fast: the interactive user expects information to be delivered at a fairly constant rate. Most queries should be answered in five seconds or less, and many of these will be ad hoc queries as opposed to rigidly predefined reports. For instance, the end user has the flexibility to combine several attributes in order to generate a report based on the data in the data warehouse.
 Analysis: OLAP applications should perform basic numerical and statistical analysis of the data. These calculations can be predefined by the application developer, or defined by the user as ad hoc queries. It is the ability to conduct such calculations that makes OLAP so powerful: hundreds, thousands, or even millions of records can be summed to reveal the information hidden within the piles of raw data.
 Shared: the data delivered by OLAP applications should be shared across a large user population, as seen in the current trend to web-enable OLAP applications allowing the generation of OLAP reports over the Internet.
 Multidimensional: OLAP applications are based on data warehouses built on multidimensional database schemas, which is an essential characteristic of OLAP.
 Information: OLAP applications should be able to access all the data and information necessary and relevant for the application. For example, in a banking scenario, an OLAP application working with annual interest or statement reprints would need to access historical transactions in order to calculate and process the correct information. Not only is the data likely to be located in different sources, but its volume is also liable to be large.
3.1.2.2 Benefits and beneficiaries of OLAP
OLAP tools can improve the productivity of the whole organization by focusing on what is essential for its growth. An advantage of using OLAP systems is that, if such systems are separate from the On-Line Transaction Processing (OLTP) systems that feed the data warehouse, the OLTP systems' performance will improve due to the reduced network traffic and the elimination of long queries to the OLTP database.
OLAP enables the organization as a whole to respond more quickly to market demands. This is possible because it provides the ability to model real business problems, make better-informed decisions for the conduct of the organization, and use human resources more efficiently. Market responsiveness, in turn, often yields improved revenue and profitability.
OLAP tools and applications can be used by a variety of organizational divisions, such as sales, marketing, finance, and manufacturing. For all of these users, OLAP delivers the information they need to make effective decisions about their organization's line of business and future directions. The information is delivered quickly, just in time for when it is needed. This fast delivery of information is the key to successful OLAP applications, because time is the critical factor in making really effective decisions.
3.1.2.3 Features of OLAP
OLAP applications are found in a wide variety of functional areas of an organization. However, no matter what functions are served by an OLAP application, it must always have the following elements:
a. Multidimensional Views of data (data cubes)
Business models are multidimensional in nature. Several dimensions can be identified for the business of a company: time, location, product, people, and so on. One dimension can have multiple levels. The location dimension can have levels such as city, state, country, and so on.
Using OLAP applications, managers should be able to analyze data across any dimension, at any level of aggregation, with equal functionality and ease.
The multidimensional data views are usually referred to as data cubes. Since we typically think of a cube as having three dimensions, this may be a bit of a misnomer. In reality data cubes can have as many dimensions as the business model allows.
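The idea of analyzing data across any dimension and at any level can be illustrated with a minimal sketch. The fact records, the dimension names, and the roll_up helper below are all hypothetical and chosen only for illustration; a real OLAP engine would store and index the cube very differently.

```python
from collections import defaultdict

# Hypothetical fact records: (year, city, country, product, sales amount).
FACTS = [
    (2023, "Lyon", "France", "Bikes", 120.0),
    (2023, "Paris", "France", "Bikes", 200.0),
    (2023, "Paris", "France", "Helmets", 50.0),
    (2024, "Berlin", "Germany", "Bikes", 180.0),
    (2024, "Paris", "France", "Helmets", 70.0),
]

# Position of each dimension level inside a fact record (assumed layout).
LEVELS = {"year": 0, "city": 1, "country": 2, "product": 3}
MEASURE = 4


def roll_up(facts, by):
    """Sum the sales measure for every combination of the requested levels."""
    totals = defaultdict(float)
    for fact in facts:
        key = tuple(fact[LEVELS[level]] for level in by)
        totals[key] += fact[MEASURE]
    return dict(totals)


# The same function serves any dimension and any level of aggregation.
by_country = roll_up(FACTS, ["country"])
by_year_product = roll_up(FACTS, ["year", "product"])
grand_total = roll_up(FACTS, [])
```

Because the grouping key is built from whichever levels are requested, adding a fourth or fifth dimension only means adding an entry to LEVELS, which is why a "cube" is not limited to three dimensions.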
b. Calculation capabilities
All OLAP applications perform simple data aggregation along the hierarchy of a dimension or across a cube, and some also conduct more complex calculations. It is this ability to conduct complex calculations that allows OLAP applications to turn raw data into information, and later into knowledge.
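The difference between a simple aggregation and a more complex calculation can be sketched as follows. The rows, the city-to-country hierarchy, and the margin measure are assumptions made for this example only: the sums are first rolled up one level, and the derived measure is then computed from the aggregated sums.

```python
# Hypothetical (city, country, revenue, cost) rows.
ROWS = [
    ("Paris", "France", 200.0, 150.0),
    ("Lyon", "France", 100.0, 90.0),
    ("Berlin", "Germany", 300.0, 210.0),
]


def aggregate_by_country(rows):
    """Simple aggregation: sum revenue and cost one level up the hierarchy."""
    totals = {}
    for _city, country, revenue, cost in rows:
        rev_sum, cost_sum = totals.get(country, (0.0, 0.0))
        totals[country] = (rev_sum + revenue, cost_sum + cost)
    return totals


def margin_pct(revenue, cost):
    """Derived measure: profit margin as a percentage of revenue."""
    return round(100.0 * (revenue - cost) / revenue, 1)


country_totals = aggregate_by_country(ROWS)
# The ratio is computed from the aggregated sums, not averaged over cities.
country_margin = {c: margin_pct(r, k) for c, (r, k) in country_totals.items()}
```

In this sketch the margin for France is derived from the country totals; averaging the two city margins instead would give a different and misleading figure, which is one reason derived measures are usually evaluated after aggregation.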
c. Time intelligence
Time is a universal dimension for almost all OLAP applications. It is very difficult to find a business model where time is not considered an integral part. Time is used to compare and judge the performance of a business process. An OLAP system should be built so that concepts like "year to date" and "period over period comparisons" can be defined easily.
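The two time concepts named above can be sketched in a few lines. The monthly figures and the function names are hypothetical, and the period-over-period comparison is shown in its year-over-year form; other periods, such as quarters, would follow the same pattern.

```python
from itertools import accumulate

# Hypothetical monthly sales, keyed by (year, month).
MONTHLY_SALES = {
    (2023, 1): 100.0, (2023, 2): 120.0, (2023, 3): 90.0,
    (2024, 1): 110.0, (2024, 2): 150.0, (2024, 3): 99.0,
}


def year_to_date(sales, year):
    """Running total from the first month through each month of the year."""
    months = sorted(m for (y, m) in sales if y == year)
    running = accumulate(sales[(year, m)] for m in months)
    return dict(zip(months, running))


def year_over_year(sales, year, month):
    """Percentage change versus the same month one year earlier."""
    current = sales[(year, month)]
    previous = sales[(year - 1, month)]
    return round(100.0 * (current - previous) / previous, 1)


ytd_2024 = year_to_date(MONTHLY_SALES, 2024)
growth_feb = year_over_year(MONTHLY_SALES, 2024, 2)
```

Both functions depend on the time dimension being ordered and on knowing which earlier period corresponds to the current one, which is the kind of knowledge an OLAP system is expected to have built in.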