Database slides (in English), Chapter 33: OLAP Transparencies

Page 1

Chapter 33

OLAP Transparencies

Page 2

The key features of OLAP applications.

The potential benefits associated with successful OLAP applications.

Page 3

Chapter 33 - Objectives

How to represent multi-dimensional data.

The rules for OLAP tools.

The main categories of OLAP tools.

OLAP extensions to the SQL standard.

How Oracle supports OLAP.

Page 4

Business Intelligence Technologies

Accompanying the growth in data warehousing is an ever-increasing demand by users for more powerful access tools that provide advanced analytical capabilities.

There are two main types of access tools available to meet this demand, namely Online Analytical Processing (OLAP) and data mining.

Page 5

Business Intelligence Technologies

OLAP and Data Mining differ in what they offer the user and because of this they are

Page 6

Online Analytical Processing (OLAP)

The dynamic synthesis, analysis, and consolidation of large volumes of multi-dimensional data, Codd (1993).

Describes a technology that uses a multi-dimensional view of aggregate data to provide quick access to strategic information for the

Page 7

Online Analytical Processing (OLAP)

Enables users to gain a deeper understanding and knowledge about various aspects of their corporate data through fast, consistent, interactive access to a wide variety of possible views of the data.

Allows users to view corporate data in such a way that it is a better model of the true dimensionality of the enterprise.

Page 8

Online Analytical Processing (OLAP)

Can easily answer ‘who?’ and ‘what?’ questions; however, the ability to answer ‘what if?’ and ‘why?’ type questions distinguishes OLAP from general-purpose query tools.

Types of analysis range from basic navigation and browsing (slicing and dicing) to calculations,

Page 9

OLAP Benchmarks

OLAP Council published an analytical processing benchmark referred to as the APB-1 (OLAP Council, 1998).

Aim is to measure a server’s overall OLAP performance rather than the performance of individual tasks.

Page 10

OLAP Benchmarks

APB-1 assesses the most common business operations including:

– bulk loading of data from internal or external data sources;

– incremental loading of data from operational systems;

– aggregation of input level data along hierarchies;

Page 11

OLAP Benchmarks

APB-1 assesses the most common business operations including (continued):

– calculation of new data based on business models;

– time series analysis;

– queries with a high degree of complexity;

– drill-down through hierarchies;

– ad hoc queries;

– multiple online sessions.

Page 12

OLAP Benchmarks

OLAP applications are judged on their ability to provide just-in-time (JIT) information, a core requirement of supporting effective decision-making.

This requirement is more than measuring processing performance but includes its abilities

Page 13

performance, and query performance into a single metric

Page 14

OLAP Benchmarks

Publication of APB-1 benchmark results must include both the database schema and all code required for executing the benchmark.

An essential requirement of all OLAP applications is the ability to provide users with JIT information, which is necessary to make

Page 16

Examples of OLAP applications in various functional areas

Page 17

OLAP Applications

Although OLAP applications are found in widely divergent functional areas, they all have the following key features:

– multi-dimensional views of data

– support for complex calculations

– time intelligence

Page 18

OLAP Applications - multi-dimensional views of data

Page 19

OLAP Applications - support for complex calculations

Mechanisms for implementing computational methods should be clear and non-procedural.

Page 20

OLAP Applications – time intelligence

Key feature of almost any analytical application, as performance is almost always judged over time.

Time hierarchy is not always used in the same manner as other hierarchies.

Page 21

OLAP Benefits

Increased productivity of end-users.

Reduced backlog of applications development for IT staff.

Retention of organizational control over the integrity of corporate data.

Reduced query drag and network traffic on OLTP systems or on the data warehouse.

Improved potential revenue and profitability.

Page 22

Representation of Multi-dimensional Data

Example of two-dimensional query.

– ‘What is the total revenue generated by property sales in each city, in each quarter of 2004?’

Choice of representation is based on types of queries end-user may ask.

Page 23

Multi-dimensional Data as Three-field Table versus Two-dimensional Matrix

Page 24

Representation of Multi-dimensional Data

Example of three-dimensional query.

– ‘What is the total revenue generated by property sales for each type of property (Flat or House) in each city, in each quarter of 2004?’

Compare representation - four-field relational table versus three-dimensional cube.

Page 25

Multi-dimensional Data as Four-field Table versus Three-dimensional Cube

Page 26

Representation of Multi-dimensional Data

Cube represents data as cells in an array.

Relational table only represents multi-dimensional data in two dimensions.
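
The contrast between the four-field table and the three-dimensional cube can be sketched in a few lines of Python. This is only an illustration: the revenue figures, city names, and the helper names axis and cell are made up for the example and do not come from the slides.

```python
# Four-field relational table: (property type, city, quarter, revenue).
# All figures are invented for illustration.
rows = [
    ("Flat", "Glasgow", "Q1", 100),
    ("Flat", "Glasgow", "Q2", 120),
    ("House", "Glasgow", "Q1", 200),
    ("House", "London", "Q1", 300),
    ("Flat", "London", "Q2", 150),
]


def axis(rows, position):
    """Sorted distinct values found in one column of the table."""
    return sorted({row[position] for row in rows})


types, cities, quarters = axis(rows, 0), axis(rows, 1), axis(rows, 2)

# Three-dimensional cube: cube[type][city][quarter]; empty cells hold 0.
cube = [[[0 for _ in quarters] for _ in cities] for _ in types]
for prop_type, city, quarter, revenue in rows:
    t, c, q = types.index(prop_type), cities.index(city), quarters.index(quarter)
    cube[t][c][q] += revenue


def cell(prop_type, city, quarter):
    """Direct array access to one cell, with no scan of the table."""
    return cube[types.index(prop_type)][cities.index(city)][quarters.index(quarter)]
```

In the table every fact is a row that has to be searched for, whereas in the cube each combination of dimension members addresses exactly one cell, including combinations for which no sale exists.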

Page 27

Representation of Multi-dimensional Data

Use multi-dimensional structures to store data and relationships between data.

Multi-dimensional structures are best visualized as cubes of data, and cubes within cubes of data. Each side of a cube is a dimension.

A cube can be expanded to include other dimensions.

Page 28

Representation of Multi-dimensional Data

A cube supports matrix arithmetic.

Multi-dimensional query response time depends on how many cells have to be added ‘on the fly’.

As the number of dimensions increases, the number of cells in the cube increases exponentially.
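
The growth in cell count can be made concrete with a little arithmetic. The sketch below assumes a dense cube whose size is the product of the number of members in each dimension; the function names cell_count and cells_to_add are hypothetical.

```python
from math import prod


def cell_count(cardinalities):
    """Number of cells in a dense cube: the product of the dimension sizes."""
    return prod(cardinalities)


def cells_to_add(cardinalities, fixed):
    """Cells summed on the fly when only the dimensions whose positions are
    listed in `fixed` are pinned to a single member."""
    return prod(size for i, size in enumerate(cardinalities) if i not in fixed)
```

With ten members per dimension, three dimensions give 1,000 cells and six dimensions give 1,000,000, so each added dimension multiplies the size of the cube. Likewise, a query that pins only one of three such dimensions still has 100 cells to add up.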

Page 29

Representation of Multi-dimensional Data

However, the majority of multi-dimensional queries use summarized, high-level data.

Solution is to pre-aggregate (consolidate) all logical subtotals and totals along all dimensions.

Page 30

Representation of Multi-dimensional Data

Pre-aggregation is valuable, as typical dimensions are hierarchical in nature.

– (e.g., Time dimension hierarchy - years, quarters, months, weeks, and days)

Predefined hierarchy allows logical pre-aggregation and, conversely, allows for a logical

Page 31

Representation of Multi-dimensional Data

Supports common analytical operations:

– Consolidation

– Drill-down

– Slicing and dicing

Page 32

Representation of Multi-dimensional Data

Consolidation - aggregation of data such as simple ‘roll-ups’ or complex expressions involving inter-related data.

Drill-down - the reverse of consolidation; involves displaying the detailed data that comprises the consolidated data.
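
Consolidation and drill-down along a time hierarchy can be sketched as follows. This is a minimal illustration, assuming each cell is keyed by its path through a year, quarter, month hierarchy; the monthly figures and the names roll_up and drill_down are invented for the example.

```python
from collections import defaultdict

# Hypothetical monthly revenue keyed by (year, quarter, month).
monthly = {
    ("2004", "Q1", "Jan"): 10, ("2004", "Q1", "Feb"): 20, ("2004", "Q1", "Mar"): 30,
    ("2004", "Q2", "Apr"): 40, ("2004", "Q2", "May"): 50, ("2004", "Q2", "Jun"): 60,
}


def roll_up(cells, depth):
    """Consolidate: keep the first `depth` hierarchy levels of each key
    and sum everything below them."""
    totals = defaultdict(int)
    for key, value in cells.items():
        totals[key[:depth]] += value
    return dict(totals)


def drill_down(cells, prefix):
    """Return the detailed cells that make up one consolidated value."""
    return {k: v for k, v in cells.items() if k[:len(prefix)] == prefix}


by_quarter = roll_up(monthly, 2)   # totals per (year, quarter)
by_year = roll_up(monthly, 1)      # totals per (year,)
```

Because the hierarchy is fixed in advance, by_quarter and by_year could be computed once and stored, which is the pre-aggregation idea; drill_down(monthly, ("2004", "Q2")) then recovers the three months behind the Q2 total.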

Page 33

Representation of Multi-dimensional Data

Slicing and dicing (also called pivoting) - refers to the ability to look at the data from different viewpoints.
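
One way to picture these viewpoints is with a cube stored as a dictionary keyed by coordinates. The sketch below is illustrative only: the data, the dimension names in DIMS, and the functions slice_cube, dice_cube, and pivot are assumptions made for the example, not part of any OLAP product.

```python
DIMS = ("type", "city", "quarter")

# Hypothetical revenue cells keyed by (type, city, quarter).
cells = {
    ("Flat", "Glasgow", "Q1"): 100,
    ("Flat", "London", "Q1"): 150,
    ("House", "Glasgow", "Q1"): 200,
    ("House", "Glasgow", "Q2"): 250,
    ("House", "London", "Q2"): 300,
}


def slice_cube(cells, dim, member):
    """Fix one dimension to a single member; the result has one dimension fewer."""
    i = DIMS.index(dim)
    return {k[:i] + k[i + 1:]: v for k, v in cells.items() if k[i] == member}


def dice_cube(cells, **allowed):
    """Keep the cells whose coordinates fall inside the allowed member sets,
    which yields a smaller sub-cube with the same dimensions."""
    return {
        k: v for k, v in cells.items()
        if all(k[DIMS.index(d)] in members for d, members in allowed.items())
    }


def pivot(cells, order):
    """Reorder the axes so the same cells are viewed from another angle."""
    idx = [DIMS.index(d) for d in order]
    return {tuple(k[i] for i in idx): v for k, v in cells.items()}
```

For instance, slice_cube(cells, "quarter", "Q1") gives a type-by-city view of the first quarter, and dice_cube(cells, type={"House"}, city={"Glasgow"}) narrows the cube to house sales in Glasgow across all quarters.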

Page 34

Representation of Multi-dimensional Data

Can store data in a compressed form by dynamically selecting physical storage organizations and compression techniques that maximize space utilization.

Dense data (that is, data that exists for a high percentage of cells) can be stored separately from

Page 35

Representation of Multi-dimensional Data

Ability to omit empty or repetitive cells can greatly reduce the size of the cube and the amount of processing.

Allows analysis of exceptionally large amounts of data.
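
The saving from omitting empty cells can be illustrated with a coordinate-keyed dictionary. This is a simplified sketch, not how any particular multi-dimensional DBMS stores data; the grid contents and the names to_sparse and density are assumptions for the example.

```python
def to_sparse(dense):
    """Store only the non-empty cells of a 2-D array as {(row, col): value}."""
    return {
        (r, c): value
        for r, row in enumerate(dense)
        for c, value in enumerate(row)
        if value is not None
    }


def density(dense):
    """Fraction of cells that actually hold data."""
    total = sum(len(row) for row in dense)
    return len(to_sparse(dense)) / total


# A mostly empty 3 x 4 grid; None marks a cell with no data.
dense = [
    [None, None, 5, None],
    [None, None, None, None],
    [7, None, None, None],
]
sparse = to_sparse(dense)
```

Here only 2 of the 12 cells are stored, and a lookup such as sparse.get((1, 1)) returns None for an omitted cell. A real engine would typically choose between dense and sparse organizations per region of the cube, as the slides describe.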

Page 36

Representation of Multi-dimensional Data

In summary, pre-aggregation, dimensional hierarchy, and sparse data management can significantly reduce the size of the cube and the need to calculate values ‘on-the-fly’.

Removes need for multi-table joins and provides quick and direct access to arrays of data, thus

Page 37

OLAP Tools

There are many varieties of OLAP tools available in the marketplace.

This choice has resulted in some confusion, with much debate regarding what OLAP actually means to a potential buyer and, in particular, what the available architectures for OLAP tools are.

Page 38

Codd’s Rules for OLAP Systems

In 1993, E.F. Codd formulated twelve rules as the basis for selecting OLAP tools.

Page 39

Codd’s Rules for OLAP Systems

Multi-dimensional conceptual view

Page 40

Codd’s Rules for OLAP Systems

Dynamic sparse matrix handling

Multi-user support

Unrestricted cross-dimensional operations

Intuitive data manipulation

Flexible reporting

Unlimited dimensions and aggregation levels

Page 41

Codd’s Rules for OLAP Systems

There are proposals to redefine or extend the rules, for example, to also include:

– Comprehensive database management tools

– Ability to drill down to detail (source record) level

– Incremental database refresh

– SQL interface to the existing enterprise environment

Page 42

Categories of OLAP Tools

OLAP tools are categorized according to the architecture used to store and process multi-dimensional data.

There are four main categories:

– Multi-dimensional OLAP (MOLAP)

– Relational OLAP (ROLAP)

– Hybrid OLAP (HOLAP)

– Desktop OLAP (DOLAP)

Page 43

Multi-dimensional OLAP (MOLAP)

Use specialized data structures and multi-dimensional Database Management Systems (MDDBMSs) to organize, navigate, and analyze data.

Data is typically aggregated and stored according to predicted usage to enhance query performance.

Page 44

Multi-dimensional OLAP (MOLAP)

Use array technology and efficient storage techniques that minimize the disk space requirements through sparse data management.

Provides excellent performance when data is used as designed, and the focus is on data for a specific decision-support application.

Page 45

Multi-dimensional OLAP (MOLAP)

Traditionally, require a tight coupling with the application layer and presentation layer.

Recent trends segregate the OLAP from the data structures through the use of published application programming interfaces (APIs).

Page 46

Typical Architecture for MOLAP Tools

Page 47

MOLAP Tools - Development Issues

Underlying data structures are limited in their ability to support multiple subject areas and to provide access to detailed data.

Navigation and analysis of data is limited because the data is designed according to previously determined requirements.

Page 48

MOLAP Tools - Development Issues

MOLAP products require a different set of skills and tools to build and maintain the database, thus increasing the cost and complexity of support.

Page 49

Relational OLAP (ROLAP)

Fastest-growing style of OLAP technology due to requirements to analyze ever-increasing amounts of data and the realization that users cannot store all the data they require in MOLAP databases.

Page 50

Relational OLAP (ROLAP)

Supports RDBMS products using a metadata layer, which avoids the need to create a static multi-dimensional data structure and facilitates the creation of multiple multi-dimensional views of the two-dimensional relation.

Page 51

Relational OLAP (ROLAP)

To improve performance, some products use SQL engines to support the complexity of multi-dimensional analysis, while others recommend, or require, the use of highly denormalized database designs such as the star schema.
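
A star schema places one fact table at the centre and joins it to small dimension tables. The sketch below uses Python's built-in sqlite3 module purely for illustration; the table names (fact_sale, dim_city, dim_quarter), columns, and figures are assumptions and are not taken from the slides or from any ROLAP product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables describe the axes of analysis.
    CREATE TABLE dim_city    (city_id INTEGER PRIMARY KEY, city TEXT);
    CREATE TABLE dim_quarter (quarter_id INTEGER PRIMARY KEY, quarter TEXT, year INTEGER);

    -- The fact table holds one row per sale, keyed by the dimensions.
    CREATE TABLE fact_sale (
        city_id    INTEGER REFERENCES dim_city(city_id),
        quarter_id INTEGER REFERENCES dim_quarter(quarter_id),
        revenue    INTEGER
    );

    INSERT INTO dim_city VALUES (1, 'Glasgow'), (2, 'London');
    INSERT INTO dim_quarter VALUES (1, 'Q1', 2004), (2, 'Q2', 2004);
    INSERT INTO fact_sale VALUES (1, 1, 100), (1, 1, 50), (1, 2, 120), (2, 1, 300);
""")

# Total revenue in each city, in each quarter of 2004.
totals = conn.execute("""
    SELECT c.city, q.quarter, SUM(f.revenue)
    FROM fact_sale f
    JOIN dim_city c    ON c.city_id = f.city_id
    JOIN dim_quarter q ON q.quarter_id = f.quarter_id
    WHERE q.year = 2004
    GROUP BY c.city, q.quarter
    ORDER BY c.city, q.quarter
""").fetchall()
```

The query answers the two-dimensional question used earlier in the slides by joining the fact table to each dimension once, which is the access pattern a ROLAP metadata layer would typically generate.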

Page 52

Typical Architecture for ROLAP Tools

Page 53

ROLAP Tools - Development Issues

Performance problems associated with the processing of complex queries that require multiple passes through the relational data.

Middleware to facilitate the development of multi-dimensional applications (software that converts the two-dimensional relation into a multi-dimensional structure).

Page 54

ROLAP Tools - Development Issues

Development of an option to create persistent, multi-dimensional structures with facilities to assist in the administration of these structures.

Page 55

Hybrid OLAP (HOLAP)

Provide limited analysis capability, either directly against RDBMS products, or by using an intermediate MOLAP server.

Deliver selected data directly from the DBMS or via a MOLAP server to the desktop (or local server) in the form of a datacube, where it is stored, analyzed, and maintained locally.

Page 56

Hybrid OLAP (HOLAP)

Promoted as being relatively simple to install and administer with reduced cost and maintenance.

Page 57

Typical Architecture for HOLAP Tools

Page 58

HOLAP Tools - Development Issues

Architecture results in significant data redundancy and may cause problems for networks that support many users.

Ability of each user to build a custom datacube may cause a lack of data consistency among users.

Page 59

Desktop OLAP (DOLAP)

Store the OLAP data in client-based files and support multi-dimensional processing using a client multi-dimensional engine.

Requires that relatively small extracts of data are held on client machines. They may be distributed in advance, or created on demand (possibly through the Web).

Page 60

Desktop OLAP (DOLAP)

As with multi-dimensional databases on the server, OLAP data may be held on disk or in RAM; however, some DOLAP products allow only read access.

Most vendors of DOLAP exploit the power of desktop PC to perform some, if not most,

Page 61

Desktop OLAP (DOLAP)

The administration of a DOLAP database is typically performed by a central server or processing routine that prepares data cubes or sets of data for each user.

Once the basic processing is done, each user can then access their portion of the data.

Page 62

Typical Architecture for DOLAP Tools

Page 63

DOLAP Tools - Development Issues

Provision of appropriate security controls to support all parts of the DOLAP environment.

Since the data is physically extracted from the system, security is generally implemented by limiting the information compiled into each cube. Once each cube is uploaded to the user's desktop, all additional metadata becomes the property of the local user.

Page 64

DOLAP Tools - Development Issues

Reduction in the effort involved in deploying and maintaining the DOLAP tools. Some DOLAP vendors now provide a range of alternative ways of deploying OLAP data, such as through e-mail, the Web, or using traditional client/server architecture.

Page 65

OLAP Extensions to SQL

Advantages of SQL include that it is easy to learn, non-procedural, free-format, DBMS-independent, and that it is a recognized international standard.

However, a major limitation of SQL is the inability to answer routinely asked business queries, such as computing the percentage change in values between this month and a year ago, or computing moving averages, cumulative sums, and other statistical functions.
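
To show what these routinely asked calculations involve, here is a plain Python sketch of the three mentioned above. It is not SQL and not any vendor's implementation; the function names and the choice of a trailing window for the moving average are assumptions made for the example.

```python
def cumulative_sums(values):
    """Running total of the series."""
    total, out = 0, []
    for v in values:
        total += v
        out.append(total)
    return out


def moving_average(values, window):
    """Average over the current value and up to window - 1 preceding values."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def pct_change_vs_year_ago(monthly, lag=12):
    """Percentage change against the value `lag` periods earlier.
    Yields None when there is no earlier value or when it is zero."""
    out = []
    for i, current in enumerate(monthly):
        if i < lag or monthly[i - lag] == 0:
            out.append(None)
        else:
            previous = monthly[i - lag]
            out.append((current - previous) * 100 / previous)
    return out
```

Each result depends on a row's position relative to other rows in an ordered series, which is the kind of calculation a plain GROUP BY query does not express directly.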

Page 66

OLAP Extensions to SQL

The answer is that ANSI adopted a set of OLAP functions as an extension to SQL to enable these calculations, as well as many others that used to be impractical or even impossible within SQL.

IBM and Oracle jointly proposed these extensions early in 1999 and they now form part of the current SQL standard, namely SQL:2003.

Page 67

OLAP Extensions to SQL - RISQL

The extensions are collectively referred to as the ‘OLAP package’ and are described as follows:

– Feature T431, ‘Extended Grouping capabilities’

– Feature T611, ‘Extended OLAP operators’

Page 68

Extended Grouping Capabilities

Aggregation is a fundamental part of OLAP. To improve aggregation capabilities, the SQL standard provides extensions to the GROUP BY clause such as the ROLLUP and CUBE functions.

Page 69

Extended Grouping Capabilities

ROLLUP supports calculations using aggregations such as SUM, COUNT, MAX, MIN, and AVG at increasing levels of aggregation, from the most detailed up to a grand total.

CUBE is similar to ROLLUP, enabling a single statement to calculate all possible combinations of aggregations. CUBE can generate the information needed in cross-tabulation reports with a single query.
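
The difference between the two grouping extensions can be emulated in Python. This is a sketch of the grouping sets they produce, assuming SUM as the aggregate; it is not the SQL syntax itself, and the rows and function names are invented. None stands in for the NULL that SQL places in a rolled-up column.

```python
from itertools import combinations


def aggregate(rows, keep):
    """SUM of the last field, grouped by the column positions in `keep`;
    columns that are rolled up appear as None."""
    totals = {}
    for *dims, revenue in rows:
        key = tuple(d if i in keep else None for i, d in enumerate(dims))
        totals[key] = totals.get(key, 0) + revenue
    return totals


def rollup(rows, n_dims):
    """Grouping sets (d1..dn), (d1..dn-1), ..., (): n + 1 levels of subtotal."""
    out = {}
    for depth in range(n_dims, -1, -1):
        out.update(aggregate(rows, set(range(depth))))
    return out


def cube(rows, n_dims):
    """Every subset of the grouping columns: 2**n grouping sets."""
    out = {}
    for size in range(n_dims + 1):
        for keep in combinations(range(n_dims), size):
            out.update(aggregate(rows, set(keep)))
    return out


# Hypothetical (city, quarter, revenue) rows.
rows = [("Glasgow", "Q1", 100), ("Glasgow", "Q2", 120), ("London", "Q1", 300)]
```

With these rows, rollup(rows, 2) yields the detail rows, a subtotal per city, and the grand total, while cube(rows, 2) additionally yields a subtotal per quarter such as (None, "Q1"), which is what a cross-tabulation needs. Using None as the marker would be ambiguous if the data itself contained missing values.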

Date posted: 22/10/2014, 10:36
