Lecture 4 Flashcards Preview

Business Intelligence / Data Mining > Lecture 4 > Flashcards

Flashcards in Lecture 4 Deck (59)
Loading flashcards...
1
Q

OLTP?

A

Online Transaction Processing

2
Q

OLAP

A

Online Analytical Processing

3
Q

Where is OLTP used?

A

E-commerce
Banking
Airline Reservations

4
Q

OLTP (User facing)

A

Real-Time
Low latency
Highly-concurrent

5
Q

OLTP (Tasks)

A

Relatively small set of standard transactional queries

6
Q

OLTP (Data access pattern)

A

Random leads
Updates
Writes

7
Q

Where is OLAP used?

A

Business intelligence

Data mining

8
Q

OLAP (Back-end processing)

A

Batch workloads

Less concurrency

9
Q

OLAP (Tasks)

A

Complex analytical queries

Ad-hoc

10
Q

OLAP (Data access pattern)

A

Table scans

Large amounts of data involved per query

11
Q

OLAP Example?

A

Amazon analyzes purchases by its customers to come up with an individual screen with products of likely interest to the customer

12
Q

Downsides of co-existing OLTP and OLAP workloads?

A

Poor memory management
Conflicting data access patterns
Variable latency

13
Q

Solution for having both OLTP and OLAP with no downsides?

A

Separate databases
User-facing OLTP database for high-volume transactions
Data warehouse for OLAP workloads

14
Q

How to connect OLTP with OLAP?

A

ETL

15
Q

What does ETL stand for?

A

Extract
Transform
Load

16
Q

`What is Data Warehouse?

A

A physical repository where relational data are specially organized to provide enterprise-wide, cleansed data in a standardized format

17
Q

Characteristics of DWs?

A
Subject oriented
Integrated
Time-variant
Nonvolatile
Summarized
Not normalized
Metadata
Web based
Client/server
18
Q

What is Data Mart?

A

A departmental small-scale DW that stores only limited/relevant data

19
Q

Two types of Data Mart?

A

Dependent data mart

Independent data mart

20
Q

Dependent data mart?

A

A subset that is created directly from a data warehouse

21
Q

Independent data mart?

A

A small data warehouse designed for a strategic business unit or a department

22
Q

Other DW Components?

A

ODS
Oper marts
EDW
Metadata

23
Q

ODS?

A

Operational Data Stores

A type of database often used as an interim area for a data warehouse

24
Q

Oper Marts?

A

An operational data mart

25
Q

EDW?

A

Enterprise Data Warehouse

A data warehouse for the enterprise

26
Q

Metadata?

A

Data about data

27
Q

Types of DW Architectures?

A

Three-Tier Architecture

Two-Tier Architecture

28
Q

Three-Tier Architecture?

A
  1. Data acquisition software (backend)
  2. The DW that contains the data and the software
  3. Client (Front-end) software that allows users to access and analyze data from the warehouse
29
Q

Two-Tier Architecture?

A

First two tiers in 3Tier architecture are combined into one

30
Q

Example of Three-Tier Architecture?

A

Client Workstation
Application Server
Database Server

31
Q

Example of Two-Tier Architecture?

A

Client workstation

Application and database server

32
Q

What does Web-based DW Architecture contain?

A

Client (Web browser)
Web server -> Web pages
Application server
Data warehouse

33
Q

Data warehouse development approaches?

A

Inmon Model

Kimball Model

34
Q

Inmon Model?

A

EDW approach (top-down)

35
Q

Kimball Model?

A

Data mart approach (bottom-up)

36
Q

Characteristics of DM approach?

A

One subject area
Months to develop
Costs 10k to 100k plus
Low to medium difficulty
Data prerequisite is common within the business area
Sources include only some operational and external systems
Size can range from megabytes to gigabytes
Time horizon - near current and historical data
Data transformation - low to medium

37
Q

Characteristics of EDW approach?

A

Several subject areas
Years to develop
Costs 1mil plus
Hard difficulty
Data prerequisite is common across the enterprise
Sources include many operational and external systems
Size can range from gigabytes to petabytes
Time horizon - historical data
Data transformation - high

38
Q

Tables in DW?

A

Fact table

Dimensional tables

39
Q

Fact table?

A

A very large accumulation of facts such as sales

40
Q

Dimension tables?

A

Smaller, generally static information about the entities involved in the facts

41
Q

Representation of Data in DW?

A

Dimensional Modelling
Star Schema
Snowflakes Schema

42
Q

Dimensional Modelling?

A

A retrieval-based system that supports high-volume query access

43
Q

Star Schema?

A

The most commonly used and the simplest style of dimensional modeling
Contain a fact table surrounded by and connected to several dimension tables

44
Q

Snowflakes Schema?

A

An extension of star schema where the diagram resembles a snowflake in shape

45
Q

What is multidimensionality?

A

The ability to organize, present, and analyze data by several dimensions, such as sales by region, by product

46
Q

Multidimensional presentation contents?

A

Dimensions
Measures
Time

47
Q

Dimensions?

A
Product
Salespeople
Market
Segments
Business units
48
Q

Measures?

A

Money
Sales volume
Head count
Inventory profit

49
Q

Time?

A

Daily
Weekly
Monthly

50
Q

Two classes of fact-table attributes?

A

Dimension attributes
- The key of a dimension table
Dependent attributes
- A value determined by the dimension attributes of the tuple

51
Q

Warehouse Data Models?

A

Relations
Stars and snowflakes
Cubes

52
Q

Warehouse Operators?

A

Slice and dice
Roll-up, drill down
Pivoting
Other

53
Q

Slice?

A

A subset of a dimensional array

54
Q

Dice?

A

A slice on more than two dimensions

55
Q

Drill down/Up?

A

Navigating among levels of data ranging from the most summarized (up) to the most detailed (down)

56
Q

Roll Up?

A

Computing all of the data relationships for one or more dimensions

57
Q

Pivot?

A

Used to change the dimensional orientation of a report or an ad hoc query-page display

58
Q

ROLAP?

A

Relational Online Analytical Processing

59
Q

MOLAP?

A

Multi-Dimensional Online Analytical Processing