f Flashcards

1
Q

Aggregation:

A

Used to model a relationship involving a relationship set

Allows us to treat a relationship set as an entity set for purposes of participation in other relationships

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

Lowest-level ER model:

A

physical data model (PDM) – most detailed

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

is ER design subjective

A

Yes

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

Entity:

A

real-world object, distinguishable from other objects. An entity is described as using a set of attributes.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

Reasons to use ISA:

A

Add descriptive attributes specific to a subclass (ex. Not appropriate for all entities in the superclass)

Identify entities that participate in a particular relationship (ex. Not all superclass entities participate)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

Weak Entity:

A

Weak entity can be identified uniquely only by considering the primary key of another (owner) entity.

Weak entity set must have total participation in this identifying relationship set.

Weak entities only have a “partial key” (dashed underline)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

The steps:

A
  1. Identify the entities
  2. Identify the relationships between the entities
  3. Determine whether a relationship is 1:1 1:M or M:N
  4. Determine whether participation in a relationship mandatory (At least one instance or full participation) or not
  5. Identify weak entities
  6. Identify ISA hierarchies and aggregations
  7. Consider possible refinements: should a concept be modeled as an entity or attribute? Should a concept be modeled as an entity or a relationship? Identifying relationships? Binary or ternary?
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

Mid-level ER model:

A

Logical data model (LDM)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

In translation a relationship set to a relation, attributes of the relation must include(3):

A
  • keys for each participating entity set (as foreign keys)
  • this set of attributes forms a superkey for the relation
  • all descriptive attributes
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

Owner entity set and weak entity set must participate in

A

Owner entity set and weak entity set must participate in a one-to-many relationship set (one owner, many weak entities).

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

Entity set:

A

a connection of similar entities (ex. All employees). All entities in an entity set have the same set of attributes, each entity set has a key (underlined), each attribute has a domain.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

Locks and transaction design:

A

When you change a row, no one else can modify it until you issue a COMMIT, try not to hold locks for too long since it will slow down other sessions

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

Two biggest causes of contention are

A

locks and latches (or mutexes)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

Types of contention (3):

A

locks
latches and mutex
buffer contention

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

Two ways to design transaction locking?

A

Pessimistic and Optimistic

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

Contention prevents:

A

Contention prevents the database from working on all of the requests that are outstanding

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

High-level ER model:

A

Conceptual data model (CDM)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
18
Q

Contention is another word for

A

bottleneck

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
19
Q

Pessimistic locking:

A

works best if you think someone else is going to “grab” your row before you are finished

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
20
Q

Hints:

A

You can add a hint to your SQL to change the execution plan

SELECT /* index(index name) */

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
21
Q

Performance problems are often seen as

A

unacceptable response time or throughput

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
22
Q

Are indexes good for small or large amount of the tables?

A

Indexes are only good for getting small amount of the table

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
23
Q

Tuning involves

A

proactive monitoring and bottleneck elimination, providing room for system scalability (process more workload)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
24
Q

Partitioning:

A

Split table up to make them smaller

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
25
Q

Execution plan:

A

Called explain plan or query plan. Tells you how the optimizer will execute your SQL

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
26
Q

Transaction design:

A

Minimize lock duration using pessimistic and optimistic locking strategies

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
27
Q

Histograms:

A

Histogram allows the optimizer to understand how the data is being distributed and make the best decision

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
28
Q

Network overhead:

A

Avoid unnecessary network round trips.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
29
Q

Optimistic locking:

A

works best if you think no one else will “grab” the row

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
30
Q

Denormalize:

A

merge tables to avoid joins, create “materialized views: to avoid big grouping or filtering

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
31
Q

SQL tuning

A

optimizer determines execution plan

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
32
Q

does a concatenated index have 1 or many columns?

A

Concatenated index has more than one column

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
33
Q

Two ways to reduce round trips:

A

Use the “Array” interface in your program code

Use the stored procedures for complex interactions with the database

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
34
Q

Should you create an index on all columns?

A

Do not create an index on every column

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
35
Q

what do you violate in Denormalization?

A

In denormalization you violate 3NF to improve performance

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
36
Q

Configure Memory:

A

use memory to avoid I/O, operations to read from memory, operations to share memory

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
37
Q

IT Service Management (ITSM) can be used to

A

IT Service Management (ITSM) can be used to identify and maintain service levels

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
38
Q

When to not use NULLS?

A

Do not use NULL if you will search for NULLS use something that can be indexed (N/A)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
39
Q

Reduce contention:

A

avoid contention for locks and latches

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
40
Q

Instance tuning

A

avoid bottlenecks in initial design and later monitor performance

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
41
Q

Indexes:

A

Create indexes on groups of columns that are queried together
Indexes speed up queries but make DML slower

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
42
Q

Nulls:

A

Using nulls has significant performance implications (nulls do not take any space, nulls cannot be indexed)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
43
Q

Performance management tools do what?

A

Monitor, estimate, plan capacity, analyze, reorganize, optimize, cache, compress, sort

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
44
Q

What can you do for optimal performance(4)?

A

Tune the application
Reduce contention
Configure Memory
Tune I/O

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
45
Q

Tune the application:

A

choose the best data model, reduce the load on the database, tune the SQL statements

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
46
Q

3 Things we can do to tune our application:

A
  1. Structure the tables in a way that the database would work better
  2. Tune the application code (Java, C++ etc.)
  3. Tune the SQL statements
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
47
Q

Should you put a high or low load on a database and why?

A

Do not put much load on the database

When the application puts too much load on the database, and the performance decrease, it is not the database fault

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
48
Q

When do subtypes occur?

A

Subtypes can occur when you are modelling things that are almost the same

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
49
Q

Tune I/O:

A

use fast disks, use RAID (o + 1), use SSDs

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
50
Q

Performance management tools(7):

A
  • Oracle performance method – eliminating bottlenecks and developing efficient SQL statements
  • Database self-monitoring – sends alerts to notify of impending problem using expected values for comparison
  • AWR (Automatic Workload Repository) – performance history
  • AWR baseline-statistics with DB performing well at peak load
  • Adaptive threshold-warning and critical alert thresholds
  • ADDM uses AWR statistics to diagnose performance
  • OEM (Oracle Enterprise Manager) – GUI for maintenance
51
Q

Why start in 3NF?

A

Starting point is 3NF, it takes all redundancy in the data and determines the PKs. Every column in a table should be identified by: the PK, all of the PK, nothing but the PK.

52
Q

usually locking problems are due to :

A

usually locking problems are due to application locks, sometimes internal locks can cause problems

53
Q

Database buffers cache table and index data to avoid what?

A

Database buffers cache table and index data to avoid reading data from the disk

54
Q

What do locks prevent?

A

Prevent 2 sessions from changing table data at the same time – this avoids “lost updates”

55
Q

Database might have to perform a sort if:

A
  • SQL contains an ORDER BY or GROUP BY
  • The two tables are being joined without an index, both tables are sorted and the results merged (called a SORT-MERGE join)
56
Q

what are soft areas and what do they allow?

A

Soft areas (Called PGA in Oracle) allow sorting and hash structures to be maintained in memory; otherwise they would be written to a temporary

57
Q

Latches and Mutex:

A

Latches are very light weight locks that protect memory instead of tables

58
Q

Buffer cache ‘hit rate’:

A

When asked for data, the database first looks for it in the memory buffers.

If the data is found in memory it is called a “hit”, otherwise the data must be read from the disk (Called a “miss”)

59
Q

Buffer contention:

A

Memory itself can become a problem; no one can get the memory they want.

60
Q

If there is not enough memory to do the hash table or a sort in memory, then the database will…?

A

If there is not enough memory to do the hash table or a sort in memory, then the database will read and write data to a temporary file group

61
Q

What do database buffers improve?

A

Database buffers improve performance by caching data in memory

62
Q

Hash join

A

Hash joins are more efficient alternative to SORT-MERGE. A hash table is built on one of the tables and acts like an “on the fly” index

63
Q

What happens when a buffer is modified?

A

When buffers are modified they are called dirty; these have to be written to disk

64
Q

Hit rate ratio?

A

The ratio (hit/(hits + misses)) is called the ‘hit rate’

65
Q

What do latches do?

A

Latches are like locks, but instead of protecting table rows, they protect memory (buffers)

66
Q

Disk IO is the fastest or slowest part of the database system?

A

Disk IO is the slowest part of the database system, so it is critical to performance

67
Q

What are locking problems caused by?

A

Most locking problems are caused by application code, optimistic locking strategy is often the solution

There are system locks that can cause problems, these are rare and database specific

68
Q

Latency

A

Latency is the time taken to perform a single IO

69
Q

What happens when all database buffers are dirty?

A

When all the blocks are dirty then sessions have to wait for the buffers to be written before new data can be read

70
Q

Throughput

A

Throughput is the number of operations over time (IO/second)

71
Q

Difference between a latch and a lock?

A

for a latch:
If two sessions try to access the same area of memory, then one will wait

Instead of “sleeping” (like a lock) the waiting session will “spin” on the CPU for a very short time

72
Q

High latency means

A

High latency means you are overloading the disk

73
Q

Disk latency increases with

A

throughput, disk fill

74
Q

To avoid overloading disks

A

To avoid overloading disks, we combine multiple disks into an “array”. The array can then support higher amounts of disk IO

75
Q

To get best performance disk should be

A

sparsely populate, under only moderate load

76
Q

what can the multiple disk array protect against?

A

The array can also protect the disk data loss by storing multiple copies of data

77
Q

what is the “hockey stick” curve?

A

When the disk is overloaded, latency goes up and throughput stalls (called the “hockey stick” curve)

78
Q

“RAID” levels describe

A

“RAID” levels describe the type of array. RAID levels 1,0 and 5 are the most frequently used

79
Q

RAID FIVE:

A

Distributes data across disks like RAID 0

Creates a “parity” block for every data block that can be used to recover data if the disk fails

80
Q

why are SSD a poor choice?

A

But they are a poor choice for data that does not get accessed very often

81
Q

RAID 5 vs RAID 0+1

A

RAID 5 requires less disks that RAID 0+1 so it’s cheaper (but also much slower when writing)

82
Q

Using SSD can be the best way to provide

A

very high IP rates

83
Q

RAID 0+1:

A

Stripping and mirroring together

Best performance

Protection against data loss

More expensive (more disks) than RAID 5

Best solution for database files

84
Q

raid 0 is also called

A

Also called striping

85
Q

Pitfalls

A

use of information as directive rather than indicator of potential problem

86
Q

RAID ZERO:

A

Data is spread across multiple disks to distribute IO evenly

Good performance but no protection against data loss

87
Q

RAID ONE is also called

A

Also called mirroring

88
Q

RAID ONE:

A

Data is duplicated on two or more disks

Protects against data loss, but does not spread the IO across disks

89
Q

Data governance program oversees the management of (7)?

A

Data governance program oversees the management of the quality, maintainability, availability, usability, integrity, scalability and security of enterprise data

90
Q

Types of changes(5):

A
  • DBMS software – migrations, procedures
  • Hardware configuration
  • Logical and physical design
  • Applications
  • Physical database structures
91
Q

Management visibility:(4)

A

Impact
Prosecution
Cost
Durability

92
Q

Impact:

A

upper-level management is keenly aware of the need to comply

93
Q

who requests changes to databases?

A

DBA does not request change (programmers, application owners, business owners do)

94
Q

who implements changes to databases?

A

DBA carries out most database changes

95
Q

Change management requirements:(8)

A
  • Proactivity
  • Intelligence
  • Analysis
  • Automation
  • Standardization of procedure
  • Reliable and predictable process
  • Availability
  • Quick and efficient delivery
96
Q

Prosecution:

A

can result in huge fines and imprisonment

97
Q

Cost:

A

can be significant but so can the cost of non-compliance

98
Q

Durability:

A

increasing regulation – increasing time, effort and capital will be spent on compliance

99
Q

Collaborative approach:(3)

A

Business
legal
IT

100
Q

Nulling out:

A

removes the sensitive data by deleting it

101
Q

Business:

A

must understand the legal requirements imposed on their data and systems as dictated in regulations

102
Q

Encryption:

A

scrambles the data algorithmically. Thi5s technique will not produce realistic looking data and can make the data larger

103
Q

Compliance-related tasks that impact DBA role include:(5)

A
  • Metadata management and data quality
  • Database and data access auditing
  • Data masking and obfuscation
  • Long-term data retention and database archiving
  • Closer tracking of traditional DBA tasks
104
Q

Number and data variance:

A

varies the existing values in a specified range in order to obfuscate them

105
Q

Five techniques that can be used to audit database access:

A
  1. Adding columns to tables – not a good idea
  2. DBMS traces – ISV offering is better
  3. Log based - missing read activity
  4. Network sniffing – missing server requests
  5. Capture requests at the server
106
Q

Shuffling:

A

uses the existing data and moves the values between rows in such a way that the no values are present in their original rows

107
Q

Legal:

A

must be involved to interpret the legal language of the regulations and ensure that the business is taking proper steps to protect itself

108
Q

Substitution:

A

replaces existing data with random values from a pre-prepared data set

109
Q

Data masking:

A

Data masking is the process of protecting sensitive and personally identifiable information (PII) in non-production databases from inappropriate visibility

110
Q

IT:

A

must be involved to implement the policies and procedures to enact the technology to support the regulatory mandates

111
Q

Table-to-table synchronization:

A

masks data assuring that the results are referentially intact.

112
Q

Data masking techniques:(6)

A
Substitution
Shuffling
Number and data variance
Encryption
Nulling out
Table-to-table synchronization
113
Q

OLAP

A

online analytical processing

114
Q

Data warehouse

A

a collection of integrated, non-volatile, time-variant, subject oriented databases designed to support the DSS function.

115
Q

DDS data

A

differs from transactional operational data in timespan, granularity and dimension

116
Q

DDS

A

data distrubution system

117
Q

VLDB

A

very large databases

118
Q

Data warehouses are designed for

A

Data warehouses are designed for analytical processing.

119
Q

OLTP

A

online transaction processing

120
Q

Life cycle of data:

A
  • Create
  • Operational (completing business transactions)
  • Reference (reporting or queries)
  • Archive (compliance and business protection)
  • discard
121
Q

Data warehouse contains:

A

The data warehouse contains atomic data and lightly summarized data.

122
Q

Business Intelligence:

A

Comprehensive, cohesive, integrated tools and processes.

123
Q

BI Actions include:

A
  • Drilling up/down hierarchies
  • Comparing aggregate values
  • Parallel execution