Lecture 8 - Big Data Concepts Flashcards Preview

Flashcards in Lecture 8 - Big Data Concepts Deck (21)
1
Q

The Vs that define Big Data?

A
Volume
Variety
Velocity
Veracity
Variability
Value
2
Q

Challenges of Big Data?

A

Effectively and efficiently capturing, storing, and analyzing big data

3
Q

Critical Success Factors for Big Data?

A
A clear business need
Strong, committed sponsorship
Alignment between the business and IT strategy
A fact-based decision-making culture
A strong data infrastructure
The right analytics tools
The right people with the right skills
4
Q

Enablers of Big Data Analytics?

A

In-memory analytics
In-database analytics
Grid computing and MPP
Appliances

5
Q

In-memory analytics?

A

Storing and processing the complete data set in RAM

6
Q

In-database analytics?

A

Placing analytic procedures close to where data is stored
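
A minimal sketch of the idea, using Python's built-in sqlite3 module as a stand-in for a real analytic database. The sales table and its values are invented for illustration. The aggregation runs inside the database engine, so only a small summary leaves it; the second loop shows the contrasting approach of pulling every row into the application first.

```python
import sqlite3

# Hypothetical sales table, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 120.0), ("south", 80.0), ("north", 30.0), ("south", 20.0)],
)

# In-database style: the SUM/GROUP BY runs inside the database engine,
# so only one summary row per region is returned to the application.
in_db = dict(
    conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region").fetchall()
)

# Contrast: fetch every row and aggregate in application code.
pulled = {}
for region, amount in conn.execute("SELECT region, amount FROM sales"):
    pulled[region] = pulled.get(region, 0.0) + amount

conn.close()
```

Both approaches give the same totals; the difference is how much data has to move out of the database before the analysis happens.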

7
Q

Grid computing and MPP?

A

Use of many machines and processors in parallel

8
Q

Appliances?

A

Combining hardware, software, and storage in a single unit for performance and stability

9
Q

Challenges of Big Data Analytics?

A
Data volume
Data integration
Processing capabilities
Data governance
Skill availability
Solution cost
10
Q

Business problems addressed by Big Data analytics?

A

Process efficiency and cost reduction
Brand management
Revenue maximization
Enhanced customer experience

11
Q

MapReduce?

A

Distributes the processing of very large multi-structured data files across a large cluster of ordinary machines
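
A minimal single-machine sketch of the map, shuffle, and reduce phases, using word counting as the task. The function names are hypothetical and this is not the Hadoop API; in a real cluster each input split would be mapped on a different machine and the framework would perform the shuffle.

```python
from collections import defaultdict

def map_phase(document):
    """Emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle(pairs):
    """Group all emitted values by key, as the framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)

def word_count(documents):
    # Each document stands in for a split that a separate machine would map.
    mapped = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())

counts = word_count(["big data big value", "data velocity"])
```

Because each map call looks only at its own split and each reduce call looks only at one key, both phases can be spread across many ordinary machines.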

12
Q

Goal of MapReduce?

A

Achieving high performance with simple computers

13
Q

Example tasks of MapReduce?

A

Indexing the Web for search
Graph analysis
Text analysis
Machine learning

14
Q

Hadoop?

A

An open-source framework for storing and analyzing massive amounts of distributed, unstructured data

15
Q

How does Hadoop work?

A

Access unstructured and semi-structured data
Break the data up into parts
Each part is replicated multiple times and loaded into the file system for redundancy and fail-safe processing
One node acts as the facilitator and another as the job tracker
Jobs are distributed to the clients; once completed, the results are collected and aggregated using MapReduce
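
The steps above can be sketched as a toy simulation. All names here (split_into_blocks, place_replicas, run_job, the node labels) are hypothetical and do not correspond to Hadoop's real API; the block size, replication factor, and round-robin placement are assumptions chosen only to keep the example small.

```python
def split_into_blocks(records, block_size):
    """Break the data set into fixed-size parts (blocks)."""
    return [records[i:i + block_size] for i in range(0, len(records), block_size)]

def place_replicas(num_blocks, nodes, replication):
    """Assign each block to `replication` nodes, round-robin.

    The nodes are distinct as long as replication <= len(nodes).
    """
    return {
        b: [nodes[(b + r) % len(nodes)] for r in range(replication)]
        for b in range(num_blocks)
    }

def run_job(blocks, placement, failed_nodes):
    """Process each block on a live replica, then aggregate the results."""
    partial_results = []
    for b, block in enumerate(blocks):
        live = [n for n in placement[b] if n not in failed_nodes]
        if not live:
            raise RuntimeError(f"block {b} has no live replica")
        partial_results.append(sum(block))  # per-block work (map-like step)
    return sum(partial_results)  # aggregation (reduce-like step)

records = list(range(1, 11))
blocks = split_into_blocks(records, block_size=4)
placement = place_replicas(len(blocks), ["node-a", "node-b", "node-c"], replication=2)
total = run_job(blocks, placement, failed_nodes={"node-b"})
```

With one node marked as failed, every block still has a surviving copy, so the job completes. That is the point of replicating each part before processing.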

16
Q

Hadoop Technical Components?

A
HDFS
Name Node (primary facilitator)
Secondary Node
Job Tracker
Slave Nodes
NoSQL
17
Q

Big Data Vendors?

A

Software
Hardware
Service

18
Q

Stream Analytics Applications?

A
e-commerce
Telecommunication
Law Enforcement and Cyber Security
Power Industry
Financial Services
Health Services
Government
19
Q

OLTP?

A

Online Transaction Processing (DBMSs)

20
Q

OLAP?

A

Online Analytical Processing

Data warehousing

21
Q

RTAP?

A

Real-Time Analytics Processing

Big Data Architecture & technology