technology and tools Flashcards Preview

Flashcards in technology and tools Deck (50)
31

What does Spark do?

It is an analytics engine for large-scale data processing

32

What is different about Spark's data sharing?

Data is shared in memory rather than through disk

33

What is Greenplum?

Open-source massively parallel processing (MPP) data platform built on PostgreSQL

34

What is PostgreSQL?

RDBMS with object-oriented features (an object-relational database)

35

What is MADlib?

Open-source library for in-database analytics

36

In Greenplum, what is the INTERSECT operation?

Only the rows that appear in all answer sets

37

In Greenplum, what is the EXCEPT operation?

Rows from the first answer set minus any rows that also appear in the second

38

In Greenplum, what is the UNION ALL operation?

Rows from all answer sets, with duplicate rows kept

39

In Greenplum, what is the UNION operation?

Rows from all answer sets, with duplicate rows removed
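
The four set operations on cards 36 to 39 can be compared side by side. This is only a minimal sketch: it uses Python's built-in sqlite3 module as a stand-in for Greenplum, on the assumption that the operators behave the same way for this simple case, and the tables first_set and second_set are hypothetical.

```python
import sqlite3

# Two tiny hypothetical answer sets held in an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE first_set (val INTEGER);
    CREATE TABLE second_set (val INTEGER);
    INSERT INTO first_set VALUES (1), (2), (2), (3);
    INSERT INTO second_set VALUES (2), (3), (4);
""")


def run(operator):
    """Combine the two answer sets with the given set operator, sorted by value."""
    sql = (
        f"SELECT val FROM first_set {operator} "
        "SELECT val FROM second_set ORDER BY val"
    )
    return [row[0] for row in conn.execute(sql)]


intersect_rows = run("INTERSECT")  # rows present in both sets
except_rows = run("EXCEPT")        # rows in the first set but not the second
union_all_rows = run("UNION ALL")  # every row, duplicates kept
union_rows = run("UNION")          # every row, duplicates removed
```

With this data, UNION ALL keeps the value 2 three times, while UNION returns each value once.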

40

In Greenplum, what is the GROUP BY operation?

Groups results based on one or more specified columns

41

In Greenplum, what is the GROUP BY with UNION ALL operation?

Adds subtotals and grand totals by combining several GROUP BY queries
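
As a rough illustration of the GROUP BY with UNION ALL pattern, the sketch below stacks three grouped queries: detail rows, one subtotal per region, and a grand total. It runs on Python's built-in sqlite3 module rather than Greenplum, and the sales table and its columns are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, product TEXT, amount INTEGER);
    INSERT INTO sales VALUES
        ('east', 'apples', 10),
        ('east', 'pears', 20),
        ('west', 'apples', 5);
""")

# Each SELECT groups at a coarser level; NULL fills the columns that a
# summary row has aggregated away.
query = """
    SELECT region, product, SUM(amount) AS total
    FROM sales GROUP BY region, product
    UNION ALL
    SELECT region, NULL, SUM(amount)
    FROM sales GROUP BY region
    UNION ALL
    SELECT NULL, NULL, SUM(amount)
    FROM sales
"""
rows = conn.execute(query).fetchall()
```

The result holds three detail rows, two regional subtotals, and one grand-total row.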

42

In Greenplum, what is the ROLLUP operation?

A GROUP BY extension that replaces the UNION ALL approach, producing subtotals and a grand total in a single query

43

In Greenplum, what is the CUBE operation?

Creates subtotals for all possible combinations of the grouping columns
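
One way to see the difference between ROLLUP and CUBE is to list the grouping sets each one implies. The sketch below does this in plain Python without running any SQL. The helper names and the column names are hypothetical.

```python
from itertools import combinations


def rollup_grouping_sets(columns):
    """Grouping sets implied by ROLLUP: each leading prefix of the column
    list, ending with the empty set (the grand total)."""
    return [tuple(columns[:n]) for n in range(len(columns), -1, -1)]


def cube_grouping_sets(columns):
    """Grouping sets implied by CUBE: every subset of the columns,
    largest first, ending with the empty set (the grand total)."""
    return [
        combo
        for size in range(len(columns), -1, -1)
        for combo in combinations(columns, size)
    ]


rollup_sets = rollup_grouping_sets(["region", "product"])
cube_sets = cube_grouping_sets(["region", "product"])
```

For two columns, ROLLUP yields three grouping sets and CUBE yields four. The extra CUBE set is the subtotal by product alone.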

44

In Greenplum, what is the GROUPING function?

Distinguishes NULL values in the data from the NULLs that mark summary (subtotal and grand total) rows
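
The problem GROUPING solves can be shown with a small emulation. SQLite, used here through Python's built-in sqlite3 module, has no GROUPING function, so a hand-written 0/1 column named is_summary (a hypothetical name) plays its role. The sales table is hypothetical too. Treat this as a sketch of the idea, not of Greenplum's own syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (product TEXT, amount INTEGER);
    INSERT INTO sales VALUES ('apples', 10), (NULL, 7);
""")

# The literal 0/1 column stands in for GROUPING(product):
# 0 = product is a real grouped value (possibly a NULL from the data),
# 1 = product is NULL only because this is the grand-total row.
query = """
    SELECT product, SUM(amount) AS total, 0 AS is_summary
    FROM sales GROUP BY product
    UNION ALL
    SELECT NULL, SUM(amount), 1
    FROM sales
"""
rows = conn.execute(query).fetchall()
null_rows = [row for row in rows if row[0] is None]
```

Both rows in null_rows show NULL in the product column. Only the marker column tells the data NULL (total 7) apart from the grand total (total 17).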

45

In Greenplum, what is a window function?

Performs a calculation across a set of rows that are related to the current row

46

In Greenplum window functions, which clause specifies the data window?

OVER

47

In Greenplum window functions, how do you define window partitions?

PARTITION BY
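
Cards 45 to 47 fit together in a single query: a window function, its OVER clause, and PARTITION BY. The sketch below again uses Python's built-in sqlite3 module in place of Greenplum, which assumes the bundled SQLite is version 3.25 or newer (the first release with window functions). The sales table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, product TEXT, amount INTEGER);
    INSERT INTO sales VALUES
        ('east', 'apples', 10),
        ('east', 'pears', 20),
        ('west', 'apples', 5);
""")

# OVER turns SUM into a window function; PARTITION BY restricts each row's
# window to the rows that share its region.
query = """
    SELECT region, product, amount,
           SUM(amount) OVER (PARTITION BY region) AS region_total
    FROM sales
    ORDER BY region, product
"""
rows = conn.execute(query).fetchall()
```

Unlike GROUP BY, the query keeps every input row. Each row simply gains the total for its own region.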

48

What does MAD stand for in MADlib?

Magnetic
Agile
Deep

49

Which of these are MADlib in-database analytical functions?
a) regression
b) classification
c) validation
d) text analysis
e) descriptive analytics
f) clustering and topic modelling
g) association rule mining

a) regression
b) classification
c) validation
e) descriptive analytics
f) clustering and topic modelling
g) association rule mining

50

What does MADlib do?

Creates models inside the database, without moving data out of the DBMS