Book - Chapter 3: Basic Analytics in R (Flashcards)

Flashcards in Book - Chapter 3 basic analytics in r Deck (69)
1

What is the function to import data

read.csv()

2

What does the head function do

Displays the first rows (six by default) of the imported dataset so you can examine it

3

What does the summary function do

Provides descriptive statistics, such as the mean and median, for each data column

4

When referring to a column in a dataset what symbol should you use

The dollar sign ($), as in dataset$column
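
Cards 1 to 4 can be tried together in one short session. The following is a minimal sketch, assuming a small made-up dataset: the file contents, the data frame name `people`, and the column names are hypothetical and are written to a temporary file only so the example is self-contained.

```r
# Hypothetical example data written to a temporary CSV file.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("id,height_cm,weight_kg",
             "1,150,52",
             "2,160,58",
             "3,170,69",
             "4,180,77"), csv_path)

people <- read.csv(csv_path)           # import the CSV into a data frame
head(people)                           # show the first rows (up to six by default)
summary(people)                        # min, quartiles, median, mean, max per column
mean_height <- mean(people$height_cm)  # $ selects a single column by name
mean_height                            # 165
```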

5

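How would you fit and plot a linear regression

Fit the model with lm(), then draw the fitted line on a scatterplot with plot() and abline()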
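
As a rough illustration of this card, the sketch below fits a regression with `lm()` and overlays the fitted line on a scatterplot. The vectors `x` and `y` are made up for the example and lie exactly on the line y = 2x + 1, so the coefficients are easy to check.

```r
# Hypothetical data lying exactly on the line y = 2x + 1.
x <- c(1, 2, 3, 4, 5)
y <- 2 * x + 1

fit <- lm(y ~ x)      # fit the linear regression of y on x
coefs <- coef(fit)    # named vector: intercept, then slope

plot(x, y)            # scatterplot of the raw data
abline(fit)           # overlay the fitted regression line
```

With real data the points will not sit exactly on the line, and `summary(fit)` gives the usual diagnostics.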

6

What kind of interface does the R software use

A command-line interface

7

How do you set a working directory

setwd()
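
A small sketch of changing the working directory and restoring it afterwards. It uses `tempdir()` only because that directory is guaranteed to exist; in practice you would pass the path of your own project folder.

```r
old_dir <- getwd()     # remember the current working directory
setwd(tempdir())       # switch to a directory that is known to exist
new_dir <- getwd()     # confirm where we are now
setwd(old_dir)         # restore the original working directory
```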

8

What are the categorical/qualitative attribute types

Nominal and ordinal

9

What are the numeric/quantitative attribute types

Interval and ratio

10

What are nominal data types

Unordered labels, for example
ZIP codes
nationality
street names
gender
employee ID number
true or false

11

What is an ordinal data type

Ordered names for example
quality of diamonds
academic grades
magnitude of earthquakes

12

What is an interval data type

Numeric with no true zero, for example temperature in Celsius or Fahrenheit

13

What is a ratio data type

Numeric with a true zero, for example age or temperature in Kelvin

14

What is a vector

Set of values of the same data type.

15

What can you use to create vectors

The combine function, c()

16

What dimension is a vector

They are dimensionless: dim() returns NULL for a vector, and length() gives the number of elements
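
Cards 14 to 16 can be checked directly at the console. This is a minimal sketch with made-up values; the variable names are illustrative only.

```r
v <- c(1.5, 2.5, 4)        # c() combines values into a vector
length(v)                  # 3
is.null(dim(v))            # TRUE: a plain vector has no dimensions

# All elements share one type, so mixed input is coerced to a common type.
mixed <- c(1, "a", TRUE)
class(mixed)               # "character"
```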

17

What is a two-dimensional array

A matrix

18

What is an array

An n-dimensional set of values of a single (homogeneous) data type

19

What do nrow and ncol do

As arguments to matrix(), they define the number of rows and columns; as functions, nrow() and ncol() return the number of rows and columns
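
The matrix and array cards can be illustrated with a short sketch. The values are arbitrary; `m` and `a` are hypothetical names.

```r
# nrow and ncol as arguments define the shape of the matrix.
m <- matrix(1:6, nrow = 2, ncol = 3)
nrow(m)                    # 2
ncol(m)                    # 3
dim(m)                     # 2 3

# An array generalises this to n dimensions (here 2 x 3 x 4).
a <- array(1:24, dim = c(2, 3, 4))
dim(a)                     # 2 3 4
```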

20

What is a data frame

Like a spreadsheet: a list of columns (vectors) that all have the same length

21

Can data frames store different data types

Yes, each column can hold a different data type

22

What is a list

A list is a collection of objects, such as vectors, that can be of different types and lengths
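
To make the contrast between data frames and lists concrete, here is a minimal sketch. All names and values are invented for illustration, and `stringsAsFactors = FALSE` is set explicitly so the result is the same on older and newer R versions.

```r
# A data frame: columns of different types, all the same length.
df <- data.frame(name   = c("Ann", "Bob", "Cy"),
                 age    = c(31, 45, 27),
                 member = c(TRUE, FALSE, TRUE),
                 stringsAsFactors = FALSE)
col_classes <- sapply(df, class)   # character, numeric, logical
is.list(df)                        # TRUE: a data frame is a list of columns

# A list: components may differ in both type and length.
lst <- list(ids = 1:5, label = "batch A", flags = c(TRUE, FALSE))
lengths(lst)                       # ids 5, label 1, flags 2
```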

23

What is a factor

A categorical variable.
It has a fixed set of values (levels) and uses integer codes to represent the different values
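
The integer coding behind a factor can be seen in a short sketch. The grade and quality labels below are made up; the ordered factor at the end is one way to represent an ordinal attribute.

```r
grades <- factor(c("B", "A", "C", "A"), levels = c("A", "B", "C"))
levels(grades)             # "A" "B" "C": the fixed set of values
as.integer(grades)         # 2 1 3 1: the integer codes behind the labels

# An ordered factor also records the ranking of its levels.
quality <- factor(c("good", "fair", "ideal"),
                  levels = c("fair", "good", "ideal"),
                  ordered = TRUE)
quality[1] < quality[3]    # TRUE: "good" ranks below "ideal"
```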

24

What does variance mean

The average of the squared distances of the values from the mean (R's var() divides by n - 1)

25

What does standard deviation mean

The square root of variance

26

What does range mean

Minimum to maximum

27

What is the interquartile range

The spread between the 25th and 75th percentiles of the sorted data
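
The four measures of spread on cards 24 to 27 can be computed on a small made-up vector. This sketch also recomputes the variance by hand to show that `var()` uses the sample formula with n - 1 in the denominator.

```r
x <- c(2, 4, 4, 4, 5, 5, 7, 9)   # hypothetical sample, mean is 5

# Sample variance by hand: sum of squared distances from the mean / (n - 1).
manual_var <- sum((x - mean(x))^2) / (length(x) - 1)

var(x)      # 32 / 7, about 4.571, same as manual_var
sd(x)       # square root of the variance, about 2.138
range(x)    # 2 9: minimum and maximum
IQR(x)      # 1.5: 75th percentile (5.5) minus 25th percentile (4)
```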

28

Why do we visualise

To get a sense of the data

29

What should we visualise

Mean versus median. Standard deviation. Quantiles. Correlations between variables

30

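What does Anscombe's quartet do

Illustrates the importance of visualising data. It uses four datasets with nearly identical summary statistics. Each dataset is plotted as a scatterplot and fitted with the line that results from applying linear regression; the fitted lines are almost the same even though the four plots look very different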
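
R ships the quartet as the built-in `anscombe` data frame (columns `x1` to `x4` and `y1` to `y4`), so the card can be reproduced directly. The following is a sketch of one way to do it; the helper names `fits`, `coefs`, and `y_means` are illustrative.

```r
# Fit y_i ~ x_i for each of the four datasets in the built-in anscombe data.
fits <- lapply(1:4, function(i) {
  lm(anscombe[[paste0("y", i)]] ~ anscombe[[paste0("x", i)]])
})
coefs <- t(sapply(fits, coef))   # one row per dataset: intercept, slope
y_means <- colMeans(anscombe[, paste0("y", 1:4)])

# Plot each dataset with its regression line in a 2 x 2 grid.
old_par <- par(mfrow = c(2, 2))
for (i in 1:4) {
  plot(anscombe[[paste0("x", i)]], anscombe[[paste0("y", i)]],
       xlab = paste0("x", i), ylab = paste0("y", i))
  abline(coef(fits[[i]]))        # intercept and slope of the fitted line
}
par(old_par)                     # restore the previous plot layout
```

The intercepts all come out close to 3, the slopes close to 0.5, and the y means close to 7.5, yet the four scatterplots have clearly different shapes.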