Flashcards in Initial analysis of the data Deck (48)

1

## What are all R commands?

### Functions

2

## how do you get data into R?

###
read.table()

read.csv()

3

## how do you get data out of R?

### write.csv()
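
As a rough illustration of the two cards above, the sketch below writes a small data frame to a temporary CSV file and reads it back. The data frame `scores` and its columns are made up for the example; only `write.csv()`, `read.csv()` and `read.table()` come from the cards.

```r
# Hypothetical example data, not from the deck.
scores <- data.frame(name = c("Ann", "Bob", "Cy"),
                     score = c(71.5, 64, 88))

# Write to a temporary file so nothing in the working directory is touched.
path <- tempfile(fileext = ".csv")
write.csv(scores, path, row.names = FALSE)  # row.names = FALSE drops the index column

# read.csv() is read.table() with defaults suited to comma-separated files.
from_csv   <- read.csv(path)
from_table <- read.table(path, header = TRUE, sep = ",")

print(from_csv)
```

Note that R is case-sensitive, so `Read.table()` would fail with a "could not find function" error.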

4

## what is nominal data?

### names of things

5

## what is ordinal data?

### ordered names

6

## what is interval data?

### numeric with no true zero (Celsius)

7

## what is ratio data?

### numeric with true zero (kelvin)

8

## which 2 data classifications are categorical or discrete?

### nominal and ordinal

9

## which 2 data classifications are continuous variables?

### interval and ratio

10

## what is a number?

### a numeric value that can have decimals

11

## what is an integer?

### whole number

12

## what is a character?

### text (a string) written in quotes; not a number

13

## what is a vector?

### ordered set of values of the same data type (created with the combine function c())

14

## what is a list?

### collection of different vectors or other data structures

15

## what is a factor?

###
categorical variable

fixed set of values
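
A minimal sketch of how factors can represent both nominal and ordinal data. The variables `colour` and `size` are invented for illustration.

```r
# Nominal: categories with no inherent order.
colour <- factor(c("red", "blue", "red", "green"))
levels(colour)            # levels are sorted alphabetically by default

# Ordinal: categories with an explicit order, set through `levels`.
size <- factor(c("small", "large", "medium", "small"),
               levels = c("small", "medium", "large"),
               ordered = TRUE)

size < "large"            # comparisons only make sense for ordered factors
table(size)               # counts per level, in level order
```

Values outside the fixed set of levels become `NA`, which is one way typos in categorical data show up.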

16

## what are arrays?

### n-dimensional homogeneous data types

17

## what are matrices?

### 2D arrays; all elements share one type (usually numeric)

18

## what is a data frame?

### a list whose component vectors all have the same length
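
The basic types and structures from the cards above can be sketched side by side as follows. All object names here are illustrative only.

```r
# Basic types
class(1.5)      # "numeric"   (can have decimals)
class(2L)       # "integer"   (the L suffix marks a whole number)
class("two")    # "character"

# Vector: every element has the same type; c() coerces if types are mixed.
v     <- c(1.5, 2, 3.25)
mixed <- c(1, "a")          # becomes a character vector

# List: components may differ in type and length.
l <- list(id = 1L, tags = c("x", "y"), values = v)

# Matrix (2D) and array (n-dimensional): homogeneous, with a dim attribute.
m <- matrix(1:6, nrow = 2)
a <- array(1:24, dim = c(2, 3, 4))

# Data frame: a list of equal-length column vectors.
df <- data.frame(x = 1:3, y = c("a", "b", "c"))
is.list(df)     # TRUE
```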

19

## what is the R code for viewing the data?

###
head()

tail()

20

## what is the R code for viewing a summary of the data?

### summary()
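
A short sketch of a first look at a data set, using R's built-in `mtcars` data frame as a stand-in (the deck does not name a data set).

```r
first_rows <- head(mtcars, 3)   # first 3 rows (the default is 6)
last_rows  <- tail(mtcars, 3)   # last 3 rows
print(first_rows)
print(last_rows)

# summary() gives min, quartiles, median, mean and max for numeric columns.
mpg_summary <- summary(mtcars$mpg)
print(mpg_summary)
```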

21

## what is the R code for computing basic statistics?

###
sd()

var()

range()

IQR()
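
A worked example of these four functions on a small made-up vector, chosen so the results can be checked by hand.

```r
x <- c(2, 4, 4, 5, 7, 9, 11)   # illustrative values, mean is 6

var(x)     # sample variance: sum of squared deviations (60) / (n - 1) = 10
sd(x)      # standard deviation: sqrt(var(x))
range(x)   # smallest and largest value: 2 11
IQR(x)     # interquartile range: Q3 - Q1 = 8 - 4 = 4
```

`var()` and `sd()` use the n - 1 (sample) denominator, not n.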

22

## What is the R code for the correlation?

### cor()
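
A minimal sketch of `cor()` on two made-up vectors, plus a correlation matrix for a few columns of the built-in `mtcars` data (used here only as an example).

```r
x <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1)   # roughly 2 * x, illustrative only

r_pearson  <- cor(x, y)                        # Pearson correlation is the default
r_spearman <- cor(x, y, method = "spearman")   # rank-based alternative

# Passing a data frame of numeric columns returns a correlation matrix.
cor_matrix <- cor(mtcars[, c("mpg", "wt", "hp")])
print(round(cor_matrix, 2))
```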

23

## what does visualisation give you?

### more holistic picture of the data

24

## what are summary statistics?

###
mean vs median

standard dev

quartiles

correlations

25

## what is Anscombe's Quartet?

### 4 data sets with nearly identical summary statistics (mean, variance, correlation) that look very different when plotted
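
Anscombe's Quartet ships with R as the built-in data frame `anscombe` (columns `x1` to `x4` and `y1` to `y4`). The sketch below prints the summary statistics for each set and then plots all four; the loop and layout are just one way to do it.

```r
cors <- numeric(4)
for (i in 1:4) {
  x <- anscombe[[paste0("x", i)]]
  y <- anscombe[[paste0("y", i)]]
  cors[i] <- cor(x, y)
  cat(sprintf("set %d: mean(y) = %.2f, var(y) = %.2f, cor = %.3f\n",
              i, mean(y), var(y), cors[i]))
}

# The statistics are nearly the same, but the plots are not.
op <- par(mfrow = c(2, 2))
for (i in 1:4) {
  plot(anscombe[[paste0("x", i)]], anscombe[[paste0("y", i)]],
       xlab = paste0("x", i), ylab = paste0("y", i))
}
par(op)   # restore the previous plotting layout
```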

26

## what does hist() do in R?

### Plot a histogram
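
A small sketch using the `mpg` column of the built-in `mtcars` data as example input. `hist()` draws the plot and also returns the bin breaks and counts invisibly.

```r
# breaks = 10 is only a suggestion; R picks "pretty" bin edges near that number.
h <- hist(mtcars$mpg, breaks = 10,
          main = "Miles per gallon", xlab = "mpg")

h$breaks   # bin edges actually used
h$counts   # number of observations in each bin
```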

27

## what do missing values suggest?

### dirty data
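
R marks missing values as `NA`. A minimal sketch of how one might find and count them, using made-up data:

```r
x <- c(4.2, NA, 5.1, 3.8, NA)   # illustrative values

is.na(x)                 # TRUE where a value is missing
n_missing <- sum(is.na(x))
mean(x)                  # NA: most statistics propagate missing values
mean(x, na.rm = TRUE)    # drop the NAs before computing

df <- data.frame(a = c(1, NA, 3), b = c("x", "y", NA))
colSums(is.na(df))       # missing values per column
complete.cases(df)       # TRUE for rows with no missing values
```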

28

## what is the best first visualisation of 2 variables?

### scatter plots
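
A sketch of a first scatter plot, with car weight against fuel economy from the built-in `mtcars` data as an example pair of variables.

```r
# Correlation is computed only to annotate the plot title.
r <- cor(mtcars$wt, mtcars$mpg)

plot(mtcars$wt, mtcars$mpg,
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon",
     main = sprintf("mpg vs weight (r = %.2f)", r))

# For more than two variables, pairs() draws every pairwise scatter plot.
pairs(mtcars[, c("mpg", "wt", "hp")])
```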

29

## what is a box and whisker plot?

### a plot whose box spans the middle 50% of the data (the interquartile range), with a line at the median and whiskers reaching towards the extremes
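
A sketch of how the pieces of a box and whisker plot can be inspected in R. The vector `x` is made up and includes one deliberately extreme value.

```r
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 50)   # illustrative; 50 is an outlier

s <- boxplot.stats(x)
s$stats   # lower whisker, lower hinge, median, upper hinge, upper whisker
s$out     # points beyond 1.5 times the box length from the box

boxplot(x, horizontal = TRUE, main = "Box and whisker plot")

# Grouped version: one box per group, here mpg by cylinder count in mtcars.
boxplot(mpg ~ cyl, data = mtcars, xlab = "Cylinders", ylab = "Miles per gallon")
```

The box runs between the two hinges (approximately the first and third quartiles), so it contains about the middle 50% of the data.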

30