#### Introduction to R

This course is offered within the context of the IMPRS BeSmart Summerschool.

- Asynchronous teaching
  - Videos can be found here
  - Exercises: See below. Participants submit answers each day before exercises start.
- Synchronous teaching
  - Daily exercises (16.8.-20.8.), 11:00-12:00.

During synchronous teaching we will use RStudio and the software mentioned below.

- Basics (Exercise on Mon., 16.8., 11:00)
  - Installing R, RStudio, Packages.
  - Data Types, Numbers, Vectors, Matrices, Arrays.

- More on Data Types (Exercise on Tue., 17.8., 11:00)
  - Missings, Characters, Factors.
  - Lists, Data frames.
  - Randomness.

- Data and Functions (Exercise on Wed., 18.8., 11:00)
  - Example datasets.
  - Functions.
  - Closures.

- Graphs and Files (Exercise on Thu., 19.8., 11:00)
  - Introduction to Graphs.
  - Graphs for Univariate and Bivariate Data.
  - Files, Reading and Writing Data.

- Control Structures, Structuring Data (Exercise on Fri., 20.8., 11:00)
  - Pipes.
  - Conditions, Loops, Repetition.
  - Structuring Data, Grouping, Summarising, Mutating.
  - Selecting Variables, Sorting, Joining, Reshaping Data.
  - Tables, Regression.

- Documentation for R is provided via the built-in help system and also through the R Homepage. Useful references are:
  - The R Guide, Jason Owen (Easy to read, explains R with the help of examples from basic statistics.)
  - Simple R, John Verzani (Explains R with the help of examples from basic statistics.)
  - Einführung in R, Günther Sawitzki (In German. Rather compact introduction.)
  - Econometrics in R, Grant V. Farnsworth (The introduction to R is rather compact and pragmatic.)
  - An Introduction to R, W. N. Venables and D. M. Smith (The focus is more on R as a programming language.)
  - The R language definition (Concentrates only on R as a programming language.)

- We will use the following packages: `car, Ecdat, foreign, Hmisc, tidyverse, lattice`. If, e.g., the command `library(Ecdat)` generates an error message (`Error in library(Ecdat): There is no package called 'Ecdat'`), you have to install the package.
  - Installing packages with Microsoft Windows:
    - With RStudio: Use the tab “Install”. Otherwise: Start `Rgui.exe` and install packages from the menu `Packages / Install Packages`.
  - Installing packages on modern operating systems:
    - From within R use the command `install.packages("Ecdat")`, e.g., to install the package `Ecdat`.
- In the lecture we will use RStudio as a front end.
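
The package check described above can also be done for all course packages at once. The following is only a minimal sketch: the helper name `missing_packages` is hypothetical (it is not part of R or of the course material), and guarding the installation with `interactive()` is an assumption made here so that nothing is downloaded when the code runs in a non-interactive script.

```r
# Packages used in this course (taken from the list above).
course_pkgs <- c("car", "Ecdat", "foreign", "Hmisc", "tidyverse", "lattice")

# Hypothetical helper: returns those names in `pkgs` that are not
# among the packages currently installed on this machine.
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

to_install <- missing_packages(course_pkgs)

# Install only what is missing, and only in an interactive session
# (e.g. the RStudio console). This needs an internet connection.
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}
```

After this has run in RStudio, `library(Ecdat)` should load the package without the error message shown above, provided the installation succeeded.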

#### Exercises

Please send your answers to the following questions as an email to `oliver@kirchkamp.de`. Don’t attach any files to your email.

#### Exercise 1. Submit before Mon., 16.8., 10:30.

Install R and RStudio. Also install the package `Ecdat` from within RStudio. The command `help(package="Ecdat")` gives you a list of the datasets that are provided by the package `Ecdat`.

Can you find a dataset whose name starts with the same letter as your last name and which contains at least one variable that is a number? If there is no matching dataset, find one with the next letter in the alphabet. After the letter `Z`, continue with the letter `A`.

In your answer, include your name and the name of the dataset.

How many rows and how many columns does the dataset have?

Choose one variable in the dataset which is a number. With which `R` command can you calculate the mean of this variable?

#### Exercise 2. Submit before Tue., 17.8., 10:30.

Following the same strategy as in the previous exercise: Find a dataset whose name starts with the same letter as your last name and that contains either a character variable or a variable that is a factor. If there is no matching dataset, proceed alphabetically until you have found one that contains either a character variable or a variable that is a factor. Once you have reached the letter `Z`, continue with the letter `A`.

In your answer include your name and the name of the dataset.

How many variables in the dataset are characters? How many are factors?

How can you find out whether any variable contains any missing values?

#### Exercise 3. Submit before Wed., 18.8., 10:30.

Now find a dataset that matches your *first* name and that includes at least one variable that is a number.

In your answer, include your name, the name of the dataset, and the name of the variable.

Find the median of this variable.

Read the help page for the function `quantile`. How can you use the `quantile` function to find the median of the above variable?

Write a function `Quantile` that behaves similarly to `quantile`, except that it has different defaults. The function `Quantile` should, if the parameter `probs` is not specified, only return the minimum, the maximum and the median.

#### Exercise 4. Submit before Thu., 19.8., 10:30.

Again, find a dataset that matches your *first* name and that includes at least two variables that are numbers.

In your answer, include your name, the name of the dataset, and the names of the two variables.

With which command can you produce a graph that shows the joint distribution of these two variables?

With which command can you produce a graph that shows only the distribution of the first variable?

In your answer, only include the commands, not the graph!

#### Exercise 5. Submit before Fri., 20.8., 10:30.

Find a dataset that matches your *last* name and that includes at least one variable that is a number, and a second variable that has fewer than 12 different values. I will call these (fewer than 12) values “cases”.

In your answer, include your name, the name of the dataset, and the names of the two variables.

With which command can you provide a table that, for each case, shows the mean, the median, and the difference between the mean and the median?