MW24.3  Lecture Quantitative Economics II (Summer 2018)  Master Program in Economics
There are many interesting topics in quantitative economics. In the past we covered auction theory, bargaining theory, information economics, and Bayesian econometrics. Schedule for the course in summer 2018:
In summer 2018 master students can combine two courses from the IMPRS Summerschool for the equivalent of one complete master course. Here is a list of the courses offered during the Summerschool. Here is the program of the Summerschool (there are still some gaps, so please be patient). If you are enrolled in the master program at the Friedrich Schiller Universität and want to take some courses of the Summerschool for credit towards your master, please send me a brief proposal stating which courses you want to take and why. The Summerschool is aimed primarily at PhD students. Master students also have to register for this course with the examination office. So you have to register with me and with the examination office!
 Audience
 graduate students
 advanced master students
 Requirements:
 MW 24.1  Empirical Methods or similar.
Workflow
 One is to find the right statistical method for the problem.
 The other is to organise the data and the evaluation in a way such that results can be replicated.
 We interrupt our work for a few days or weeks and want to go back to it quickly.
 We share our work with a collaborator and want her or him to quickly understand what we did and to participate in an efficient way.
 After we have submitted our paper to a journal, referees might demand small changes in the analysis.
The sad truth is that researchers often find it very hard to replicate the results of their own statistical analysis. During the analysis we make a lot of small decisions, many of which seem obvious when we make them. But when we try to replicate our work, it turns out that it is not clear which subset of the data we really included, how special cases were coded, how outliers were identified and treated, how bootstraps were run, what the precise meaning of each variable was, and which tests were used with which parameters. Too often, even after spending days and weeks trying out a few dozen combinations of these parameters, we cannot replicate what we did a few months ago. If we are lucky, we come close to the results we proudly published in the past, but we do not get the same results. This can be a more than embarrassing experience.
The aim of the course is to develop a strategy that helps to avoid this problem. In the course we will discuss strategies that we can use to organise our data and our analysis in a way that allows us, even years later, to redo our analysis quickly, reliably, and with exactly the same results.
An efficient workflow helps us to get back to statistical work quickly after an interruption and also helps to share an analysis with coauthors.
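One of the small decisions mentioned above, how bootstraps were run, can be made replicable by recording the random seed together with the code. The following is only a minimal sketch under stated assumptions: the helper `boot_mean`, the seed value and the data vector are hypothetical illustrations and not part of the course material.

```r
# Hypothetical helper: bootstrap the mean of x with a recorded seed.
boot_mean <- function(x, n_rep = 1000, seed = 20180811) {
  set.seed(seed)  # fixing the seed makes the resampling repeatable
  replicate(n_rep, mean(sample(x, replace = TRUE)))
}

x <- c(2.1, 3.4, 1.9, 5.6, 4.2, 3.3)  # made-up illustrative data

run1 <- boot_mean(x)
run2 <- boot_mean(x)

identical(run1, run2)             # TRUE: same seed, same draws
quantile(run1, c(0.025, 0.975))   # percentile bootstrap interval
```

Keeping the seed as an explicit argument documents the choice in the code itself. Storing the output of `sessionInfo()` next to the results additionally records which versions of R and of the packages were used.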
 Introduction
 Aims of statistical data analysis
 Organising your work
 How to separate creativity from chaos?
 Organising ideas in files
 Organising ideas in functions
 Preparing data
 Reading data
 Cleaning data
 Organising data
 Working with data
 Documentation
 Descriptive statistics
 Specific results
 Presenting results
 Weaving and tangling
 Version control
 There is an interesting book on workflow of data analysis, however it is based on Stata: J. Scott Long, The Workflow of Data Analysis Using Stata, Stata Press, 2009.
 Hadley Wickham provides a view on “Tidy Data”: Hadley Wickham, Tidy Data, Journal of Statistical Software, 2014.
 Documentation for R is provided via the built-in help system but also through the R Homepage.
Useful are
 The R Guide, Jason Owen (Easy to read, explains R with the help of examples from basic statistics)
 Simple R, John Verzani (Explains R with the help of examples from basic statistics)
 Einführung in R, Günther Sawitzki (In German. Rather compact introduction.)
 Econometrics in R, Grant V. Farnsworth (The introduction to R is rather compact and pragmatic.)
 An Introduction to R, W. N. Venables and D. M. Smith (The focus is more on R as a programming language)
 The R language definition (Concentrates only on R as a programming language.)
 We will use the following packages: car, Ecdat, foreign, Hmisc, knitr, lattice, memisc, tikzDevice, tools, xtable. If, e.g., the command library(Ecdat) generates an error message (Error in library(Ecdat): There is no package called 'Ecdat'), you have to install the package.
 Installing packages with Microsoft Windows:
 Start Rgui.exe and install packages from the menu Packages / Install Packages.
 Installing packages from advanced operating systems:
 From within R use the command install.packages("Ecdat"), e.g., to install the package Ecdat.
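Instead of installing the packages one by one, the check can be done for the whole list at once. The following is a minimal sketch: the helper `missing_packages` is a hypothetical name introduced here for illustration, and the call to `install.packages()` assumes that a CRAN mirror is reachable.

```r
# Packages named in the course description.
course_pkgs <- c("car", "Ecdat", "foreign", "Hmisc", "knitr", "lattice",
                 "memisc", "tikzDevice", "tools", "xtable")

# Hypothetical helper: return those packages that are not yet installed.
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

todo <- missing_packages(course_pkgs)

# Install only in an interactive session, so that sourcing the script
# non-interactively never triggers a download.
if (interactive() && length(todo) > 0) {
  install.packages(todo)
}
```

Afterwards `library(Ecdat)` should load without the error message quoted above.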
 In the lecture we will use RStudio as a front end.
 LaTeX
 For weaving and knitting we need LaTeX (e.g. TeX Live or MiKTeX).
 RStudio
 RStudio provides a front end to R, LaTeX, git and svn.
 git
 In the course we will use git as an example of a version control system. git might already be installed on your computer. You should also have a mergetool, e.g. meld. (Any of kdiff3, araxis, bc3, codecompare, diffuse, ecmerge, emerge, gvimdiff, opendiff, p4merge, tkdiff, tortoisemerge, vimdiff, xxdiff... would work as well.)
 Stata
 Stata, unfortunately, does not have an equivalent to Sweave. Still, there are some tools:
Graphs and visualising data
 Motivation
 Graphs and illustrations can contribute a lot to the success of a scientific paper. In this course we will discuss different ways to use graphs in our research.
 Handout
 Here is a preliminary version of the handout
 Exam
 You can download the exam on Saturday, 11 August 2018, at 9:30. Please solve all questions independently and return your answers as a single PDF file by 10:15.
 Example data
 graex1.csv graex2.csv
 Topics
 Introduction
 Properties of good graphs
 Presenting nominal data
 Presenting continuous data
 Using R
 Lattice
 Literature

 William S. Cleveland, “The Elements of Graphing Data”, AT&T Bell Laboratories, New Jersey, 1994.
 Deepayan Sarkar, “Lattice: Multivariate Data Visualization with R”, Springer, New York, 2008.
 Edward Tufte, “The Visual Display of Quantitative Information”, Bertrams, 2001.
 Software
 During the second half of part 1 it will help if you can bring your own portable computer to the classes.
You should have an up-to-date version of R installed. We will also need the following packages: aplpack, car, Ecdat, Hmisc, ks, lattice, latticeExtra, MASS, mgcv, plotrix, pwt, reshape, vcd.
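To check that R and lattice work on your computer, a small conditioned scatter plot can serve as a test. This is only a sketch: it uses the `mtcars` data that ships with R as a stand-in, not the course's example data, and the object names `d` and `p` are arbitrary.

```r
library(lattice)

# mtcars ships with R; it stands in for the course data here.
d <- transform(mtcars,
               cyl = factor(cyl, labels = c("4 cyl", "6 cyl", "8 cyl")))

# One panel per number of cylinders, all panels sharing the same axes.
p <- xyplot(mpg ~ wt | cyl, data = d, layout = c(3, 1),
            xlab = "Weight (1000 lbs)", ylab = "Miles per gallon")

print(p)  # lattice objects are drawn when they are printed
```

Because the panels share their axes, the three groups can be compared directly.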