Statistics for Data Analytics

14314.0400 – BM Data Analytics I – University of Cologne
Author: Sven Otto
Published: February 6, 2024

Welcome to the course!

Statistics for Data Analytics is an introductory graduate-level course in econometrics and statistical inference. We cover basic concepts of mathematical statistics, including estimation and inferential methods in linear models. The goal is to provide the theoretical foundation for data analysis and applied empirical work. Practical applications using the R programming language are also integrated into the course.

Course Materials

Literature

The course is based on the following textbooks:

  • Stock, J.H. and Watson, M.W. (2019). Introduction to Econometrics (Fourth Edition, Global Edition). Pearson.

  • Hansen, B.E. (2022a). Probability and Statistics for Economists. Princeton.

  • Hansen, B.E. (2022b). Econometrics. Princeton.

  • Davidson, R., and MacKinnon, J.G. (2004). Econometric Theory and Methods. Oxford University Press.

Stock and Watson (2019) is available here. To view the book, please activate your Uni Köln VPN connection. For more information on Hansen (2022a, 2022b), please see the ILIAS course. Davidson and MacKinnon (2004) is available for free on the author’s webpage: LINK. Printed versions of the books are available from the university library.

Preparation

You should be familiar with the basic concepts of matrix algebra. Please consider this refresher:

Crash Course in Matrix Algebra
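
As a quick self-check before working through the refresher, the following minimal sketch shows a few of the basic matrix operations in base R. The matrix `A` and the vector `b` are arbitrary values chosen for illustration and are not taken from the course materials.

```r
# A small symmetric 2x2 matrix, chosen arbitrarily for illustration.
A <- matrix(c(2, 1,
              1, 3), nrow = 2, byrow = TRUE)
b <- c(1, 2)

A_t   <- t(A)        # transpose
A_inv <- solve(A)    # inverse (A must be square and nonsingular)
Ab    <- A %*% b     # matrix-vector product, returned as a 2x1 matrix

# Multiplying a matrix by its inverse recovers the identity matrix
# (up to floating-point rounding).
I2 <- A %*% A_inv
print(round(I2, 10))
```

If these operations look unfamiliar, the refresher is probably worth reading before the first lecture.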

We will be using the statistical programming language R. Please make sure you have R and RStudio installed before the first class. Installation instructions for the software are available here. If you are a beginner, please consider this short introduction, which contains many valuable resources:

Getting Started with R

Assessment

The course is assessed by a 90-minute written exam. There will be two optional bonus assignments during the lecture period. Points earned on these assignments are added to your overall exam score; the maximum exam score can be achieved without them. More information about the assessment can be found on ILIAS.

Communication

Feel free to use the ILIAS statistics forum to discuss lecture topics and ask questions. Please also let me know if you find any typos. Of course, you can also reach me via e-mail: sven.otto@uni-koeln.de

Important Dates

Bonus assignment 1             Nov 04, 2023 - Nov 17, 2023
Bonus assignment 2             Nov 18, 2023 - Dec 01, 2023
Registration deadline exam 1   Nov 25, 2023
Exam 1                         Dec 09, 2023
Registration deadline exam 2   Mar 14, 2024
Exam 2 (alternate date)        Mar 28, 2024

Please register for the exam on time. If you miss the registration deadline, you will not be able to take the exam (the Examinations Office is very strict about this). You only need to take one of the two exams to complete the course. The second exam serves as a make-up exam for those who fail or do not take the first exam.

Timetable

The course is held on Thursdays from 10:00 to 13:30 and on Fridays from 10:00 to 11:30 in Seminar Room BI on the fourth floor of building 107b (Universitäts- und Stadtbibliothek).

Day           Time          Lecture/Exercise
Thu, Oct 12   10:00-11:30   Lecture
              12:00-13:30   Lecture
Fri, Oct 13   10:00-11:30   Lecture
Thu, Oct 19   10:00-11:30   Exercises
              12:00-13:30   Lecture
Fri, Oct 20   10:00-11:30   Lecture
Thu, Oct 26   10:00-11:30   Exercises
              12:00-13:30   Lecture
Fri, Oct 27   10:00-11:30   Lecture
Thu, Nov 02   10:00-11:30   Exercises
              12:00-13:30   Lecture
Fri, Nov 03   10:00-11:30   Lecture
Thu, Nov 09   10:00-11:30   Exercises
              12:00-13:30   Lecture
Fri, Nov 10   10:00-11:30   Lecture
Thu, Nov 16   10:00-11:30   Exercises
              12:00-13:30   Lecture
Fri, Nov 17   10:00-11:30   Lecture
Thu, Nov 23   10:00-11:30   Exercises
              12:00-13:30   Lecture
Fri, Nov 24   10:00-11:30   Lecture
Thu, Nov 30   10:00-13:30   Lecture/Q&A

R-Packages

To run the R code of the lecture script, you will need to install some additional packages (only a few, since we will mostly be using base R).

install.packages(c("sandwich", "lmtest", "tidyverse", "moments"))

We will use sandwich, lmtest, and moments for inferential methods that are not available in base R. The tidyverse will be useful for data management and visualization. To install the R package that contains the datasets for the lecture, please follow the instructions in the ILIAS course.
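
To check that the installation worked, a minimal sketch along the following lines may help. It illustrates one typical way these packages are combined: a coefficient test with heteroskedasticity-robust standard errors, plus sample skewness and kurtosis of the residuals. The built-in `mtcars` dataset, the regression of `mpg` on `wt`, and the `HC1` covariance type are assumptions made purely for illustration; they are not taken from the lecture script.

```r
# Assumes sandwich, lmtest, and moments are installed
# (see the install.packages() call above).
library(sandwich)
library(lmtest)
library(moments)

# mtcars ships with base R; it is used here only for illustration.
fit <- lm(mpg ~ wt, data = mtcars)

# Coefficient table based on a heteroskedasticity-robust (HC1)
# covariance matrix estimate instead of the default one from lm().
robust <- coeftest(fit, vcov. = vcovHC(fit, type = "HC1"))
print(robust)

# Sample skewness and kurtosis of the residuals (moments package).
res_skew <- skewness(residuals(fit))
res_kurt <- kurtosis(residuals(fit))
print(c(skewness = res_skew, kurtosis = res_kurt))
```

If all three `library()` calls succeed and a coefficient table is printed, the packages are set up correctly.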

Some further datasets are contained in my package teachingdata, which is available in a GitHub repository:

install.packages("remotes")
remotes::install_github("ottosven/teachingdata")
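
Once the installation has finished, the datasets bundled with the package can be listed with base R's `data()` function. The sketch below makes no assumption about which datasets the package contains; it only prints whatever names are found, and it falls back to a message if the package is not installed yet.

```r
# Names of the datasets shipped with teachingdata (empty if not installed).
available <- character(0)

if (requireNamespace("teachingdata", quietly = TRUE)) {
  # data(package = ...) returns an object whose $results matrix has
  # one row per dataset; the "Item" column holds the dataset names.
  available <- data(package = "teachingdata")$results[, "Item"]
  print(available)
} else {
  message("teachingdata is not installed yet; run the two lines above first.")
}
```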