Course Content
Short summary
This course introduces students to the fundamental practices of Data Science in the context of economic
research. The course covers basic theoretical concepts and practical skills in gathering, preparing/cleaning,
visualizing, storing, and analyzing digital data for research purposes.
Description
The increasing abundance of digital data covering everyday human activities offers opportunities and poses
challenges for empirical research in economics and in the social sciences more broadly. Data used
in economics increasingly come from novel digital sources (e.g., social media, web applications, or
sensors), in diverse formats (e.g., JSON, unstructured text), and in large quantities. To engage effectively
and efficiently with these developments, economists need a basic understanding of data technologies
and practical skills in working with digital data.
This course covers basic theoretical concepts and practical skills in (automatically) gathering, preparing,
visualizing, and storing digital data for research purposes. It thus covers the crucial first steps underlying
empirical research projects. These steps are often neglected in traditional social science methodology
but are of great relevance in the age of Big Data. This course aims to fill this gap and to exploit
synergies with other methodology courses such as Statistics and Empirical Economic Research. Hands-on
exercises and case studies from current real-world research projects are meant to reinforce the concepts taught
and train students in the basics of programming with data.
The course combines theoretical concepts in handling digital data with practical hands-on exercises
focusing on different data structures and data formats (CSV, HTML, JSON). All exercises are based on freely
available open-source tools (R, RStudio, Atom). Students are expected to install these tools and work with
them on their own machines. The first part of the course covers the relevance and challenges
of Big Data for research in economics and related fields, introducing students to basic data formats and
how their use in everyday life has evolved in recent years (with a particular focus on the spread of the
Internet and online data). Building on this, the second part of the course introduces concepts and practices
for gathering and preparing digital data from various sources. In this part, students acquire basic programming
skills in R in order to apply these practices to real-world data sets. The last part of the course focuses on
analysis and visualization as well as storage and documentation of (relatively) large data sets and discusses
the implications of the course contents for econometric research and applied data science.
The structure of the course offers the opportunity to invite guest speakers (in the second and third parts of
the course) who can give insights into social science research with Big Data and/or applied Data Science in
industry.
Course Goals
The main goal of the course is to enable students to handle digital data for analysis and research purposes in
economics (with a particular focus on unusual and large data sets from various sources). Students become familiar
with best practices for gathering, cleaning, and storing digital data for research purposes. They learn to plan
and manage the first steps of an empirical research project based on digital data, preceding the actual
econometric analyses. Finally, students acquire basic programming skills in R in the context of real-world
data sets.
Course Objectives
• Students will know the basic concepts of data technologies/data structures.
• Students will understand the basics of computer code and data storage.
• Students will know how to apply the relevant R packages and programming practices to effectively and
efficiently parse, filter, clean, and store digital data from various sources.