data

Course Content

Short summary

This course introduces students to the fundamental practices of Data Science in the context of economic research. It covers basic theoretical concepts and practical skills in gathering, preparing/cleaning, visualizing, storing, and analyzing digital data for research purposes.

Description

The increasing abundance of digital data covering everyday human activities offers opportunities and poses challenges for empirical research in economics and in the social sciences more broadly. Data used in economics increasingly come from novel digital sources (e.g., social media, web applications, or sensors), in diverse formats (e.g., JSON, unstructured text), and in large quantities. To engage with these developments effectively and efficiently, economists need a basic understanding of data technologies and practical skills in working with digital data.

This course covers basic theoretical concepts and practical skills in (automatically) gathering, preparing, visualizing, and storing digital data for research purposes. It thus covers the crucial first steps underlying empirical research projects. These steps are often neglected in traditional social science methodology but are highly relevant in the age of Big Data. The course aims to fill this gap and to exploit synergies with other methodology courses such as Statistics and Empirical Economic Research.

Hands-on exercises and case studies from current real-world research projects are meant to deepen the taught concepts and train students in the basics of programming with data. The course covers both theoretical concepts in handling digital data and practical hands-on exercises focusing on different data structures and data formats (CSV, HTML, JSON). All exercises are based on freely available open-source tools (R, RStudio, Atom). Students are expected to install these tools and work with them on their own machines.
In the first part of the course, students learn about the relevance and challenges of Big Data for research in economics and related fields. This part introduces basic data formats and shows how their use in everyday life has evolved in recent years (with a particular focus on the spread of the Internet and online data). Building on this, the second part of the course introduces concepts and practices to gather and prepare digital data from various sources. In this part, students acquire basic programming skills with R in order to apply these practices to real-world datasets. The last part of the course focuses on analysis and visualization as well as storage and documentation of (relatively) large datasets, and discusses the implications of the course contents for econometric research and applied data science.

The structure of the course offers the opportunity to invite guest speakers (in the second and third parts of the course) who can give insights into social science research with Big Data and/or applied Data Science in industry.

Course Goals

The main goal of the course is to enable students to handle digital data for analysis and research purposes in economics (with a particular focus on unusual and large datasets from various sources). Students become familiar with best practices to gather, clean, and store digital data for research purposes. They become capable of planning and managing the first steps of an empirical research project based on digital data, preceding the actual econometric analyses. Finally, students acquire basic programming skills with R in the context of real-world datasets.

Course Objectives

• Students will know the basic concepts of data technologies/data structures.
• Students will understand the basics of computer code and data storage.
• Students will know how to apply the relevant R packages and programming practices to effectively and efficiently parse, filter, clean, and store digital data from various sources.
