Oct 2020 / Data analysis with R and Python

Part of the Scientific Computing in Practice lecture series at Aalto University.

Audience: Researchers who use or will soon use R or Python for data analysis, who already know how to program in these languages, but who are not necessarily familiar with best practices for data analysis. The course material is available in both R and Python, but this is not a course on the basics of scientific programming. If you wish to brush up on your scientific programming skills, we recommend taking our Sept 2020 / Python for Scientific Computing course.

About the course: We provide a practical introduction to, and advice on, data analysis in R and Python. We will learn how to organize data for efficient analysis, how to analyze the data, how to split data and models according to the intended analysis task, and how to visualize the results. The course is suited for people who are starting out with data analysis and would like to begin with a good workflow. The course material can be worked through in either R or Python.

The course consists of four three-hour sessions held online via Zoom. In these sessions we will learn the core concepts of data analysis. We will do some exercises during the sessions, but most of the exercises will be available through a GitHub repository (released a bit later) and are meant to be done between the sessions. We will go through the solutions at the start of each session.

Lecturer: Simo Tuomisto, M.Sc., Aalto Scientific Computing / Department of Computer Science

Time, date:

  • Mon 5.10, 12:00-15:00
  • Wed 7.10, 12:00-15:00
  • Mon 12.10, 12:00-15:00
  • Wed 14.10, 12:00-15:00

Place: Online. The Zoom link will be sent to registered participants.

Cost: Free of charge for FGCI consortium members, including Aalto employees and students.

Registration: registration is open

Credits: Credits are available for Aalto students, and a course certificate can be provided on request to other participants. The full course hours correspond to roughly 1 ECTS. Students who wish to get a certificate should hand in the special assignment and participate in at least 3 of the 4 lectures.

Additional course info at: simo.tuomisto -at- aalto.fi

Course preparation

This will be updated closer to the course.

Course material

This is the preliminary plan for the course:

  • Day 1: Understanding data science workflows
  • Day 2: Data ingestion, tidy data and efficient data formats
  • Day 3: Running models
  • Day 4: Scaling your analysis by splitting data and computations

Homework

This will be updated closer to the course.