My personal repository for all of my work for the Battling the Curse of Dimensionality course at UU in the fall of 2021.

Kevin-Patyk/Battling-the-Curse-of-Dimensionality

Battling the Curse of Dimensionality

This repository is where I store all of my R scripts and HTML files from practicals and assignments for this course at Utrecht University.

Contents

Each folder contains an .Rmd file and the HTML file rendered from it. Any images or data associated with the code are in the Data and Images folders.

Course Description

The ever-growing influx of data allows us to develop, interpret and apply an increasing set of learning techniques. However, with this increase in data comes a challenge: how to make sense of the data and identify the components that really matter in our modeling efforts. This course gives a detailed and modern overview of statistical learning with a specific focus on high-dimensional data.

In this course we emphasize the tools that are useful in solving and interpreting modern-day analysis problems. Many of these tools are essential building blocks that are often encountered in statistical learning. We also consider the state of the art in handling machine learning problems. We will not only discuss the theoretical underpinnings of supervised learning, but also focus on building the skills and experience needed to rapidly apply these techniques to new problems.

During this course, participants will actively learn how to apply the main statistical methods in data analysis and how to use machine learning algorithms and visualization techniques, especially on high-dimensional data problems. The course has a strongly practical focus: rather than concentrating on the mathematics and background of the discussed techniques, participants will gain hands-on experience in using them on real data and in interpreting the results.

Course Objectives

At the end of this course, students are able to apply and interpret the theories, principles, methods and techniques related to contemporary data science and understand and explain different approaches to data analysis:

  • Apply data visualization and dimension reduction techniques to high-dimensional data sets.
  • Implement, understand, and explain methods and techniques that are associated with advanced data modeling, including regularized regression, principal components, correspondence analysis, neural networks, clustering, time series, text mining and deep learning.
  • Evaluate the performance of these techniques with appropriate performance measures.
  • Select appropriate techniques to solve specific data science problems.
  • Motivate and explain the choice of techniques to investigate data problems.
  • Interpret and evaluate the results of (high-dimensional) data analyses and explain these techniques in simple terminology to a broad audience.
  • Understand and explain the principles of high-dimensional data analysis and visualization.
  • Construct appropriate visualizations for each data analysis technique in R.