High-Performance Computing and Data Science

3 credits

In this course, students learn the fundamentals of high-performance computing (HPC) systems, job scheduling, and the parallelization of data analysis and machine learning tasks. Students also learn how to manage large datasets effectively, define reproducible software environments, and execute parallelized workflows.


Tools for data science, machine learning, data and computational workflows, software stacks, interpreted and compiled languages, software debugging, performance analysis, data file formats, parallel computation, and computational efficiency.


Unix shell scripting, Python, and C++