Course Descriptions

Students in the program complete 33.5 credits, which include 30 credits of coursework, a 2-credit capstone project and a 1.5-credit immersion experience that will take place at SMU. Students can earn the Master of Science in Data Science in 20-28 months.

Both full-time and part-time options are available. Read more about sample course schedules for full-time and part-time students.

Core Coursework

Statistical Foundations for Data Science

3 CREDITS

Students in this course receive an overview of statistical methods from an experimental design perspective. Students review statistical sampling, t-tests, analysis of variance, linear regression and other methods. The course focuses on interpreting, analyzing and communicating results, and on the ethics of statistical analysis, rather than on calculations.

LEARN

Experimental Design, Statistical Sampling, T-tests, Analysis of Variance, Linear Regression, Diagnostics and Checks for Statistical Methods, Interpretation and Communication of Results (both oral and written), Ethics of Statistical Analysis

APPLY

SAS, R

Doing Data Science

3 CREDITS

A project-based course that brings together methods, concepts and current practices in the growing field of data science, including statistical inference, financial modeling, data visualization, social networks and data engineering. Emphasis is on the ethical dilemmas involved in gathering, storing, analyzing and disseminating information from large databases.

LEARN

Statistical Thinking in the Age of Big Data, Exploratory Data Analysis (EDA), Kernel Density Estimation, Advanced Regression, Social Networks and Data Journalism, Financial Modeling, Reproducible Research and Sharing Your Work, Ethics and Privacy

APPLY

RStudio and Shiny, Python, GitHub

Applied Statistics: Inference and Modeling

3 CREDITS

This course builds on Statistical Foundations for Data Science with attention to the analysis of multivariate data. Basic machine learning methods, such as linear discriminant analysis, logistic regression and principal components analysis, are discussed. Emphasis is on interpretation of the analysis rather than calculations.

LEARN

Multiple Linear Regression and Variable Selection, Multivariate Analysis of Variance (MANOVA), Linear and Quadratic Discriminant Analysis, Unsupervised Learning (Clustering), Methods for Categorical Variables (Explanatory and Response), Autoregressive Models for Time Series Data, Basic Bootstrap

APPLY

SAS, R

File Organization & Database Management

3 CREDITS

This course surveys current database approaches and systems, as well as the principles of design and the use of these systems. Students learn database query language design and implementation constraints as well as applications of large databases, including a survey of file structures and access techniques, such as NoSQL databases. Students will use a relational database management system to implement a database design project.

LEARN

Database Queries, Relational Database Design, NoSQL Databases

APPLY

SQL, MySQL, MongoDB, XML, Python

Data Mining

3 CREDITS

This course introduces the processes of managing, exploring, visualizing and acting on large amounts of data. It covers data mining techniques such as classification, regression, association rules, cluster analysis and recommendation systems. All material covered is reinforced through hands-on experience using state-of-the-art tools to design and execute data mining processes. Class examples come from Python and R.

LEARN

Machine Learning, Association Mining, Cluster Analysis, Recommender Systems

APPLY

IPython, scikit-learn, pandas, GraphLab Create, R, Rpy

Statistical Sampling

3 CREDITS

This course covers principles of planning and conducting surveys, including both probability and nonprobability sample design and analysis, sample size determination and how to use auxiliary sources of data external to the sample to improve estimation. Methods for using information from both samples and “found” (big) data together are discussed.

LEARN

Probability Sampling, Complex Sample Designs, Analysis of Survey Data

APPLY

SAS

Visualization of Information

3 CREDITS

This course introduces data visualization and creative coding using the Processing programming language. Students explore visual and information design principles through code examples. Class activities incorporate 2-D and 3-D computer graphics, interactivity and data input. Procedural and object-oriented programming approaches to data visualization will be covered, as well as an overview of leading-edge data visualization libraries and APIs, including web-based approaches.

LEARN

Data Visualization, Creative Coding, Visual and Information Design, Programming

APPLY

Processing

Quantifying the World

3 CREDITS

This course introduces the sequence of steps needed to carry out Internet-scale data analytics, from hypothesis formation and data collection to methods of analysis and visualization. Students will become proficient in data collection and storage strategies that can be used in later analysis. Script-based programming techniques are used to automate collection from a variety of third-party resources, such as application programming interfaces (APIs). Methodologies for constructing representative samples, storing raw data, merging disparate data sets, cleaning inconsistent entries and constructing derivative data sets are reviewed. Students are introduced to two classes of basic analysis of gathered data – descriptive statistics and data visualization – which are used to validate and improve the accuracy of gathered data sets, a prerequisite to more advanced analysis.

LEARN

Data Wrangling, Accessing APIs, Data Collection Design and Implementation, Synthesize Concepts in a Capstone Project

APPLY

Python, pandas, R, SQL, NoSQL
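
As a rough illustration of the steps this description names (merging disparate data sets, cleaning inconsistent entries and validating with descriptive statistics), here is a minimal Python sketch. It is not taken from the course materials: the two sources, the field names `city` and `temp_f`, and the `clean` helper are all hypothetical, and only the standard library is used.

```python
from statistics import mean, median

# Hypothetical records from two made-up sources with inconsistent formatting.
source_a = [{"city": " Dallas ", "temp_f": "98"}, {"city": "austin", "temp_f": "101"}]
source_b = [{"city": "DALLAS", "temp_f": "96"}, {"city": "Austin", "temp_f": None}]


def clean(record):
    """Normalize the city name and convert the reading to float.

    Returns None when the reading is missing so the caller can drop it.
    """
    if record["temp_f"] is None:
        return None
    return {"city": record["city"].strip().title(), "temp_f": float(record["temp_f"])}


# Merge the two sources, then keep only the records that survived cleaning.
merged = [r for r in map(clean, source_a + source_b) if r is not None]

# Group readings by city to build a small derivative data set.
by_city = {}
for r in merged:
    by_city.setdefault(r["city"], []).append(r["temp_f"])

# Descriptive statistics used as a basic sanity check on the gathered data.
summary = {
    city: {"n": len(values), "mean": mean(values), "median": median(values)}
    for city, values in by_city.items()
}
print(summary)
```

In practice a library such as pandas would likely handle the merge and summary steps; plain dictionaries are used here only to keep the sketch self-contained.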

Electives

Cloud Computing

3 CREDITS

This course introduces students to the growing assortment of cloud computing technologies, with an emphasis on fundamental cloud topics such as virtualization, IaaS, PaaS, and DevOps. The course is intended to be hands-on, with students working with current technologies that make the cloud possible. They learn about leading cloud service providers, the “as a service” deployment model, and selected big data tools. Students will also get a high-level overview of NoSQL and of big data topics such as Hadoop, MapReduce, Pig, Hive, and Spark.

APPLY

Python, pandas, R, SQL, NoSQL

Machine Learning

3 CREDITS

This class introduces machine learning and the data preparation workflow. The machine learning tasks covered include multivariate non-linear non-parametric regression, supervised classification, unsupervised classification and deep learning. For each of these tasks, the course shows how to assess the quality of machine learning models and how to perform error estimation and feature engineering. All material covered is reinforced through hands-on experiences using state-of-the-art tools to design and execute data mining processes. Class examples come from Matlab, Python and R.

LEARN

Multivariate Non-Linear Non-Parametric Regression, Supervised Classification, Unsupervised Classification, Deep Learning

APPLY

Matlab, Python, R

Visualization of Information and Creative Coding II

3 CREDITS

This course extends the study of creative coding and visualization across numerous languages, specialized libraries/APIs and distribution platforms. Students build data-driven visualizations in Java, C++ and JavaScript, including the p5.js, three.js and d3.js libraries.

Data and Network Security

3 CREDITS

This course focuses on the fundamental concepts, mechanisms and protocols for data and network security. Symmetric key cryptography is the foundation of many security and authentication protocols; their use, security and vulnerabilities are discussed in detail. Students learn public key cryptography; algorithms such as AES, DES and hash algorithms; and the protocols and applications built on those fundamental building blocks, such as message authentication, digital signatures and digital certificates. Students also learn about network security principles, access control and user authentication, privacy and the ethics of security.

LEARN

Ciphers, Hash Algorithms, Secure Communication Protocols

APPLY

AES, DES, SHA, Python
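
To give a sense of the message authentication topic mentioned in this description, here is a minimal Python sketch that builds a keyed hash (HMAC) on top of SHA-256 using only the standard library. It is an illustration rather than course material: the key, the messages and the helper names `sign` and `verify` are hypothetical.

```python
import hashlib
import hmac


def sign(key: bytes, message: bytes) -> str:
    """Return a hex-encoded HMAC-SHA256 tag for the message."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(key: bytes, message: bytes, tag: str) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(sign(key, message), tag)


# Hypothetical shared secret and message, for illustration only.
key = b"shared-secret-key"
message = b"transfer 100 credits"

tag = sign(key, message)
print(verify(key, message, tag))                  # True: message is unchanged
print(verify(key, b"transfer 900 credits", tag))  # False: message was altered
```

The constant-time comparison from `hmac.compare_digest` is used so that the check does not leak, through timing, how many leading characters of a forged tag were correct.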

On-Campus Immersion

Immersion

1.5 CREDITS

The immersion is designed to offer additional learning, networking and relationship-building opportunities. Taking place on the SMU campus in Texas, the immersion is a 3- to 4-day experience. Students will attend a conference and have the chance to meet in person with classmates, faculty, industry leaders and employers for collaborative, hands-on workshops, panels, lectures and informational sessions. Students are required to attend two immersions during their time in the program. Hotel accommodations are included.

Capstone Project

Capstone 1A

1 CREDIT

The capstone is a collaborative group project that spans two consecutive full terms. During the first term, students begin work on the project and are expected to complete at least half of it by the end of the term. Students develop and work on their projects under faculty supervision.

Capstone 1B

1 CREDIT

Students will spend the second full term working on their collaborative group project from Capstone 1A. Students will then be required to present their completed projects during the on-campus immersion, typically held near week 11 of the term.
