CSC 591: Foundations of Data Science
Graduate Course for Data Science Track, NCSU, Computer Science Department, 2015
Course Description: Students will learn core data science principles related to statistical data analysis. This course introduces ideas in statistical learning and helps students prepare for advanced courses in data mining and machine learning. The course also focuses on applying these principles to a variety of data analysis tasks using R. Topics: random variables and probability distributions; exploratory data analysis; variable selection; sampling methods; histograms and probability distributions; density estimation; missing data and imputation; mixture models, latent variables, and expectation maximization; regression analysis; discriminant analysis; bagging and boosting; principal component analysis; information theory (entropy, mutual information); Bayesian information criterion; conditional independence; rescaling and low-dimensional summaries; factor analysis; graphical causal models and causal inference; and evaluating predictive models.