Professor: Dr. Robert J. Brunner
Teaching Assistant: Edward Kim (Ph.D. candidate, Physics)
This class is a synchronous, face-to-face course. This course will build a practical foundation for data science by teaching students basic tools and techniques that can scale to large computational systems and massive data sets. This course integrates with ACCY 570, Data Analytics Foundations for Accountancy. Be sure to follow the syllabi for both courses to stay abreast of the relevant material.
Students will learn about the basic tasks in statistical and machine learning, including the importance of data preparation. Next, linear regression is introduced along with concepts like regularization and an extension to logistic regression. Supervised learning is then presented with examples of both classification and regression, including naive Bayes, k-NN, SVM, decision trees, and ensemble techniques. Unsupervised techniques are presented with applications in both clustering and dimensionality reduction. Specific application areas are explored for these learning techniques, including text analysis, network analysis, and social media analysis. The last part of the course focuses on cloud computing technologies, including Hadoop, MapReduce, NoSQL data stores, Spark, and streaming data analysis. The course concludes with a brief introduction to deep learning.
Students must have access to a reasonably modern computer with a modern web browser in order to work with the course websites and the course JupyterHub server.
This class is restricted to MAS students pursuing the data analytics concentration in the Accountancy program.
Please refer to the course syllabus for more information about course content and grading policies.