params.json
{"name":"NYU Center for Data Science: DS-GA 1003","tagline":"Machine Learning and Computational Statistics (Spring 2016)","body":"### Hello\r\nI'm [David Rosenberg](https://www.linkedin.com/in/david-rosenberg-5982414), and I'll be the instructor for DS-GA 1003 this year. This is a temporary page to give you some information before the course begins (January 27, 2016). This year's class will be very similar to [last year's class](https://davidrosenberg.github.io/ml2015/), though some of the more advanced topics may change. \r\n\r\n### Prerequisites\r\n* **Programming**: You'll need to be comfortable with Python and [NumPy](http://www.numpy.org/), at least. Knowing how to generate plots and tables to summarize results will be necessary as well. You should take a look at the [first homework assignment from last year](https://davidrosenberg.github.io/ml2015/homework/hw1.pdf) to see what you'll be jumping into. For a crash course in Python for data science, the first few chapters of [Data Science from Scratch](http://amzn.com/149190142X) seem like a good bet. For more in-depth coverage, including the [pandas library](http://pandas.pydata.org/), [Python for Data Analysis](http://amzn.com/1449319793) is worth a look. While pandas won't be particularly useful for the homework assignments, it's highly recommended for doing basic data analysis in practice and may be useful for your course projects. (If you're an R user, I highly recommend the [data.table](https://github.com/Rdatatable/data.table/wiki) package.)\r\n* **Math**: Your best bet would be to review the notes from the prerequisite class [DS-GA 1002](http://www.cims.nyu.edu/~cfgranda/pages/DSGA1002_fall15/notes.html). The priorities would be a review of matrix algebra (be comfortable with reading and manipulating expressions involving matrices, vectors, and norms), gradients, and basic probability theory (expectations, independence, Law of Large Numbers, conditional distributions, and conditional expectations). 
But I consider pretty much every part of the 1002 syllabus to be an important part of a data scientist's toolbox. \r\n* **Machine Learning**: You should at least be familiar with the basic notions of supervised learning, overfitting, training set, test set, and cross-validation. The course is designed to follow [DS-GA 1001](http://cds.nyu.edu/ds-ga-introduction-data-science-fall-2015/), and many of your peers will have taken this class. The book [Data Science for Business](http://www.amazon.com/Data-Science-Business-data-analytic-thinking/dp/1449361323) is highly recommended. It covers many important issues in practical data science that we don't have time for in this course. You might also consider working through [Andrew Ng's Machine Learning course](https://www.coursera.org/learn/machine-learning) on Coursera. Essentially any introduction to machine learning would be sufficient.\r\n\r\n### Textbooks\r\nWe won't have a main textbook for the class this year, and we'll focus on books that are free online. In addition to the books discussed on last year's website, I'll add the following recommendations:\r\n* David Barber's [Bayesian Reasoning and Machine Learning](http://web4.cs.ucl.ac.uk/staff/D.Barber/pmwiki/pmwiki.php?n=Brml.HomePage), available free online.\r\n* Christopher Bishop's [Pattern Recognition and Machine Learning](http://www.amazon.com/Pattern-Recognition-Learning-Information-Statistics/dp/0387310738), which is not available free online, though I have found it to be nice for several topics. \r\n\r\n### General Advice \r\n* This year's course lectures and homeworks will be similar to last year's. If you want to hit the ground running, it would not be a waste of time to start working on last year's homework assignments now.\r\n* The homework writeups must all be submitted as PDF files. There are many ways to do this, but it might be worth your time to figure out a way you're comfortable with before the class starts. 
For parts of the homework that are math-heavy, you may want to use [LyX](http://www.lyx.org/) or write directly in [LaTeX](https://www.latex-project.org/). You can also write math directly in IPython, which can be convenient as well. ","google":"UA-64247420-2","note":"Don't delete this file! It's used internally to help with page regeneration."}