During the honeymoon phase, with those first MOOCs where Andrew Ng repeats “concretely” ten times in a six-minute video, machine learning seems pretty easy and intuitive. There are plenty of Medium articles and tutorials that we can read quickly, even in the Parisian metro, and still understand what is being explained.
But sooner or later, during an interview or with coworkers, we come to realize that there is far more to data science than just reading blog articles or following well-designed MOOCs. Proper code versioning, clean code habits, advanced machine learning libraries and algorithms,
- On linear algebra
- On constrained and unconstrained optimization
- On Ensemble Learning
- On Decision Trees
- On Logistic Regression, briefly
- On K-Means and the EM algorithm
- A probability cheatsheet
- On Naive Bayes
- LDA and QDA
- On Time Series
- Even though the newer state-of-the-art models (BERT, XLNet) are not included, I still recommend these NLP notes from CS224n
- Wikistat.fr: French only, but really worth it! If I had to choose only one site, I would certainly choose this one.
These are some books that I spent time reading. They require much more effort than the PDFs shared above.
- MLAPP, Machine Learning: A Probabilistic Perspective, by Kevin Murphy
- ESL, The Elements of Statistical Learning, by Hastie, Tibshirani, and Friedman
- Deep Learning, by Goodfellow, Bengio, and Courville
While The Elements of Statistical Learning was the first one I read, I found it too verbose in some parts. Chapters 1 to 4 were really worth my time though (these notes helped me a lot!).
Machine Learning: A Probabilistic Perspective is my favorite. It is more concise and tries (and sometimes fails) to go straight to the point in every chapter. It is easy to miss some steps in the equations, but that is part of the learning process, haha!
Finally, I read the Deep Learning book “just for fun”. As deep learning is more experimental than theoretical, with a lot of trial and error, I did not want to spend too much time trying to understand the theory. Understanding the main architectures (MLP,
Some other books or PDFs that I found interesting:
- Python Web Scraping Cookbook
- Expert Python Programming (I still have a lot to learn from that book)
- Hands-On Machine Learning with Scikit-Learn and TensorFlow
- Feature Engineering Made Easy
- Even better, Feature Engineering and Selection: A Practical Approach for Predictive Models. It has not been released yet, but an online version is available.