Theory
We produce a lot of data (see: data capitalism)
- Data mining: automatically extract useful knowledge from large datasets
- Machine learning: automatically detect patterns in data and use these to make predictions or decisions
- Typically nested: AI ⊃ ML ⊃ Deep Learning (deep learning is a subset of ML, which is a subset of AI)
- Typically, data mining is more human-in-the-loop and application-specific, whereas machine learning is more hands-off and general
- Both are similar to statistics, but put more emphasis on larger datasets, on prediction rather than description, and on more general models
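A minimal sketch of the "description vs. prediction" distinction above, using only the Python standard library. The toy data (hours studied vs. exam score) and the helper name `fit_line` are hypothetical, made up for illustration; the fit is plain ordinary least squares, which is just one simple way to "detect a pattern and use it to predict".

```python
from statistics import mean

# Hypothetical toy data (invented for illustration): hours studied vs. exam score.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [52.0, 54.0, 56.0, 58.0, 60.0]


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Description: summarize the data we already have.
average_score = mean(scores)

# Prediction: learn a pattern from the data, then apply it to an unseen input.
slope, intercept = fit_line(hours, scores)
predicted_score = slope * 6.0 + intercept

print(f"average score in the data: {average_score}")
print(f"predicted score for 6 hours: {predicted_score}")
```

The average only describes the five observed scores, whereas the fitted line gives an answer for an input (6 hours) that never appeared in the data. Whether that answer is reasonable is a separate question.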
Healthy skepticism is good though:
“The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data”
- John Tukey
Main topics:
- Exploratory Data Analysis
- Supervised learning
- No Free Lunch Theorem
- Unsupervised Learning
- Optimization
- Regularization
- Regression
- Binary classification
- Maximum likelihood estimation (MLE)
- Latent-factor models
- Recommender Systems
- Neural Networks
- CNNs
- Autoencoders
- Deep learning Semantics
- Transformers
- Generative Models
- Philosophy
Related background: