Theory

We produce a lot of data (see: data capitalism)

  • Data mining: automatically extract useful knowledge from large datasets
  • Machine learning: automatically detect patterns in data and use these to make predictions or decisions
    • Typically nested: AI ⊃ ML ⊃ deep learning (each a subset of the one before)
  • Typically, data mining is more human-in-the-loop and application-specific, whereas machine learning is more hands-off and general
  • Both are similar to statistics, but with more emphasis on larger datasets, on prediction rather than description, and on more general models
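
To make "detect patterns in data and use these to make predictions" concrete, here is a minimal sketch in plain Python. It assumes the simplest possible pattern (a straight line fit by least squares), and the data, variable names, and helper functions are hypothetical, chosen only for illustration. It is one small instance of the idea, not a definition of machine learning.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept (the 'pattern')."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x, and how x and y vary together
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Use the fitted pattern to predict y for an unseen x."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical toy data: hours studied vs. exam score
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [52.0, 61.0, 68.0, 79.0, 90.0]

model = fit_line(hours, scores)   # learn from the data
print(predict(model, 6.0))        # predict for an input not in the data
```

A statistics course might stop at describing the fitted slope. The emphasis here is on the last line, where the fitted model is used on new input.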

Healthy skepticism is good though:

“The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data”

  • John Tukey

Main topics:

Related background: