Data mining

Last updated Sep 9, 2022

Data mining is a way to generate new information by combining facts found in multiple transactions, and it can also be a way to predict future events.
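As a rough illustration of "combining facts found in multiple transactions", the sketch below counts how often pairs of items appear together across a handful of made-up shopping baskets. The function name `pair_counts` and the sample data are hypothetical, chosen only for this example; real data mining tasks use far larger data and more refined measures.

```python
from collections import Counter
from itertools import combinations


def pair_counts(transactions):
    """Count in how many transactions each unordered pair of items co-occurs."""
    counts = Counter()
    for items in transactions:
        # Deduplicate and sort so each pair has one canonical form.
        for pair in combinations(sorted(set(items)), 2):
            counts[pair] += 1
    return counts


# Hypothetical transactions, invented for illustration only.
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "jam"],
]

counts = pair_counts(transactions)
print(counts.most_common(3))
```

No single transaction says that bread and milk tend to be bought together; that fact only appears once the transactions are combined.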

Typical steps of data mining

  1. Learn about the application
  2. Identify data mining task
  3. Collect data
  4. Clean and preprocess the data
  5. Transform data or select useful subsets
  6. Choose data mining algorithm
  7. Run the data mining algorithm
  8. Evaluate, visualize, and interpret results
  9. Use results for profit or other goals

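Coupon collector problem: if each of $n$ possible values is equally likely, the expected number of samples needed to see all of them is $n H_n = \Theta(n \log n)$, where $H_n = 1 + \frac{1}{2} + \dots + \frac{1}{n}$ is the $n$-th harmonic number.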
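The $n \log n$ growth can be checked empirically. The following is a minimal simulation sketch, not part of the original notes: the function names, the trial count, and the fixed seed are all assumptions made for illustration. It compares the average number of draws observed with the exact expectation $n H_n$.

```python
import random


def draws_to_collect_all(n, rng):
    """Draw uniformly from n values until every value has appeared once."""
    seen = set()
    draws = 0
    while len(seen) < n:
        seen.add(rng.randrange(n))
        draws += 1
    return draws


def expected_draws(n):
    """Exact expected number of draws: n * H_n (n times the n-th harmonic number)."""
    return n * sum(1.0 / k for k in range(1, n + 1))


def average_draws(n, trials, seed=0):
    """Average the draw count over several simulated runs (seed fixed for repeatability)."""
    rng = random.Random(seed)
    return sum(draws_to_collect_all(n, rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    for n in (10, 50, 100):
        print(n, round(average_draws(n, 2000), 1), round(expected_draws(n), 1))
```

With enough trials the simulated average should sit close to $n H_n$, which is roughly $n \ln n + 0.58\,n$ for large $n$.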

# Predictive Policing

Predictive policing is the use of data mining to deploy police officers to areas where crimes are more likely to occur. It is based on the observation that individual criminals act in a predictable way.