Lesson 8
Slide 1: Unsupervised Learning: K-Means & Gaussian Mixture Models
Slide 2: Unsupervised Learning
• Supervised learning uses labeled data pairs (x, y) to learn a function f : X → Y
 – But what if we don't have labels?
• No labels = unsupervised learning
• Only some points are labeled = semi-supervised learning
 – Labels may be expensive to obtain, so we only get a few
• Unsupervised learning can still extract structure from unlabeled data, e.g., for knowledge discovery
Slide 3: K-Means Clustering
Slide 4: Clustering Data
Slide 5: K-Means Clustering
K-Means(k, X)
• Randomly choose k cluster center locations (centroids)
• Loop until convergence:
 – Assign each point to the cluster of the closest centroid
 – Re-estimate the cluster centroids based on the data assigned to each cluster
Slides 6-7: K-Means Clustering (the same algorithm, repeated alongside figures illustrating the assignment and re-estimation steps)
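A minimal NumPy sketch of the K-Means loop above (the function and variable names are my own, not from the slides):

```python
import numpy as np

def kmeans(X, k, max_iters=100, seed=0):
    """Cluster the rows of X into k groups with Lloyd's algorithm."""
    rng = np.random.default_rng(seed)
    # Randomly choose k data points as the initial centroids
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(max_iters):
        # Assignment step: each point goes to the closest centroid
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: re-estimate each centroid as the mean of its points
        new_centroids = np.array([
            X[labels == i].mean(axis=0) if np.any(labels == i) else centroids[i]
            for i in range(k)
        ])
        if np.allclose(new_centroids, centroids):  # converged
            break
        centroids = new_centroids
    return centroids, labels
```

Usage: `centroids, labels = kmeans(X, k=3)` on an (n, d) data matrix X.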
Slide 8: K-Means Animation
Example generated by Andrew Moore using Dan Pelleg's super-duper fast K-means system:
Dan Pelleg and Andrew Moore. Accelerating Exact k-means Algorithms with Geometric Reasoning. Proc. Conference on Knowledge Discovery in Databases, 1999.
Slide 9: K-Means Objective Function
• K-means finds a local optimum of the following objective function:
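For reference, the standard K-means objective (the sum of squared distances from each point to its cluster centroid) is

$$\min_{C_1,\dots,C_k} \; \sum_{i=1}^{k} \sum_{x \in C_i} \lVert x - \mu_i \rVert^2,$$

where C_i is the set of points assigned to centroid µ_i. Because each of the two steps above can only decrease this sum, the algorithm converges, but only to a local optimum.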
Slide 10: Problems with K-Means
• Very sensitive to the choice of initial centroids; possible remedies:
 – Do many runs of K-Means, each with different initial centroids
 – Seed the centroids using a better method than randomly choosing them
  • e.g., farthest-first sampling (see the sketch after this list)
• Must manually choose k
 – Learn the optimal k for the clustering
  • Note that this requires a performance measure
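A sketch of farthest-first seeding under its usual definition (pick one centroid at random, then repeatedly take the point farthest from all centroids chosen so far); the function name is illustrative:

```python
import numpy as np

def farthest_first_init(X, k, seed=0):
    """Seed k centroids: start with a random point, then greedily add the
    point whose distance to its nearest chosen centroid is largest."""
    rng = np.random.default_rng(seed)
    centroids = [X[rng.integers(len(X))]]
    for _ in range(k - 1):
        # Distance from every point to its nearest chosen centroid
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centroids], axis=0)
        centroids.append(X[d.argmax()])
    return np.array(centroids)
```

The resulting array can be passed to a K-Means run in place of the random initialization.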
Slide 11: Problems with K-Means
• How do you tell it which clustering you want? (e.g., several quite different clusterings can all be valid for k = 2)
• Constrained clustering techniques (semi-supervised):
 – Same-cluster constraint (must-link)
 – Different-cluster constraint (cannot-link)
Slide 12: Gaussian Mixture Models
• Recall the Gaussian distribution:
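For reference, the multivariate Gaussian density being recalled is

$$\mathcal{N}(x;\, \mu, \Sigma) = \frac{1}{(2\pi)^{d/2}\,\lvert\Sigma\rvert^{1/2}} \exp\!\left(-\frac{1}{2}(x-\mu)^\top \Sigma^{-1} (x-\mu)\right)$$

for x in R^d, with mean µ and covariance matrix Σ.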
Slides 14-16: The GMM Assumption
• Each component generates data from a Gaussian with mean µi and covariance matrix σ²I
• Assume that each datapoint is generated according to the following recipe:
 1. Pick a component at random: choose component i with probability P(ωi)
 2. Draw the datapoint x ~ N(µi, σ²I)
[Figures: component means µ1, µ2, µ3, and a point x sampled from component 2]
Slide 17: The General GMM Assumption
• Each component generates data from a Gaussian with mean µi and covariance matrix Σi
• Assume that each datapoint is generated according to the following recipe:
 1. Pick a component at random: choose component i with probability P(ωi)
 2. Draw the datapoint x ~ N(µi, Σi)
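The two-step recipe translates directly into a sampler; a minimal NumPy sketch (names are my own):

```python
import numpy as np

def sample_gmm(n, weights, means, covs, seed=0):
    """Draw n points from a GMM following the two-step recipe:
    pick component i with probability P(wi), then draw x ~ N(mu_i, Sigma_i)."""
    rng = np.random.default_rng(seed)
    comps = rng.choice(len(weights), size=n, p=weights)   # step 1
    return np.array([
        rng.multivariate_normal(means[i], covs[i])        # step 2
        for i in comps
    ])
```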
Slide 18: Fitting a Gaussian Mixture Model (Optional)
Slide 19: Expectation-Maximization for GMMs
Iterate until convergence. On the t'th iteration, let our estimates be λt = { µ1(t), µ2(t), …, µc(t) }.
E-step: compute the "expected" classes of all datapoints for each class:
P(ωi | xk, λt) = p(xk | ωi, µi(t)) P(ωi) / Σj p(xk | ωj, µj(t)) P(ωj)
(each likelihood term is computed by just evaluating a Gaussian at xk)
Slide 20: EM for General GMMs
Iterate. On the t'th iteration, let our estimates be λt = { µ1(t), …, µc(t), Σ1(t), …, Σc(t), p1(t), …, pc(t) }, where pi(t) is shorthand for the estimate of P(ωi) on the t'th iteration.
E-step: compute the class membership distributions P(ωi | xk, λt) as before, now evaluating N(xk; µi(t), Σi(t)).
M-step: estimate µ, Σ, and p given our data's class membership distributions.
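A compact EM sketch matching the two steps above, using NumPy and SciPy; the function name, initialization, and the small regularization constant are my own choices, and the usual numerical safeguards (log-space responsibilities, convergence checks) are omitted:

```python
import numpy as np
from scipy.stats import multivariate_normal

def em_gmm(X, c, iters=50, seed=0):
    """Fit a c-component GMM to the rows of X with vanilla EM."""
    n, d = X.shape
    rng = np.random.default_rng(seed)
    mu = X[rng.choice(n, size=c, replace=False)]             # means
    sigma = np.array([np.cov(X.T) + 1e-6 * np.eye(d)] * c)   # covariances
    p = np.full(c, 1.0 / c)                                  # mixing weights P(wi)
    for _ in range(iters):
        # E-step: responsibilities P(wi | xk, lambda_t)
        r = np.array([p[i] * multivariate_normal.pdf(X, mu[i], sigma[i])
                      for i in range(c)]).T                  # shape (n, c)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: re-estimate mu, Sigma, p from the responsibilities
        nk = r.sum(axis=0)
        mu = (r.T @ X) / nk[:, None]
        for i in range(c):
            diff = X - mu[i]
            sigma[i] = (r[:, i, None] * diff).T @ diff / nk[i] + 1e-6 * np.eye(d)
        p = nk / n
    return p, mu, sigma
```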
Slide 21: (End optional section)
Slides 23-29: [Figures: the fitted mixture after the 1st, 2nd, 3rd, 4th, 5th, 6th, and 20th EM iterations]
Slide 30: Some Bio-Assay Data
Slide 31: Clustering of the Assay Data
Slide 32: Resulting Density Estimator
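The fitted model doubles as a density estimator, p(x) = Σi pi N(x; µi, Σi). A small helper given the output of the em_gmm sketch above (the function name is hypothetical):

```python
from scipy.stats import multivariate_normal

def gmm_density(x, p, mu, sigma):
    """Evaluate the mixture density p(x) = sum_i p_i * N(x; mu_i, Sigma_i)."""
    return sum(p[i] * multivariate_normal.pdf(x, mu[i], sigma[i])
               for i in range(len(p)))
```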