
I want to implement the method proposed in the "VSUMM" article.

One of its steps is the k-means algorithm. The article mentions it but doesn't describe how it works. According to the article, you start with a video. The video is split into frames, then k is computed from some formula, and the k-means algorithm is run.

I have a data set of video frames, which are images. How can I apply k-means to them?

What I've done so far:

  1. Put all the images into k groups. Each group is a row of a cell array whose first element is the mean value. These values are actually the names of the images.
  2. Calculate the Euclidean distance between each image and each mean value, and put each image into the group whose mean it is closest to.

But now I'm stuck at the 3rd step, and I don't know what to do.

I've got k groups of images in a cell array whose values are just the names of the images, but according to k-means the new mean of each group is the mean of the elements in that group. In my approach these elements are just the names of the images. So what should I do? What should this mean be? Is it correct to take the mean of the images' names?


1 Answer


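Wikipedia has a good article on the k-means clustering technique. Procedurally, you identify a set of items that can be represented as vectors. In your case, I assume the items are the image frames and the vector components are the values of all the pixels in a frame (vectors with thousands or millions of dimensions). K-means clustering finds k groups of image frames that are similar within each group and dissimilar between groups. You decide what k is: 5 or 10 or whatever.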
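As a minimal sketch of "frames as vectors", assuming Python with NumPy and frames already loaded as arrays: each frame is flattened into one row of a matrix. The helper name `frames_to_vectors` and the toy 2x2 frames are hypothetical, chosen only for illustration. This is also why the mean of image names makes no sense: the mean is taken over these numeric vectors, and the names are only labels kept alongside them.

```python
import numpy as np

def frames_to_vectors(frames):
    """Flatten each frame (H x W or H x W x C array) into one row vector."""
    # All frames must have the same shape so the rows have equal length.
    return np.stack([np.asarray(f, dtype=np.float64).ravel() for f in frames])

# Hypothetical toy data: three 2x2 grayscale "frames".
frames = [np.zeros((2, 2)), np.ones((2, 2)), np.full((2, 2), 5.0)]
X = frames_to_vectors(frames)  # one row per frame, one column per pixel
```

Any other fixed-length numeric description of a frame could be used for the rows in the same way.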

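Step one: Randomly define k points in the million-dimensional vector space. It sounds like you did that.

Step two: For each image frame, use the Euclidean distance to find which of the k points is closest to that frame. It sounds like you didn't do that, but I can't tell.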
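Step two might look roughly like this in Python with NumPy. The function name `assign_to_nearest` and the toy data are assumptions for illustration, with `X` holding one frame vector per row and `centers` holding the k current points.

```python
import numpy as np

def assign_to_nearest(X, centers):
    """Return, for each row of X, the index of the closest center."""
    # distances[i, j] = Euclidean distance from frame i to center j
    distances = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return distances.argmin(axis=1)

# Hypothetical toy data: three 2-dimensional "frames" and two centers.
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]])
centers = np.array([[0.0, 0.0], [5.0, 5.0]])
labels = assign_to_nearest(X, centers)
```

The broadcasted subtraction builds an array of size frames x k x dimensions, so for very large frames it may be preferable to loop over the centers one at a time.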

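Step three: For each of the k groups, compute the mean of all the vectors in the group. That is, take the vector sum of the items in the group and divide by the number of members in the group. Do this separately for each group.

Step four: Replace each of the k points with the mean you just computed.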
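Steps three and four could be sketched as follows, again assuming NumPy. The name `update_centers` is hypothetical, and leaving an empty group's point unchanged is one possible choice that the answer does not specify.

```python
import numpy as np

def update_centers(X, labels, centers):
    """Replace each center with the mean of the vectors assigned to it."""
    new_centers = centers.astype(np.float64)  # astype returns a copy
    for j in range(len(centers)):
        members = X[labels == j]
        if len(members) > 0:  # assumption: an empty group keeps its old center
            new_centers[j] = members.mean(axis=0)  # vector sum / member count
    return new_centers

# Hypothetical toy data: frames 0 and 1 belong to group 0, frame 2 to group 1.
X = np.array([[0.0, 0.0], [2.0, 0.0], [5.0, 5.0]])
labels = np.array([0, 0, 1])
centers = update_centers(X, labels, np.zeros((2, 2)))
```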

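Then repeat the process (steps two through four) hundreds or thousands of times. It will gradually converge on a solution in which the variance within each cluster is minimized and the variance between clusters is maximized. That is, each group will represent images that are visually similar to one another within the group and different from one group to another.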
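Putting the loop together, a hedged sketch in Python with NumPy might look like this. Several details are assumptions rather than anything stated above: the initial points are k randomly chosen frames, the loop stops early once the assignments stop changing, and the file names are made up. The names stay in a separate list and are grouped by label only at the end.

```python
import numpy as np

def kmeans(X, k, max_iter=100, seed=0):
    """Plain k-means on the rows of X; returns (labels, centers)."""
    rng = np.random.default_rng(seed)
    # Assumption: start from k distinct frames picked at random.
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(np.float64)
    labels = None
    for _ in range(max_iter):
        # Step two: assign each frame to its nearest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break  # assignments stopped changing
        labels = new_labels
        # Steps three and four: move each center to the mean of its group.
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)
    return labels, centers

# Hypothetical file names and toy 2-dimensional frame vectors.
names = ["f1.png", "f2.png", "f3.png", "f4.png"]
X = np.array([[0.0, 0.0], [0.2, 0.0], [9.0, 9.0], [9.2, 9.0]])
labels, centers = kmeans(X, k=2)

# Group the names by cluster label; the names never enter the arithmetic.
groups = {}
for name, label in zip(names, labels):
    groups.setdefault(int(label), []).append(name)
```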

answered 2013-07-16T02:07:56.697