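Creating document vectors and clusters with Apache Mahout is fairly easy. Running clusterdump lets you see the terms associated with each cluster. But how do I identify which documents belong to each cluster?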
Thanks
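I think that, for each document, you can compute the Euclidean distance between its vector and each cluster center, and then assign the document to the nearest cluster.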
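Here is a minimal sketch of that nearest-center assignment in plain Python. It does not use Mahout's API: it assumes you have already exported the document vectors and cluster centers into ordinary lists of numbers, and the names `doc_vectors`, `centroids`, and `assign_to_nearest_cluster`, as well as the toy data, are made up for illustration.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length dense vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign_to_nearest_cluster(doc_vectors, centroids):
    """Map each document id to the id of its closest cluster center."""
    assignments = {}
    for doc_id, vec in doc_vectors.items():
        # Pick the cluster id whose center has the smallest distance to vec.
        assignments[doc_id] = min(
            centroids,
            key=lambda cid: euclidean_distance(vec, centroids[cid]),
        )
    return assignments


# Hypothetical toy data: three documents in a 3-term vector space.
doc_vectors = {
    "doc1": [1.0, 0.0, 0.0],
    "doc2": [0.9, 0.1, 0.0],
    "doc3": [0.0, 0.2, 1.0],
}
# Hypothetical cluster centers, e.g. copied out of a cluster dump.
centroids = {
    "cluster_a": [1.0, 0.0, 0.0],
    "cluster_b": [0.0, 0.0, 1.0],
}

assignments = assign_to_nearest_cluster(doc_vectors, centroids)
for doc_id, cluster_id in sorted(assignments.items()):
    print(doc_id, "->", cluster_id)
```

If the clusters were built with a different distance measure (cosine, for example), you would probably want to swap `euclidean_distance` for that same measure so the assignments match how the centers were computed.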