class Kmeans(
  points: IndexedSeq[Point],
  distance: DistanceFunction,
  minChangeInDispersion: Double = 0.0001,
  maxIterations: Int = 100,
  fixedSeedForRandom: Boolean = false
)
The distance function is defined as a trait, together with a companion object and three implementations (https://github.com/scalanlp/nak/blob/ae8fc0c534ea0613300e8c53487afe099327977a/src/main/scala/nak/cluster/Points.scala):
trait DistanceFunction extends ((Point, Point) => Double)

/**
 * A companion object to the DistanceFunction trait that helps select the
 * DistanceFunction corresponding to each string description.
 */
object DistanceFunction {
  def apply(description: String) = description match {
    case "c" | "cosine" => CosineDistance
    case "m" | "manhattan" => ManhattanDistance
    case "e" | "euclidean" => EuclideanDistance
    case _ => throw new MatchError("Invalid distance function: " + description)
  }
}

/**
 * Compute the cosine distance between two points. Note that it is a distance
 * because we subtract the cosine similarity from one.
 */
object CosineDistance extends DistanceFunction {
  def apply(x: Point, y: Point) = 1 - x.dotProduct(y) / (x.norm * y.norm)
}

/**
 * Compute the Manhattan (city-block) distance between two points.
 */
object ManhattanDistance extends DistanceFunction {
  def apply(x: Point, y: Point) = (x - y).abs.sum
}

/**
 * Compute the Euclidean distance between two points.
 */
object EuclideanDistance extends DistanceFunction {
  def apply(x: Point, y: Point) = (x - y).norm
}
This is my attempt at calling the constructor so far:
val p1 = new Point(IndexedSeq(0.0, 0.0, 3.0))
val p2 = new Point(IndexedSeq(0.0, 0.0, 3.0))
val p3 = new Point(IndexedSeq(0.0, 0.0, 3.0))
val clusters1 = IndexedSeq(p1, p2, p3)
val k = new Kmeans(clusters1, ??????
How do I create a DistanceFunction implementation to pass to the Kmeans constructor? Can I just use the existing DistanceFunction object?
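
For illustration, here is a minimal self-contained sketch of three ways the second argument could plausibly be filled in, judging only from the signatures quoted above: pass one of the existing objects such as EuclideanDistance, look one up by name through the companion's apply, or write a new object that extends the trait. The Point and Kmeans classes below are simplified stand-ins (assumptions, not the real nak classes), and ChebyshevDistance is a hypothetical example that is not part of the library. Note that the companion object DistanceFunction does not itself extend the trait, so it is the value returned by DistanceFunction("euclidean") that would be passed, not the companion itself.

```scala
// Simplified stand-in for nak's Point (assumption: only the operations used here).
final case class Point(coord: IndexedSeq[Double]) {
  def -(that: Point): Point =
    Point(coord.zip(that.coord).map { case (a, b) => a - b })
  def dotProduct(that: Point): Double =
    coord.zip(that.coord).map { case (a, b) => a * b }.sum
  def norm: Double = math.sqrt(dotProduct(this))
}

// Same shape as the trait quoted in the question.
trait DistanceFunction extends ((Point, Point) => Double)

// Reduced version of the companion: it maps a description to an implementation.
object DistanceFunction {
  def apply(description: String): DistanceFunction = description match {
    case "e" | "euclidean" => EuclideanDistance
    case other =>
      throw new IllegalArgumentException("Invalid distance function: " + other)
  }
}

object EuclideanDistance extends DistanceFunction {
  def apply(x: Point, y: Point): Double = (x - y).norm
}

// Hypothetical custom implementation (not in nak): largest per-coordinate gap.
object ChebyshevDistance extends DistanceFunction {
  def apply(x: Point, y: Point): Double = (x - y).coord.map(_.abs).max
}

// Stand-in for Kmeans with the constructor parameters from the question;
// the members are exposed as vals only so the sketch can be inspected.
class Kmeans(
  val points: IndexedSeq[Point],
  val distance: DistanceFunction,
  val minChangeInDispersion: Double = 0.0001,
  val maxIterations: Int = 100,
  val fixedSeedForRandom: Boolean = false
)

object KmeansSketch {
  val p1 = Point(IndexedSeq(0.0, 0.0, 3.0))
  val p2 = Point(IndexedSeq(0.0, 4.0, 0.0))
  val points = IndexedSeq(p1, p2)

  // 1. Pass an existing implementation object directly.
  val byObject = new Kmeans(points, EuclideanDistance)
  // 2. Look the implementation up by name via the companion's apply.
  val byName = new Kmeans(points, DistanceFunction("euclidean"))
  // 3. Pass a custom implementation of the trait.
  val custom = new Kmeans(points, ChebyshevDistance)
}
```

With the real library, the first two forms should only need the nak imports in place of the stand-ins, assuming the definitions at the linked commit match what is quoted in the question.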