1) is easy:
want <- c("storesize", "sales_per_sqft", "sales_per_visits", "tothhsinta")
kmeans_object <- Kmeans(stores_standard[, want], 20, iter.max = 1000, nstart = 1,
                        method = "euclidean")
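To try the column selection in 1) on something runnable, here is a minimal sketch. It assumes a made-up data frame (`stores_toy`, with a hypothetical extra column `other_var`) and uses `kmeans()` from base R as a stand-in so that it runs without amap installed; `Kmeans()` from amap takes the same leading arguments (`x`, `centers`, `iter.max`, `nstart`). The choice of 3 clusters is arbitrary.

```r
set.seed(42)
n <- 60

# Hypothetical stand-in for the real store data
stores_toy <- data.frame(
  storesize        = rnorm(n, mean = 5000, sd = 800),
  sales_per_sqft   = rnorm(n, mean = 300, sd = 40),
  sales_per_visits = rnorm(n, mean = 25, sd = 5),
  tothhsinta       = rnorm(n, mean = 20000, sd = 3000),
  other_var        = rnorm(n)  # extra column, left out of the clustering
)

# Standardise, keeping a data frame so columns can be picked by name
stores_std <- as.data.frame(scale(stores_toy))

want <- c("storesize", "sales_per_sqft", "sales_per_visits", "tothhsinta")

# Base R kmeans() used here as a stand-in for amap::Kmeans()
fit <- kmeans(stores_std[, want], centers = 3, iter.max = 1000, nstart = 5)

# Only the selected variables took part in the clustering
print(colnames(fit$centers))
```

The centers matrix has one column per variable in `want`, which is a quick way to confirm that `other_var` was excluded.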
For 2):
## a 2-dimensional example from ?Kmeans
x <- rbind(matrix(rnorm(100, sd = 0.3), ncol = 2),
           matrix(rnorm(100, mean = 1, sd = 0.3), ncol = 2))
colnames(x) <- c("x", "y")
cl <- Kmeans(x, 2)
Now look at cl:
R> str(cl)
List of 4
 $ cluster : int [1:100] 2 2 2 2 2 2 2 2 2 2 ...
 $ centers : num [1:2, 1:2] 1.0245 -0.017 1.0346 0.0375
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:2] "1" "2"
  .. ..$ : chr [1:2] "x" "y"
 $ withinss: num [1:2] 0.00847 0.22549
 $ size    : int [1:2] 50 50
 - attr(*, "class")= chr "kmeans"
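The cluster component of the list contains the assigned cluster IDs, in the same order as the samples in the input data. To add the cluster component as a column of the input data, do the following: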
R> x <- cbind(x, Cluster = cl$cluster)
R> head(x)
               x            y Cluster
[1,] -0.24251497  0.532012889       2
[2,]  0.10957740  0.225168920       2
[3,] -0.35563544 -0.428798979       2
[4,] -0.41251306  0.529953489       2
[5,] -0.61212001 -0.003443993       2
[6,]  0.04435213  0.086595025       2
For your data, assuming the result of the Kmeans() call from 1) is stored in kmeans_object, do the following:
stores_standard <- cbind(stores_standard, Cluster = kmeans_object$cluster)
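Once the cluster IDs sit next to the data, they can be used like any other grouping variable. The sketch below is only an illustration under a few assumptions: it reuses the two-dimensional example under the hypothetical names `pts` and `pts_df`, and it calls `kmeans()` from base R as a stand-in so that it runs without amap installed. The `cluster` and `size` components are used the same way as in the `str(cl)` output above.

```r
set.seed(1)

# Same kind of two-group toy data as in the ?Kmeans example
pts <- rbind(matrix(rnorm(100, sd = 0.3), ncol = 2),
             matrix(rnorm(100, mean = 1, sd = 0.3), ncol = 2))
colnames(pts) <- c("x", "y")

# Base R kmeans() used here as a stand-in for amap::Kmeans()
cl <- kmeans(pts, 2)

# Attach the cluster IDs; rows keep the order of the input data
pts_df <- data.frame(pts, Cluster = cl$cluster)

# Number of samples per cluster
sizes <- table(pts_df$Cluster)
print(sizes)

# Mean of each variable within each cluster
means <- aggregate(cbind(x, y) ~ Cluster, data = pts_df, FUN = mean)
print(means)
```

The counts from `table()` can be checked against `cl$size`, and the per-cluster means can be compared with `cl$centers`.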
As for 3), this is not possible with kmeans() in standard R or with Kmeans() in package amap.