
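I am using SMOTE to filter my dataset, but the filter operation often fails.

I traced the root cause in my case to the doSMOTE function, which:

1- Computes the distances between instances.

2- Sorts the instances by those distances using a comparator: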

public int compare(Object o1, Object o2) {
      double distance1 = (Double) ((Object[]) o1)[0];
      double distance2 = (Double) ((Object[]) o2)[0];
      return (int) Math.ceil(distance1 - distance2);
} 

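At this stage, however, Java's comparator contract is violated and an exception is thrown.

I suspect the cause is that my instances are too close to each other. A sample of the distances is attached at the end of the post.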
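A minimal sketch of why closely spaced distances can trip this comparator. It applies the same arithmetic as the quoted compare method to two neighbouring values from the distance sample. The class and method names (ComparatorContractDemo, ceilCompare) are hypothetical and exist only for illustration. Whether the sort actually throws depends on the input order and the Java version, so this only demonstrates the asymmetry itself.

```java
public class ComparatorContractDemo {

    // Same arithmetic as the comparator quoted above, applied to two raw distances.
    static int ceilCompare(double distance1, double distance2) {
        return (int) Math.ceil(distance1 - distance2);
    }

    public static void main(String[] args) {
        // Two neighbouring values taken from the distance sample in this post.
        double a = 0.0012141773193;
        double b = 0.0038432461241;

        // a - b is about -0.0026; Math.ceil yields -0.0, which casts to 0 ("equal").
        System.out.println(ceilCompare(a, b)); // prints 0
        // b - a is about +0.0026; Math.ceil yields 1.0, which casts to 1 ("greater").
        System.out.println(ceilCompare(b, a)); // prints 1

        // The Comparator contract requires sgn(compare(a, b)) == -sgn(compare(b, a)),
        // so any pair whose distances differ by less than 1 breaks it.
    }
}
```

Any two distances that differ by less than 1 produce this mismatch, which is consistent with the suspicion that the instances being close together is what triggers the failure.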

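My questions are:

1- Is this a plausible scenario?

2- Is there a way to work around it?

3- If SMOTE is not useful in this case, which other filters would you recommend?

Distance sample: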

0.0000000000000000000

0.0012141773193000000

0.0038432461240999900

0.0061871080511999900

0.0100299787545999000

0.0104868096109699000

0.0105987645799099000

0.0108892893852699000

0.0117478589556099000

1.0309228276616200000

1.0310198235697600000

1.0313107565587700000

2.1496389158514700000

2.1507375480523100000

3.0822389928979700000

3.0824063362008500000

3.0827550748437000000

3.1315505239392400000

4.0849290781932300000

4.0849749023536100000

5.0827069584694600000

5.0827154979640900000

5.0827562565688700000

6.0680583877232500000

6.0680629044326200000

6.0680841744788300000

6.0681194562755100000

6.0681666719043900000

7.0640507924313300000

7.0640864288327500000

99983.1268106843000000000

99983.1287314636000000000

99983.1306576871000000000

99983.1325893850000000000

99983.1345265875000000000

99983.1454175467000000000

99983.1475548918000000000

99983.1496988369000000000

99983.1518494214000000000

99983.1540066853000000000

99983.1561706687000000000

99983.1583414124000000000

99983.1605189572000000000

99983.1627033444000000000

99983.1692979800000000000

99983.1715101578000000000

99983.1737293904000000000

99983.1759557214000000000

99983.1781891948000000000

99983.1804298551000000000

99983.2325590018000000000

99983.2784693506000000000

99984.1164113154000000000

99984.1167578005000000000

99984.1290293883000000000

99984.1405635856000000000

99984.1514150653000000000

99984.1616332310000000000

99984.1987066124000000000

99984.2049288990000000000

99984.6421596405000000000

99985.0506858703000000000

99985.1065026751000000000

99985.7425293353000000000

99985.7456043256000000000

99985.7486938850000000000

99985.8799957050000000000

99985.8918001021000000000

99986.0036067922000000000

99986.0163781578000000000

99986.0284093637000000000

99986.0362028056000000000

99986.0397551119000000000

99986.0504648354000000000

99986.5805672649000000000

99986.5908405239000000000

99986.6006006520000000000

99986.8206430289000000000

99986.8239828836000000000

99986.8273411574000000000

99986.8307180474000000000

99986.8336975245000000000

99986.8341137537000000000

99986.9395424908000000000

99986.9570787376000000000

99986.9729798986000000000

99987.1063584039000000000

99987.2804998215000000000

99987.2814803568000000000

99987.2824628995000000000

99987.2834474572000000000

99987.2844340383000000000

99987.2854226507000000000

99987.2864133025000000000

99987.2874060019000000000

99987.2884007571000000000

99987.3135877017000000000


1 Answer


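This issue has been fixed in the SMOTE package for Weka 3.7.x.

The fix has also been applied to Weka 3.6. With Weka 3.7, the update can be installed through the package manager.

For more information:

SMOTE update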
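For readers who cannot update, here is a hedged sketch of what a contract-respecting comparator over {distance, payload} pairs could look like. It is not the actual Weka patch. The class and member names (DistanceSortSketch, BY_DISTANCE, sortByDistance) are hypothetical, and the lambda syntax assumes Java 8 or later.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class DistanceSortSketch {

    // Orders {distance, payload} pairs by distance. Double.compare returns only
    // the sign of the ordering, so compare(a, b) and compare(b, a) always agree.
    static final Comparator<Object[]> BY_DISTANCE =
            (o1, o2) -> Double.compare((Double) o1[0], (Double) o2[0]);

    // Returns a sorted copy and leaves the input list untouched.
    static List<Object[]> sortByDistance(List<Object[]> entries) {
        List<Object[]> sorted = new ArrayList<>(entries);
        sorted.sort(BY_DISTANCE);
        return sorted;
    }

    public static void main(String[] args) {
        List<Object[]> entries = new ArrayList<>();
        entries.add(new Object[] {0.0038432461241, "b"});
        entries.add(new Object[] {0.0012141773193, "a"});
        entries.add(new Object[] {0.0, "self"});

        // Entries come out in ascending distance order: self, a, b.
        for (Object[] entry : sortByDistance(entries)) {
            System.out.println(entry[0] + " " + entry[1]);
        }
    }
}
```

Comparing with Double.compare rather than casting a subtraction to int also avoids overflow and rounding surprises when distances are very large or very close.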

Answered 2013-04-04T07:12:44.973