machine-learning - Anomaly detection with machine learning without labels

Question

I track multiple signals over a period of time and associate them with timestamps, like this:

t0 1 10 2 0 1 0 ...
t1 1 10 2 0 1 0 ...
t2 3  0 9 7 1 1 ... // pressed a button to change the mode
t3 3  0 9 7 1 1 ...
t4 3  0 8 7 1 1 ... // pressed a button to adjust a certain characteristic such as temperature (signal 3)

where t0 is the timestamp, 1 is the value of signal 1, 10 is the value of signal 2, and so on.

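The data captured during a certain period of time should be treated as the normal case. Significant deviations from this normal case should then be detected. By a significant deviation I do not mean that a single signal value merely changes to a value not seen during the tracking phase, but that many values change which have not previously been related to each other. I do not want to hard-code rules, because more signals may be added or removed in the future, and other "modes" with other signal values may be implemented.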

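Can this be achieved with some kind of machine learning algorithm? If a small deviation occurs, I would like the algorithm to treat it at first as a minor change relative to the training set, and to "learn" it if it occurs several times in the future. The main goal is to detect the larger changes/anomalies.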

I hope I have explained my problem in enough detail. Thanks in advance.

score 1 · Accepted Answer

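You could simply compute the nearest neighbors in feature space and set a threshold for how far away they may be from your test point before that point counts as an anomaly.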

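Suppose you have 100 values in your "certain period of time".

Then you use a 100-dimensional feature space together with your training data (which contains no anomalies).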

When you get a new sample you want to test, you find its k nearest neighbor(s) in the training data and calculate the distance (e.g. Euclidean) to them in your feature space.
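The distance computation can be sketched as follows. This is a minimal illustration rather than the answerer's own code: the function name `knn_distance` is hypothetical, each timestamp row is treated as one point for brevity (a window of values flattened into one vector would work the same way), and only the first six signals from the question's trace are used.

```python
import math

def knn_distance(train, point, k=1):
    """Mean Euclidean distance from `point` to its k nearest training rows."""
    dists = sorted(math.dist(row, point) for row in train)
    return sum(dists[:k]) / k

# Rows from the question's trace (first six signals), all assumed to be normal.
train = [
    (1, 10, 2, 0, 1, 0),
    (1, 10, 2, 0, 1, 0),
    (3, 0, 9, 7, 1, 1),
    (3, 0, 9, 7, 1, 1),
    (3, 0, 8, 7, 1, 1),
]

# A row identical to a training row scores 0.
seen = knn_distance(train, (3, 0, 8, 7, 1, 1))
# A made-up row where many signals changed at once scores much higher.
odd = knn_distance(train, (9, 5, 0, 3, 0, 1))
```

Because the signals may live on very different scales, it is probably worth normalizing each column before computing distances; otherwise one wide-ranging signal can dominate the result.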

If that distance is larger than a certain threshold, the sample is an anomaly. To tune the detector you have to find a good k and a good threshold, for example by grid search.
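A grid search over k and the threshold might look like the sketch below. It assumes a small validation set with known normal/anomalous labels, which the question does not necessarily have; the helper names, the validation rows, and the candidate grids are all hypothetical and chosen only for illustration.

```python
import math
from itertools import product

def knn_distance(train, point, k):
    """Mean Euclidean distance from `point` to its k nearest training rows."""
    dists = sorted(math.dist(row, point) for row in train)
    return sum(dists[:k]) / k

def grid_search(train, validation, ks, thresholds):
    """Return (accuracy, k, threshold) for the first best-scoring combination."""
    best = None
    for k, threshold in product(ks, thresholds):
        # A prediction is correct when "distance > threshold" matches the label.
        correct = sum(
            (knn_distance(train, point, k) > threshold) == is_anomaly
            for point, is_anomaly in validation
        )
        accuracy = correct / len(validation)
        if best is None or accuracy > best[0]:
            best = (accuracy, k, threshold)
    return best

# Normal rows (first four signals of the question's trace).
train = [(1, 10, 2, 0), (1, 10, 2, 0), (3, 0, 9, 7), (3, 0, 9, 7), (3, 0, 8, 7)]

# Hypothetical labeled validation rows: (row, is_anomaly).
validation = [
    ((3, 0, 8, 7), False),   # seen during training
    ((1, 10, 3, 0), False),  # small change in one signal
    ((9, 5, 0, 3), True),    # many signals changed at once
    ((0, 0, 0, 0), True),
]

best_accuracy, best_k, best_threshold = grid_search(
    train, validation, ks=[1, 2, 3], thresholds=[0.5, 2.0, 5.0]
)
```

If no labeled anomalies are available at all, one alternative is to derive the threshold from the distances observed among the normal training rows themselves, for instance a high quantile of them.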

(1) Note that something like this probably only works well if your data has a fixed starting and ending point. Otherwise you would need a huge amount of data, and even then it will not perform as well.

(2) Note that it may be worth building a separate detector for every "mode" you mentioned in your question.
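One way to keep a separate detector per mode is sketched below. It assumes that the mode can be read directly from one signal (here the first column, which changes from 1 to 3 in the question's trace when the mode button is pressed); that assumption, the class name `PerModeDetector`, and the default threshold are illustrative and not taken from the question.

```python
import math
from collections import defaultdict

class PerModeDetector:
    """Nearest-neighbor detector that keeps separate reference rows per mode."""

    def __init__(self, mode_index=0, k=1, threshold=2.0):
        self.mode_index = mode_index  # column assumed to identify the mode
        self.k = k
        self.threshold = threshold
        self.rows_by_mode = defaultdict(list)

    def fit(self, rows):
        # Group the normal training rows by their mode value.
        for row in rows:
            self.rows_by_mode[row[self.mode_index]].append(row)

    def is_anomaly(self, row):
        reference = self.rows_by_mode.get(row[self.mode_index])
        if not reference:
            return True  # a mode never seen in training is flagged outright
        dists = sorted(math.dist(r, row) for r in reference)
        k = min(self.k, len(dists))
        return sum(dists[:k]) / k > self.threshold

detector = PerModeDetector()
detector.fit([
    (1, 10, 2, 0, 1, 0),
    (1, 10, 2, 0, 1, 0),
    (3, 0, 9, 7, 1, 1),
    (3, 0, 9, 7, 1, 1),
    (3, 0, 8, 7, 1, 1),
])

normal = detector.is_anomaly((3, 0, 8, 7, 1, 1))      # seen in mode 3
mixed = detector.is_anomaly((3, 10, 2, 0, 1, 0))      # mode 3 with mode-1 values
new_mode = detector.is_anomaly((7, 0, 0, 0, 0, 0))    # unseen mode
```

The `mixed` row shows why the split can help: it lies close to the mode-1 training rows, so a single global detector could accept it, while the mode-3 detector compares it only against mode-3 rows.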
