python - Request for example: Recurrent neural network for predicting the next value in a sequence

Question

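Can anyone give me a practical example of a recurrent neural network in (pybrain) python that predicts the next value of a sequence? (I have read the pybrain documentation and I think it has no clear example of this.) I also found this question, but I fail to see how it works in a more general case. So I am asking whether anyone here could work out a clear example of how to predict the next value of a sequence in pybrain with a recurrent neural network.

To give an example:

Say we have a sequence of numbers in the range [1,7].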

First run (So first example): 1 2 4 6 2 3 4 5 1 3 5 6 7 1 4 7 1 2 3 5 6

Second run (So second example): 1 2 5 6 2 4 4 5 1 2 5 6 7 1 4 6 1 2 3 3 6

Third run (So third example): 1 3 5 7 2 4 6 7 1 3 5 6 7 1 4 6 1 2 2 3 7

and so on.

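Now given, for example, the start of a new sequence: 1 3 5 7 2 4 6 7 1 3

what is the next value (or values)?

This question might seem lazy, but I think there is no good, decent example of how to do this with pybrain.

Additionally: how can this be done if more than one feature is present?

Example:

Say we have several sequences in the range [1,7], each with 2 features.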

First run (So first example): feature1: 1 2 4 6 2 3 4 5 1 3 5 6 7 1 4 7 1 2 3 5 6
                              feature2: 1 3 5 7 2 4 6 7 1 3 5 6 7 1 4 6 1 2 2 3 7


Second run (So second example): feature1: 1 2 5 6 2 4 4 5 1 2 5 6 7 1 4 6 1 2 3 3 6
                                feature2: 1 2 3 7 2 3 4 6 2 3 5 6 7 2 4 7 1 3 3 5 6    

Third run (So third example): feature1: 1 3 5 7 2 4 6 7 1 3 5 6 7 1 4 6 1 2 2 3 7
                              feature2: 1 2 4 6 2 3 4 5 1 3 5 6 7 1 4 7 1 2 3 5 6

and so on.

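Now given, for example, the start of a new sequence:

feature1: 1 3 5 7 2 4 6 7 1 3

feature2: 1 2 3 7 2 3 4 6 2 4

what is the next value (or values)?

Feel free to use your own example, as long as it is similar to these examples and has some in-depth explanation.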

score 10 · Accepted Answer

Issam Laradji's answer worked for me to predict a sequence of sequences, except that my version of pybrain required a tuple for the UnsupervisedDataSet object:

from pybrain.tools.shortcuts import buildNetwork
from pybrain.supervised.trainers import BackpropTrainer
from pybrain.datasets import SupervisedDataSet,UnsupervisedDataSet
from pybrain.structure import LinearLayer
ds = SupervisedDataSet(21, 21)
ds.addSample(map(int,'1 2 4 6 2 3 4 5 1 3 5 6 7 1 4 7 1 2 3 5 6'.split()),map(int,'1 2 5 6 2 4 4 5 1 2 5 6 7 1 4 6 1 2 3 3 6'.split()))
ds.addSample(map(int,'1 2 5 6 2 4 4 5 1 2 5 6 7 1 4 6 1 2 3 3 6'.split()),map(int,'1 3 5 7 2 4 6 7 1 3 5 6 7 1 4 6 1 2 2 3 7'.split()))
net = buildNetwork(21, 20, 21, outclass=LinearLayer,bias=True, recurrent=True)
trainer = BackpropTrainer(net, ds)
trainer.trainEpochs(100)
ts = UnsupervisedDataSet(21,)
ts.addSample(map(int,'1 3 5 7 2 4 6 7 1 3 5 6 7 1 4 6 1 2 2 3 7'.split()))
[ int(round(i)) for i in net.activateOnDataset(ts)[0]]

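gives:

=> [1, 2, 5, 6, 2, 4, 5, 6, 1, 2, 5, 6, 7, 1, 4, 6, 1, 2, 2, 3, 6]

To predict smaller sequences, train the network on them directly, either as subsequences or as overlapping sequences (overlapping shown here):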

from pybrain.tools.shortcuts import buildNetwork
from pybrain.supervised.trainers import BackpropTrainer
from pybrain.datasets import SupervisedDataSet,UnsupervisedDataSet
from pybrain.structure import LinearLayer
ds = SupervisedDataSet(10, 11)
z = map(int,'1 2 4 6 2 3 4 5 1 3 5 6 7 1 4 7 1 2 3 5 6 1 2 5 6 2 4 4 5 1 2 5 6 7 1 4 6 1 2 3 3 6 1 3 5 7 2 4 6 7 1 3 5 6 7 1 4 6 1 2 2 3 7'.split())
obsLen = 10
predLen = 11
for i in xrange(len(z)):
  if i+(obsLen-1)+predLen < len(z):
    ds.addSample([z[d] for d in range(i,i+obsLen)],[z[d] for d in range(i+1,i+1+predLen)])

net = buildNetwork(10, 20, 11, outclass=LinearLayer,bias=True, recurrent=True)
trainer = BackpropTrainer(net, ds)
trainer.trainEpochs(100)
ts = UnsupervisedDataSet(10,)
ts.addSample(map(int,'1 3 5 7 2 4 6 7 1 3'.split()))
[ int(round(i)) for i in net.activateOnDataset(ts)[0]]

gives:

=> [3, 5, 6, 2, 4, 5, 6, 1, 2, 5, 6]

Not too good...
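The overlapping-window loop above can be isolated as a small pure-Python helper, which makes it easier to inspect what the network is actually trained on. This is only a sketch: `make_windows` is a hypothetical name, not part of pybrain, and it uses Python 3's `range` where the answer uses `xrange`.

```python
def make_windows(values, obs_len, pred_len):
    """Return (inputs, targets) pairs of overlapping windows.

    Each input holds obs_len consecutive values. Its target holds the
    pred_len values starting one step after the input starts, which is
    the same indexing as the loop in the answer above.
    """
    pairs = []
    # Stop once the target window would run past the end of the data.
    for i in range(len(values) - (obs_len - 1) - pred_len):
        inputs = values[i:i + obs_len]
        targets = values[i + 1:i + 1 + pred_len]
        pairs.append((inputs, targets))
    return pairs


# The three runs from the question, joined into one long series.
run_texts = [
    '1 2 4 6 2 3 4 5 1 3 5 6 7 1 4 7 1 2 3 5 6',
    '1 2 5 6 2 4 4 5 1 2 5 6 7 1 4 6 1 2 3 3 6',
    '1 3 5 7 2 4 6 7 1 3 5 6 7 1 4 6 1 2 2 3 7',
]
z = [int(token) for text in run_texts for token in text.split()]

pairs = make_windows(z, obs_len=10, pred_len=11)
first_inputs, first_targets = pairs[0]
```

Note that with this indexing each target repeats the last nine input values and adds only two unseen ones, so most of the network's output is a shifted copy of its input. Starting the target at `i + obs_len` instead would be one possible variant if only unseen values should be predicted.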

score 4 · Accepted Answer

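These steps are meant to do what you ask for in the first part of the question.

1) Create a supervised dataset, which expects a sample and a target as its arguments,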

ds = SupervisedDataSet(21, 21)
# add samples (this can be done automatically)
ds.addSample(map(int,'1 2 4 6 2 3 4 5 1 3 5 6 7 1 4 7 1 2 3 5 6'.split()),map(int,'1 2 5 6 2 4 4 5 1 2 5 6 7 1 4 6 1 2 3 3 6'.split()))
ds.addSample(map(int,'1 2 5 6 2 4 4 5 1 2 5 6 7 1 4 6 1 2 3 3 6'.split()),map(int,'1 3 5 7 2 4 6 7 1 3 5 6 7 1 4 6 1 2 2 3 7'.split()))

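Each succeeding sample y is the target, or label, of its predecessor x. We pass the number 21 because each sample has 21 numbers, or features.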
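The pairing rule just described (every run is the target of the run before it) can be written out without pybrain, which is what "this can be done automatically" might look like. This is a minimal sketch: `parse_run` and `pair_consecutive` are hypothetical helper names. The runs are parsed into explicit lists because `map` returns an iterator on Python 3, whereas the original code relies on Python 2, where it returns a list.

```python
def parse_run(text):
    """Turn a space-separated run such as '1 2 4' into a list of ints."""
    return [int(token) for token in text.split()]


def pair_consecutive(runs):
    """Pair each run x with the run that follows it, y."""
    return list(zip(runs[:-1], runs[1:]))


runs = [
    parse_run('1 2 4 6 2 3 4 5 1 3 5 6 7 1 4 7 1 2 3 5 6'),
    parse_run('1 2 5 6 2 4 4 5 1 2 5 6 7 1 4 6 1 2 3 3 6'),
    parse_run('1 3 5 7 2 4 6 7 1 3 5 6 7 1 4 6 1 2 2 3 7'),
]

# Three runs give two (x, y) training pairs, matching the two
# ds.addSample(x, y) calls in the answer.
pairs = pair_consecutive(runs)
```

Each `(x, y)` in `pairs` would then be passed to `ds.addSample(x, y)` in place of the hand-written calls.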

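Note that, to follow standard notation for the second half of the question, it is better to call feature1 and feature2 sample1 and sample2 of a sequence, and to let features denote the numbers within a sample.

2) Create the network, initialize the trainer, and run for 100 epochs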

net = buildNetwork(21, 20, 21, outclass=LinearLayer,bias=True, recurrent=True)
trainer = BackpropTrainer(net, ds)
trainer.trainEpochs(100)

Make sure to set the recurrent argument to True.

3) Create the test data

ts = UnsupervisedDataSet(21, 21)
#add the sample to be predicted
ts.addSample(map(int,'1 3 5 7 2 4 6 7 1 3 5 6 7 1 4 6 1 2 2 3 7'.split()))

We created an unsupervised dataset because we assume that we have no labels or targets.

4) Predict the test sample using the trained network

net.activateOnDataset(ts)

This should display the values of the expected fourth run.

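For the second case, where a sequence can have more than one sample, create a sequential dataset instead of a supervised one: ds = SequentialDataSet(21,21). Then, every time you get a new sequence, call ds.newSequence() and add the samples (which you call features) to it using ds.addSample().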
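The answer does not say which target each sample gets inside a sequence. The sketch below assumes the same rule as in step 1, namely that the target of a row is the row that follows it within the same run. It is written in plain Python so the grouping can be checked on its own. `sequences_to_pairs` is a hypothetical helper, and the comments mark where the pybrain calls named above would go.

```python
def parse_run(text):
    """Turn a space-separated run into a list of ints."""
    return [int(token) for token in text.split()]


# One entry per run; each run holds its rows (feature1, feature2).
runs = [
    [parse_run('1 2 4 6 2 3 4 5 1 3 5 6 7 1 4 7 1 2 3 5 6'),
     parse_run('1 3 5 7 2 4 6 7 1 3 5 6 7 1 4 6 1 2 2 3 7')],
    [parse_run('1 2 5 6 2 4 4 5 1 2 5 6 7 1 4 6 1 2 3 3 6'),
     parse_run('1 2 3 7 2 3 4 6 2 3 5 6 7 2 4 7 1 3 3 5 6')],
    [parse_run('1 3 5 7 2 4 6 7 1 3 5 6 7 1 4 6 1 2 2 3 7'),
     parse_run('1 2 4 6 2 3 4 5 1 3 5 6 7 1 4 7 1 2 3 5 6')],
]


def sequences_to_pairs(runs):
    """For every run, pair each row with the row that follows it."""
    grouped = []
    for rows in runs:
        # With pybrain, ds.newSequence() would be called here.
        pairs = list(zip(rows[:-1], rows[1:]))
        # Each (x, y) would then be passed to ds.addSample(x, y).
        grouped.append(pairs)
    return grouped


grouped = sequences_to_pairs(runs)
```

With two rows per run, each sequence contributes a single (feature1, feature2) pair. Runs with more rows would yield longer sequences under the same assumption.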

Hope this is clear :)

If you would like the full code, to save you the trouble of importing the libraries, please let me know.
