I am building a convolutional network to classify EEG data. The data comes from an experiment in which participants were shown images from 3 different classes, each with 2 subclasses. To give a sense of the dataset size, each subclass has roughly 300 epochs per participant (this holds for all the subclasses). The 3 classes are:

  1. Object
  2. Color
  3. Number

Now my question is: I have 5 participants in my training dataset. I took 15% of each participant's data and put it in the test set. Can I consider that 15% as unseen data, even though the model was trained on other data from the same participants?

Any input is welcome!

1 Answer

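It depends on what you want to test. A test set is used to estimate generalization (i.e., performance on unseen data). So the question is:

  • Do you want to estimate generalization to unseen data from the same participants whose data was used to train the classifier?
  • Or do you want to estimate generalization to unseen participants (the general population)?

It really depends on your goal, or on the claim you want to make. I can think of cases for both approaches:

  • Think of a BCI that needs to be retrained for every user. Here, you would test on data from the same person.
  • On the other hand, if you are making a very general claim (e.g., "I can decode some relevant signal from a certain brain region across the population"), then a test set made up of participants who were not included in the training set would give much stronger support for your claim. (Whether this works is another question.)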
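To make the two options concrete, here is a minimal sketch in plain Python of both ways to split. It is only an illustration under assumptions: the function names and the `participant_ids` list are hypothetical, and the layout (5 participants, 6 subclasses, exactly 300 epochs each) is a rounded version of the numbers in the question. Neither split stratifies by class, which you may want to add.

```python
import random


def within_participant_split(participant_ids, test_fraction=0.15, seed=0):
    """Hold out a fraction of every participant's epochs.

    The same people appear in train and test, so the test score estimates
    generalization to new epochs from already-seen participants.
    """
    rng = random.Random(seed)
    by_participant = {}
    for idx, pid in enumerate(participant_ids):
        by_participant.setdefault(pid, []).append(idx)

    train_idx, test_idx = [], []
    for indices in by_participant.values():
        shuffled = indices[:]
        rng.shuffle(shuffled)
        n_test = round(len(shuffled) * test_fraction)
        test_idx.extend(shuffled[:n_test])
        train_idx.extend(shuffled[n_test:])
    return sorted(train_idx), sorted(test_idx)


def leave_one_participant_out(participant_ids):
    """Yield (held_out, train_idx, test_idx), one fold per participant.

    The held-out person contributes nothing to training, so the test score
    estimates generalization to unseen participants.
    """
    for held_out in sorted(set(participant_ids)):
        train_idx = [i for i, p in enumerate(participant_ids) if p != held_out]
        test_idx = [i for i, p in enumerate(participant_ids) if p == held_out]
        yield held_out, train_idx, test_idx


# Assumed layout: 5 participants x 6 subclasses x 300 epochs, one ID per epoch.
participant_ids = [p for p in range(5) for _ in range(6 * 300)]

train_idx, test_idx = within_participant_split(participant_ids)
folds = list(leave_one_participant_out(participant_ids))
```

With the first split, every participant shows up on both sides; with the second, each fold tests on one person the model never saw. If you use scikit-learn, `GroupShuffleSplit` and `LeaveOneGroupOut` in `sklearn.model_selection` do the participant-wise version when you pass the participant IDs as `groups`.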
answered 2018-11-02T08:12:05.180