
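I am trying to open an ARFF file in Weka, but I get two errors.

First, the file is not recognised as an 'Arff data files' file. Reason: premature end of file, read Token[EOL], line 3267.

Second, if I click "Use converter" (with "?" as the missing value),

CSVLoader fails to load the file. Reason: wrong number of values. Read 2, expected 1, read Token[EOF], line 3267.

The file is: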

https://www.dropbox.com/s/xs0ssnvs42bik5c/sg.arff


1 Answer


Every ARFF file should have commas between the values in its data rows, and yours does not. Are you sure this is a valid ARFF file?
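To see which rows trip the loader, a quick check outside Weka can help. The following is only a minimal sketch in Python, not part of Weka: the function name `check_row_widths` and the sample text are made up for illustration, and it assumes dense (non-sparse) rows with no quoted values that contain commas.

```python
def check_row_widths(arff_text):
    """Return (line_number, found, expected) for each data row whose
    number of comma-separated values differs from the attribute count."""
    n_attributes = 0
    in_data = False
    problems = []
    for line_no, raw in enumerate(arff_text.splitlines(), start=1):
        line = raw.strip()
        # Skip blank lines and ARFF comments.
        if not line or line.startswith("%"):
            continue
        lower = line.lower()
        if not in_data:
            if lower.startswith("@attribute"):
                n_attributes += 1
            elif lower.startswith("@data"):
                in_data = True
            continue
        # Naive split: assumes no quoted values containing commas.
        found = len(line.split(","))
        if found != n_attributes:
            problems.append((line_no, found, n_attributes))
    return problems


# Hypothetical two-attribute file; the last row uses a space instead of a comma.
sample = """@relation demo
@attribute campus real
@attribute Class {term,score}
@data
0.0,term
0.5 score
"""

print(check_row_widths(sample))  # [(6, 1, 2)]
```

Each tuple gives the line number, the number of values found, and the number expected, which is roughly the information in Weka's "wrong number of values" message.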

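Your ARFF file is not valid. Your attributes are duplicated; each attribute must be declared only once. For example, running the following command on your file reports the duplicate attribute names: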

set CLASSPATH=.;d:\tools\Weka-3-7\weka.jar
d:\atilla\downloads>java weka.core.Instances sg.arff
java.io.IOException: Unable to determine structure as arff (Reason: java.lang.IllegalArgumentException: Attribute names are not unique! Causes: 'campus' 'friend' 'homework' 'people' 'people' 'do' 'work' 'work' 'study' 'campus' 'people' 'people' 'life' 'learn' 'study' 'learn' 'put' 'study' 'learn' 'institute' 'get' 'put
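If you want to list the repeated names yourself before cleaning up the header, something like the following could work. It is a rough Python sketch rather than anything Weka provides: `attribute_names` and `duplicate_attributes` are hypothetical helpers, the sample header is invented, and the parsing assumes unquoted attribute names without spaces.

```python
from collections import Counter


def attribute_names(arff_text):
    """Collect attribute names from the header, in declaration order."""
    names = []
    for raw in arff_text.splitlines():
        line = raw.strip()
        lower = line.lower()
        if lower.startswith("@data"):
            break  # the header ends where the data section starts
        if lower.startswith("@attribute"):
            # "@attribute <name> <type>": the name is the second token.
            parts = line.split(None, 2)
            if len(parts) >= 2:
                names.append(parts[1])
    return names


def duplicate_attributes(arff_text):
    """Return each name declared more than once, in first-seen order."""
    counts = Counter(attribute_names(arff_text))
    return [name for name, n in counts.items() if n > 1]


# Hypothetical header with two repeated names.
header = """@relation demo
@attribute campus real
@attribute people real
@attribute people real
@attribute campus real
@attribute Class {term,score}
@data
"""

print(duplicate_attributes(header))  # ['campus', 'people']
```

Once every name in that list is declared only once (and the data rows are adjusted to match), the header should pass Weka's uniqueness check.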

Below is a valid ARFF file built from your file.

@relation sg-test
@attribute campus real
@attribute utilitarian real
@attribute put real
@attribute much real
@attribute make real
@attribute look real
@attribute nice real
@attribute people real
@attribute busy real
@attribute have real
@attribute real real
@attribute friendship real
@attribute institute real
@attribute end real
@attribute pick real
@attribute homework real
@attribute friend real
@attribute lose real
@attribute way real
@attribute crushed real
@attribute lie real
@attribute say real
@attribute do real
@attribute work real
@attribute time real
@attribute type real
@attribute study real
@attribute room real
@attribute many real
@attribute great real
@attribute place real
@attribute go real
@attribute city real
@attribute dull real
@attribute Class {term,score}
@data 
0.0,0.041666666666666664,-0.019185326611942655,0.005523215037172114,0.0,0.012052341597796145,0.02062568512992925,0.0,-0.030000000000000006,0.708941605839416,0.0,0.12317518248175183,0.05020802460556254,-0.019147145462196667,0.125,0.0,0.0,-0.06617570128224504,0.0,0.10948905109489052,0.10948905109489052,0.0,-0.3490625485300618,0.00402808616500622,0.0,-0.125,0.0,-0.028925619834710748,0.006898734933282365,-0.019185326611942655,0.015740237951508994,0.015740237951508994,0.12091857471887278,0.0,term

When I run the same command on this file, Weka reports the following.

Relation Name:  sg-test
Num Instances:  1
Num Attributes: 35

     Name                      Type  Nom  Int Real     Missing      Unique  Dist
   1 campus                     Num   0% 100%   0%     0 /  0%     1 /100%     1 
   2 utilitarian                Num   0%   0% 100%     0 /  0%     1 /100%     1 
   3 put                        Num   0%   0% 100%     0 /  0%     1 /100%     1 
   4 much                       Num   0%   0% 100%     0 /  0%     1 /100%     1 
   5 make                       Num   0% 100%   0%     0 /  0%     1 /100%     1 
   6 look                       Num   0%   0% 100%     0 /  0%     1 /100%     1 
   7 nice                       Num   0%   0% 100%     0 /  0%     1 /100%     1 
   8 people                     Num   0% 100%   0%     0 /  0%     1 /100%     1 
   9 busy                       Num   0%   0% 100%     0 /  0%     1 /100%     1 
  10 have                       Num   0%   0% 100%     0 /  0%     1 /100%     1 
  11 real                       Num   0% 100%   0%     0 /  0%     1 /100%     1 
  12 friendship                 Num   0%   0% 100%     0 /  0%     1 /100%     1 
  13 institute                  Num   0%   0% 100%     0 /  0%     1 /100%     1 
  14 end                        Num   0%   0% 100%     0 /  0%     1 /100%     1 
  15 pick                       Num   0%   0% 100%     0 /  0%     1 /100%     1 
  16 homework                   Num   0% 100%   0%     0 /  0%     1 /100%     1 
  17 friend                     Num   0% 100%   0%     0 /  0%     1 /100%     1 
  18 lose                       Num   0%   0% 100%     0 /  0%     1 /100%     1 
  19 way                        Num   0% 100%   0%     0 /  0%     1 /100%     1 
  20 crushed                    Num   0%   0% 100%     0 /  0%     1 /100%     1 
  21 lie                        Num   0%   0% 100%     0 /  0%     1 /100%     1 
  22 say                        Num   0% 100%   0%     0 /  0%     1 /100%     1 
  23 do                         Num   0%   0% 100%     0 /  0%     1 /100%     1 
  24 work                       Num   0%   0% 100%     0 /  0%     1 /100%     1 
  25 time                       Num   0% 100%   0%     0 /  0%     1 /100%     1 
  26 type                       Num   0%   0% 100%     0 /  0%     1 /100%     1 
  27 study                      Num   0% 100%   0%     0 /  0%     1 /100%     1 
  28 room                       Num   0%   0% 100%     0 /  0%     1 /100%     1 
  29 many                       Num   0%   0% 100%     0 /  0%     1 /100%     1 
  30 great                      Num   0%   0% 100%     0 /  0%     1 /100%     1 
  31 place                      Num   0%   0% 100%     0 /  0%     1 /100%     1 
  32 go                         Num   0%   0% 100%     0 /  0%     1 /100%     1 
  33 city                       Num   0%   0% 100%     0 /  0%     1 /100%     1 
  34 dull                       Num   0% 100%   0%     0 /  0%     1 /100%     1 
  35 Class                      Nom 100%   0%   0%     0 /  0%     1 /100%     1 
Answered 2014-04-03T08:23:56.713