I am trying to evaluate attributes for feature selection. I applied information.gain, gain.ratio and chi-squared, but some attributes come back with a value of NaN or 0.0000000.

> weights <- information.gain(Team1.Result~., df)
> print(weights)
               attr_importance
Sr..No.            0.000000000
Matchid            0.000000000
Team2              0.171564805
Margin             0.344871508
Toss               0.004552660
Bat                0.006355032
Ground             0.324758562
Date               0.674699370
Team1.BatRate      0.000000000
Team1.Bat_SR       0.000000000
Team1.BowlRate     0.144960767
Team1.Bowl_SR      0.000000000
Team2.BatRate      0.000000000
Team2.Bat_SR       0.000000000
Team2.BowlRate     0.161264860
Team2.Bowl_SR      0.161264860

The gain ratio gives:

> weights <- gain.ratio(Team1.Result~., df)
> print(weights)
               attr_importance
Sr..No.                    NaN
Matchid                    NaN
Team2              0.075884914
Margin             0.107668123
Toss               0.006675310
Bat                0.009171368
Ground             0.133481349
Date               0.175239871
Team1.BatRate              NaN
Team1.Bat_SR               NaN
Team1.BowlRate     0.266415653
Team1.Bowl_SR              NaN
Team2.BatRate              NaN
Team2.Bat_SR               NaN
Team2.BowlRate     0.283865166
Team2.Bowl_SR      0.283865166

Chi-squared gives:

> res <- chi.squared(Team1.Result~., df)
> res
               attr_importance
Sr..No.              0.0000000
Matchid              0.0000000
Team2                0.5168656
Margin               0.7149496
Toss                 0.0951519
Bat                  0.1125653
Ground               0.7022298
Date                 1.0000000
Team1.BatRate        0.0000000
Team1.Bat_SR         0.0000000
Team1.BowlRate       0.4553474
Team1.Bowl_SR        0.0000000
Team2.BatRate        0.0000000
Team2.Bat_SR         0.0000000
Team2.BowlRate       0.4823412
Team2.Bowl_SR        0.4823412

Some records from the data (I wanted to add an image, but the site would not let me):

Sr. No.  Matchid  Team2         Margin     BR  Toss  Bat  Ground         Date       Team1.BatRate  Team1.Bat_SR  Team1.BowlRate  Team1.Bowl_SR  Team2.BatRate  Team2.Bat_SR  Team2.BowlRate  Team2.Bowl_SR  Team1.Result
1        533280   New Zealand   13 runs    NA  1     1    Pallekele      23-Sep-12  18.96866667    114.3413333   20.67066667     15.27333333    17.10866667    111.3693333   13.97666667     12.14666667    1
2        533283   Bangladesh    8 wickets  8   0     2    Pallekele      25-Sep-12  14.41333333    111.9113333   23.82466667     17.00666667    17.10866667    111.3693333   13.97666667     12.14666667    1
3        533286   South Africa  2 wickets  2   0     2    Colombo (RPS)  28-Sep-12  17.10866667    111.3693333   13.97666667     12.14666667    21.862         116.5413333   21.29266667     15.46          1
4        533291   India         8 wickets  18  1     1    Colombo (RPS)  30-Sep-12  22.37          104.772       25.52333333     19.29333333    17.10866667    111.3693333   13.97666667     12.14666667    0
5        533294   Australia     32 runs    NA  0     1    Colombo (RPS)  2-Oct-12   18.36066667    114.2273333   22.80333333     18.42          17.10866667    111.3693333   13.97666667     12.14666667    1
6        533296   Sri Lanka     16 runs    NA  0     2    Colombo (RPS)  4-Oct-12   17.10866667    111.3693333   13.97666667     12.14666667    15.936         100.616       15.75333333     13.16          0
7        562438   Sri Lanka     23 runs    NA  1     1    Hambantota     3-Jun-12   14.425         98.111875     11.86875        10.33125       17.51142857    105.8635714   16.23214286     12.87857143    1

Is NaN a plausible result here? It does not look right to me. And can an attribute legitimately score 1, as Date does under chi-squared?
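
To make the question concrete, here is a minimal base-R sketch of how values like these could arise. It does not call the feature-selection package at all. It assumes, from my reading of the FSelector documentation (the functions above appear to come from that package), that information gain is H(Class) + H(Attribute) - H(Class, Attribute), that gain ratio divides this by H(Attribute), that chi.squared reports Cramer's V, and that numeric columns are discretized before scoring. The helper names (`entropy`, `info_gain`, `gain_ratio`, `cramers_v`) and the toy vectors are hypothetical, and returning 0 for a single-level attribute is a choice made for this sketch.

```r
# Shannon entropy (natural log) of a discrete vector.
entropy <- function(x) {
  p <- table(x) / length(x)
  p <- p[p > 0]
  -sum(p * log(p))
}

# Assumed definitions:
#   IG = H(class) + H(attr) - H(class, attr)
#   GR = IG / H(attr)
info_gain <- function(attr, cls) {
  entropy(cls) + entropy(attr) - entropy(paste(attr, cls))
}
gain_ratio <- function(attr, cls) info_gain(attr, cls) / entropy(attr)

# Cramer's V computed from the chi-squared statistic of the contingency table.
cramers_v <- function(attr, cls) {
  tab <- table(attr, cls)
  # Sketch-only choice: an attribute with a single level scores 0.
  if (min(dim(tab)) < 2) return(0)
  chi2 <- suppressWarnings(chisq.test(tab, correct = FALSE)$statistic)
  unname(sqrt(chi2 / (sum(tab) * (min(dim(tab)) - 1))))
}

# Toy data (hypothetical, not the cricket data set).
cls      <- factor(c(1, 1, 1, 0, 1, 0, 1))
one_bin  <- rep("all", 7)     # a numeric column collapsed into one interval
unique_v <- paste0("d", 1:7)  # one distinct value per row, like Date

ig_one <- info_gain(one_bin, cls)   # 0: a constant attribute carries no information
gr_one <- gain_ratio(one_bin, cls)  # 0 / 0, which R evaluates to NaN
v_one  <- cramers_v(one_bin, cls)   # 0 by the guard above
v_uniq <- cramers_v(unique_v, cls)  # 1 (up to rounding): each value pins down the class

print(c(ig_one = ig_one, gr_one = gr_one, v_one = v_one, v_uniq = v_uniq))
```

If those assumptions hold, a NaN gain ratio would simply mean the attribute ended up with a single value after discretization (0 divided by 0), and a score of 1 for Date would reflect that every row has its own date, so it behaves like a row identifier rather than a useful predictor.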
