
I have 155 images and 8 classes; the features are not scaled into the [0, 1] range.
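
(For reference, the grid search mentioned just below can be set up roughly like this; this is a minimal sketch, and the parameter grid, number of folds and import path are my assumptions rather than the exact search that produced these scores:)

from sklearn.svm import SVC
from sklearn.grid_search import GridSearchCV  # lives in sklearn.model_selection in recent releases

param_grid = {'kernel': ['linear', 'rbf'], 'C': [1, 10, 100, 1000]}  # hypothetical grid
grid = GridSearchCV(SVC(), param_grid, cv=5)
grid.fit(features, target)
print(grid.best_params_)  # e.g. {'kernel': 'linear', 'C': 1000}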

Grid-search cross-validation suggested a linear kernel with C = 1000, with these scores:

         precision    recall  f1-score   support

      1       0.54      0.88      0.67         8
      2       0.73      1.00      0.84         8
      3       1.00      1.00      1.00         6
      4       0.75      0.50      0.60        12
      5       0.83      0.83      0.83         6
      6       0.92      0.65      0.76        17
      7       0.71      0.42      0.53        12
      8       0.60      1.00      0.75         9

avg / total       0.77      0.73      0.72        78 

But when I try the linear kernel with C = 1000 myself, I get:

         precision    recall  f1-score   support

      1       0.00      0.00      0.00         0
      2       1.00      0.70      0.82        10
      3       1.00      1.00      1.00        13
      4       0.73      0.58      0.65        19
      5       1.00      0.95      0.97        19
      6       0.96      0.88      0.92        25
      7       0.82      0.67      0.73        27
      8       0.70      1.00      0.82        16

avg / total       0.88      0.81      0.84       129


Confusion matrix:
[[ 0  0  0  0  0  0  0  0]
 [ 0  7  0  0  0  0  3  0]
 [ 0  0 13  0  0  0  0  0]
 [ 2  0  0 11  0  1  0  5]
 [ 0  0  0  1 18  0  0  0]
 [ 0  0  0  0  0 22  1  2]
 [ 6  0  0  3  0  0 18  0]
 [ 0  0  0  0  0  0  0 16]]

Why is class 1 all zeros?

I also found that the rbf kernel gives me the best results, but class 1 is still all zeros:

         precision    recall  f1-score   support

      1       0.00      0.00      0.00         0
      2       1.00      1.00      1.00        10
      3       1.00      1.00      1.00        13
      4       0.94      0.89      0.92        19
      5       1.00      0.95      0.97        19
      6       0.93      1.00      0.96        25
      7       1.00      0.78      0.88        27
      8       1.00      1.00      1.00        16

avg / total       0.98      0.93      0.95       129


Confusion matrix:
[[ 0  0  0  0  0  0  0  0]
 [ 0 10  0  0  0  0  0  0]
 [ 0  0 13  0  0  0  0  0]
 [ 1  0  0 17  0  1  0  0]
 [ 0  0  0  1 18  0  0  0]
 [ 0  0  0  0  0 25  0  0]
 [ 5  0  0  0  0  1 21  0]
 [ 0  0  0  0  0  0  0 16]]

Finally, when I try to predict one of the same images used for training,

print(clf.predict(fv))

where fv is the image's feature vector:

[0.16666666666628771, 5.169878828456423e-26, 2.3475644278196356e-21, 1.0, 1.0000000000027285]

the wrong class is assigned to the feature vector! (i.e. the image belongs to class 4, but predict() returns class 5)
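
(Side note: recent scikit-learn releases require predict to be given a 2-D array of shape (n_samples, n_features), so a single feature vector should be wrapped in a list, as the prediction test in the full code below already does:)

print(clf.predict([fv]))   # one sample passed as a 1 x n_features array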

Update

Image set: https://docs.google.com/file/d/0ByS6Z5WRz-h2V3RkejFkb21Fb0E/edit?usp=sharing

Features of the image set: https://docs.google.com/file/d/0ByS6Z5WRz-h2YlhuUmFBaElXVEE/edit?usp=sharing

Full code:

import os
import glob
import numpy as np
from numpy import array
import cv2

target = [      1,1,1,1,
          1,1,1,1,1,1,1,
          1,1,1,1,1,1,1,
          1,2,2,2,2,2,2,
          2,2,2,2,2,2,2,
          2,2,2,2,3,3,3,
          3,3,3,3,3,3,3,
          3,3,3,4,4,4,4,
          4,4,4,4,4,4,4,
          4,4,4,4,4,4,4,        
          4,5,5,5,5,5,5,
          5,5,5,5,5,5,5,
          5,5,5,5,5,5,6,
          6,6,6,6,6,6,6,
          6,6,6,6,6,6,6,
          6,6,6,6,6,6,6,
          6,6,6,7,7,7,7,               
          7,7,7,7,7,7,7,
          7,7,7,7,7,7,7,
          7,7,7,7,7,7,7,                  
          7,7,8,8,8,8,8,
          8,8,8,8,8,8,8,
          8,8,8,8]



features = [ [0.26912666717306399, 0.012738398606387012, 0.011347858467581035, 0.1896938013442868, 2.444553429782046]
,
[0.36793086934925351, 0.034364344308391102, 0.019054536791551006, 0.0076875387476751395, 3.03091214703604]
,
[0.36793086934925351, 0.034364344308391102, 0.019054536791551006, 0.0076875387476751395, 3.03091214703604]
,
[0.30406240228443038, 0.047100329090555518, 0.0049653458889261448, 0.0004618404341300081, 5.987025009738751]
,
[0.36660353297714748, 0.034256126367653919, 0.01892501331178556, 0.007723901183105499, 3.0392760101225234]
,
[0.26708884220978957, 0.012126741224471632, 0.0063753119877062942, 0.0005937801528983894, 2.403113171408598]
,
[0.27070254516425241, 0.01293684867974746, 0.01159661796151442, 0.008380724334031727, 2.4492688425144986]
,
[0.27076540467770038, 0.012502407901054009, 0.011180048331833999, 0.0007116977225672878, 2.4068989750876266]
,
[0.22832314403919951, 0.010491475428909061, 0.0027317652016312271, 0.001417434443656981, 2.6271926274711968]
,
[0.22374814412737717, 0.0095258889624651646, 0.0040833924467236719, 0.1884906960716747, 2.5474055920602514]
,
[0.23860556210266026, 0.0067860933136106557, 0.0052050705189953389, 0.01498751040799334, 2.0545849084769694]
,
[0.32849751530034654, 0.0082079572128769367, 0.017950580842136479, 0.07211170619739862, 1.761646715256231]
,
[0.3536962871782694, 0.04335618127793292, 0.0084705562859388305, 0.003939815915497741, 3.8626463078353632]
,
[0.23642964900011443, 0.0060530993708264348, 0.0041172882106328976, 0.003276003276003276, 1.9809324414862304]
,
[0.35468301957048581, 0.043735489028639378, 0.0085420200506240735, 0.00041124057573680605, 3.873602628153773]
,
[0.35549112610207528, 0.043992218599656373, 0.0086354414147218166, 0.004276259969455286, 3.8781644572829106]
,
[0.97303451800669749, 0.075165987107118692, 0.23350656471824954, 0.04989418850724402, 1.7845923298199189]
,
[0.32292438991638828, 0.0078312712861588109, 0.018256154769458615, 0.05861489639723726, 1.754975905310628]
,
[0.36415716731096714, 0.033783635359516562, 0.0087048690616182353, 0.0007989674881691353, 3.0382507494699778]
,
[0.23247799686964493, 0.023970481957641395, 0.0020180739588722754, 0.2511737089201878, 4.987537342956105]
,
[0.25249755819322928, 0.03355835554037629, 0.0024745974458906918, 0.49168600154679043, 6.286228850887637]
,
[0.25524836990657951, 0.035216193154545015, 0.0023524820730296808, 0.49272798742138363, 6.553001816315555]
,
[0.25226043727172792, 0.033580607886770704, 0.002399474603048905, 0.4913428241631397, 6.310803986284148]
,
[0.2552359153348957, 0.034993472521483299, 0.0024465696242431606, 0.49311565696302123, 6.488164071764478]
,
[0.25249755819322928, 0.03355835554037629, 0.0024745974458906918, 0.49168600154679043, 6.286228850887637]
,
[0.19296658297366265, 0.0073667093687413854, 0.0010128002719554498, 0.20292887029288703, 2.6022382484976103]
,
[0.23130715659438109, 0.023652143308649062, 0.0020734509865596379, 0.2519981194170193, 4.96809084167716]
,
[0.23646940610897133, 0.025909457534721684, 0.0019634358569802723, 0.25097465886939574, 5.263654156113397]
,
[0.61892415483059771, 0.1855733578950316, 0.024118739298890277, 0.00010742003920831431, 5.579333799263049]
,
[0.61892415483059771, 0.1855733578950316, 0.024118739298890277, 0.00010742003920831431, 5.579333799263049]
,
[0.62187109165606835, 0.18810005977070685, 0.060143785970969831, 0.005752046658462197, 5.609811692923419]
,
[0.64410628333823972, 0.20178318336365086, 0.039546324622261202, 8.006565383614564e-05, 5.609490756132282]
,
[0.6214309265075304, 0.18779664186718673, 0.061337975720487534, 0.006350402281839464, 5.608301926807521]
,
[0.20135445416653119, 0.0070220507238874311, 0.0027092098815647042, 0.4125833006664053, 2.4256545571324732]
,
[0.20123494853445922, 0.0069845347246147793, 0.0027020357704780201, 0.4106724003127443, 2.420576584506546]
,
[0.2015816556223165, 0.0070631416111702362, 0.0025149608542164329, 0.4106073986851143, 2.4300340608128606]
,
[0.70115857527896985, 0.35625759453714789, 0.028386898853323388, 0.001234186979327368, 12.446918085552586]
,
[0.68366020888533297, 0.2387861974848598, 0.04047049559400958, 0.0725675987982436, 6.011803834536788]
,
[0.70115857527896985, 0.35625759453714789, 0.028386898853323388, 0.001234186979327368, 12.446918085552586]
,
[0.71378846605495283, 0.37185054375086962, 0.078338189105938844, 0.4899937460913071, 12.727628852581882]
,
[0.72219309919241148, 0.37567368174335658, 0.029371875736917675, 0.48066298342541436, 12.21840343375]
,
[0.84033907078880576, 0.29025638999406633, 0.090118665350957639, 0.00013319126265316994, 4.572824986179928]
,
[0.84033907078880576, 0.29025638999406633, 0.090118665350957639, 0.00013319126265316994, 4.572824986179928]
,
[0.84078478547550572, 0.28881268265635862, 0.092759120470064349, 0.0005334044539271903, 4.542932448095888]
,
[0.86195880470328134, 0.31149212664075476, 0.090341088591145105, 0.00044657097288676234, 4.673692966632184]
,
[0.85542893012496013, 0.29898764801731947, 0.17279563533793374, 0.0005314202205393915, 4.543371196521408]
,
[0.68653873117620423, 0.24135977292901584, 0.031609483792605572, 0.4553053169259345, 6.032229402405299]
,
[0.68937407444389065, 0.2429428175127194, 0.031783181019183315, 0.07118412046543464, 6.017180801429501]
,
[0.66262362984605561, 0.22830191525650573, 0.027222059698182095, 0.4712353884941554, 6.170703008647743]
,
[0.85191326598415906, 0.0066280315423251869, 0.18568977018064967, 0.24070082098793744, 1.211324246965761]
,
[0.41763663758743241, 0.0042550997098748248, 0.01052268995786553, 0.000998003992015968, 1.3702049090803978]
,
[0.47955540731641061, 0.036031336698149265, 0.0037552308556160824, 0.41911764705882354, 2.3102900509255964]
,
[0.28510645493450759, 0.017800467984914338, 0.0013560744373383752, 0.6212718064153067, 2.7591153064421485]
,
[0.28093855472961832, 0.017019535454492932, 0.0025233674347249074, 0.6243626062322947, 2.733908520445971]
,
[0.28510645493450759, 0.017800467984914338, 0.0013560744373383752, 0.6212718064153067, 2.7591153064421485]
,
[0.29957424000441979, 0.020997289413265056, 0.0032514165703168524, 0.002352941176470588, 2.8737257187232768]
,
[0.28093855472961832, 0.017019535454492932, 0.0025233674347249074, 0.6243626062322947, 2.733908520445971]
,
[0.94384505611284442, 0.0070361165614443756, 0.17778161251377933, 0.00013138014845956775, 1.1950816827585424]
,
[1.2480442396269933, 0.013169393067805945, 0.37414805554448649, 0.0018769272020378066, 1.202522486580245]
,
[0.82815785035628164, 0.0071847611802335776, 0.17226935935994725, 0.24680054800013365, 1.2280429227515923]
,
[0.55468014442636804, 0.04844726528488761, 0.074669093941655343, 0.3799483919692869, 2.3157520760049994]
,
[0.85603162865577076, 0.010190325204698992, 0.14635589096917062, 0.00018691588785046728, 1.2673797230628077]
,
[0.55881837183305305, 0.048068057730781634, 0.06639403930381195, 0.3722541921910773, 2.291289872230647]
,
[0.55650701031519434, 0.047379164870780005, 0.075834025272625227, 0.3768812839567851, 2.2847828255276856]
,
[0.59736941845983627, 0.054964632904472815, 0.089651232352172761, 0.0002190940461192967, 2.291980379225357]
,
[0.55468014442636804, 0.04844726528488761, 0.074669093941655343, 0.3799483919692869, 2.3157520760049994]
,
[0.37385965430511475, 0.019136318061858774, 0.0017515265254845647, 0.002456248081056187, 2.1746841721523915]
,
[0.3755068478409902, 0.019166948350188812, 0.0045621553498242356, 0.4868705591597158, 2.1680040687479902]
,
[0.376117657056177, 0.020048016077051325, 0.004081551918441755, 0.48440424204616345, 2.20746211913412]
,
[0.18567611209815035, 0.0017735326711233123, 0.00026719643703200545, 0.37649076434123163, 1.5866887090683386]
,
[0.15935887794419157, 3.0968737461516311e-05, 4.6106803792004044e-06, 7.109594397639615e-05, 1.0723690004464064]
,
[0.1598493732922015, 9.6513614204532248e-05, 1.4807540465080871e-05, 0.020011435105774727, 1.130966420539851]
,
[0.15976502679964721, 9.179670697435723e-05, 1.1098997372160861e-05, 0.027888446215139442, 1.127590980529105]
,
[0.15948519514589277, 8.8904788108173233e-05, 3.0493405326069049e-07, 0.825754804580883, 1.1256719774569757]
,
[0.16617638537179313, 0.0020240604885197228, 3.5948671354276501e-05, 0.00017182868679926113, 1.7424826840700272]
,
[0.16617882105231332, 0.002010285330985506, 3.1650697838912209e-05, 0.00017161489617298782, 1.7390017992958084]
,
[0.16601904246228144, 0.001959487143766989, 3.2733987503779933e-05, 0.10968404829180581, 1.7271461688896599]
,
[0.16628339469915165, 0.0020643314471593802, 1.4502279324313873e-05, 0.14276914653343373, 1.7519319117125625]
,
[0.16629298316796565, 0.0020800819965552542, 1.9020907349023509e-05, 0.13840607699240376, 1.755817053262183]
,
[0.18572210382333143, 0.0018178104959919194, 0.0002453722722107162, 6.292672183242613e-05, 1.5959450271122788]
,
[0.78164051870269824, 0.051523793666842309, 0.015067726988898911, 4.814636494944632e-05, 1.818489926889651]
,
[0.18566012446433577, 0.0017919804956179246, 0.00018368826559889194, 0.3746835841076679, 1.590696751465318]
,
[0.1593593872646801, 3.0965616570412022e-05, 4.7608077176119086e-06, 0.013757065159432655, 1.072364982247259]
,
[0.15935971192682988, 3.4228786893989237e-05, 2.8175989802780335e-06, 0.011385902663771647, 1.0762239433773122]
,
[0.1593758710624088, 3.1730097257658988e-05, 6.5545372607421827e-06, 0.19480358030830433, 1.0732774861268992]
,
[0.15935651884191823, 3.2075768916173883e-05, 2.6894443902692268e-06, 0.011169712144620248, 1.0736994974496823]
,
[0.1593593872646801, 3.0965616570412022e-05, 4.7608077176119086e-06, 0.013757065159432655, 1.072364982247259]
,
[0.72806364396184653, 0.080927033958709829, 0.082024727906757688, 0.0003304829181641674, 2.282620340759594]
,
[0.34064008340950969, 0.031713563937392303, 0.0223935905703848, 0.5525150905432595, 3.191021756804023]
,
[0.34161716425171257, 0.032414962195661444, 0.023399763826767502, 0.5634559735427863, 3.228573480379]
,
[0.33995795036914717, 0.032291160309302944, 0.014503695651070611, 0.5517519130084575, 3.2425659662137543]
,
[0.53755813910874839, 0.12514260672326116, 0.047097530510313457, 0.0022522522522522522, 4.849281676080233]
,
[0.53892887245870857, 0.12723100136939183, 0.047871070696486759, 0.0003630422944273008, 4.914680204854179]
,
[0.52941013268525083, 0.12033870626971493, 0.044950934295866135, 0.00036251586006887804, 4.801391369341545]
,
[0.5153795221866847, 0.11396653431855266, 0.046028411270117815, 0.0017374383209396067, 4.797613736965006]
,
[0.55889931613495802, 0.13776801275023373, 0.054206231614929122, 0.0003675794890645102, 4.954346523167349]
,
[0.53892887245870857, 0.12723100136939183, 0.047871070696486759, 0.0003630422944273008, 4.914680204854179]
,
[0.53876191407701801, 0.12675358533640296, 0.048092146277654686, 0.0003630422944273008, 4.896575690597256]
,
[0.64579700029686937, 0.053345962571719745, 0.047671705312373282, 0.00021581957483543757, 2.1135534993967275]
,
[0.52907834506993823, 0.11839951044942501, 0.046693278117526091, 0.001802451333813987, 4.720197357775248]
,
[0.62431811267333093, 0.16822847351832676, 0.078460359627903944, 0.0002954864445593558, 4.830349593161275]
,
[0.52957671831590236, 0.1206620716356978, 0.044424337085019652, 0.00036251586006887804, 4.812745400588476]
,
[0.64778861076667615, 0.011264454903514588, 0.26034582337509793, 0.00017355085039916696, 1.3918887090929497]
,
[0.64767923033014785, 0.011511416466409427, 0.26619423461723268, 0.0001713355606956224, 1.3970897837418754]
,
[0.64175254514795532, 0.051344373338613858, 0.047562712202626603, 0.0015838339705079192, 2.091594563276403]
,
[0.74328372556577627, 0.069102582620664751, 0.082952746646336797, 0.0001621665450417579, 2.094372254494601]
,
[0.63983023392719118, 0.050957609005336219, 0.04065234770126492, 0.0002180787264202377, 2.0902782497935077]
,
[0.64175254514795532, 0.051344373338613858, 0.047562712202626603, 0.0015838339705079192, 2.091594563276403]
,
[0.39929495902359424, 0.088487529110910193, 0.022225937358985204, 0.0016210739614994933, 6.842658946475011]
,
[0.40318161986196532, 0.091372930642081962, 0.029342259032521321, 0.0016383370878558263, 6.991543657993919]
,
[0.40286945787178563, 0.092489700223200605, 0.029477042699685527, 0.0008298755186721991, 7.159524821606994]
,
[0.401527045553835, 0.092940206887656154, 0.022384335964308343, 0.0008262755629002272, 7.307506331212089]
,
[0.48221520941584561, 0.080925707098030486, 0.01508266157389335, 0.016811768237766436, 3.877246216887803]
,
[0.23300739937344839, 0.0081726649803679097, 0.00070589920573164966, 0.7233009708737864, 2.267880404181219]
,
[0.4889793426754816, 0.13379642486830962, 0.0079207484968624713, 0.0012550988390335738, 6.938143452247703]
,
[0.50805268679046123, 0.15157146566770596, 0.002286367854475147, 0.0015261350629530714, 7.558054436128668]
,
[0.50504588069443601, 0.14372144884265609, 0.002332370870321935, 0.4888972525404592, 7.020435652999047]
,
[0.49053398407596349, 0.13596678236015974, 0.0068673835378752004, 0.5062523683213338, 7.054927383254023]
,
[0.27047698059047881, 0.02400759815979293, 0.0042725763257732184, 0.1406003159557662, 3.6822411994354223]
,
[0.67217292360607472, 0.21411359416298198, 0.038240138048085716, 0.00030014256771966684, 5.418493234141116]
,
[0.66809561834310183, 0.20843134771175456, 0.055569614057154701, 0.0005965697240865026, 5.316112363334643]
,
[0.69764902288163122, 0.23441611695166623, 0.040989861350760971, 0.00030097817908201655, 5.535854638867057]
,
[0.69337536416934831, 0.23122440548075349, 0.039932976305992858, 0.0011285832518245428, 5.5253522283788445]
,
[0.48053616103332131, 0.078827555080480394, 0.014699769292604886, 0.00040342914775592535, 3.810804845605404]
,
[0.51893243284454049, 0.14486098229876093, 0.007011404157031503, 0.0013995801259622112, 6.503015780005906]
,
[0.51611281879296478, 0.14397569681830566, 0.0063953861901166996, 0.0024067388688327317, 6.552602133840095]
,
[0.52265570318341037, 0.14786059553298658, 0.021856594872657918, 0.002438599547117227, 6.567632701584826]
,
[0.30079480228240624, 0.022512205511218238, 0.00042758792096778651, 0.016516516516516516, 2.990535008572801]
,
[0.30656959740479811, 0.025225633729599333, 0.00052074639660009423, 0.014692653673163419, 3.1500163953105362]
,
[0.36561931104389456, 0.034065616542602442, 0.00073193209081989026, 0.5295319844676067, 3.0388637406298646]
,
[0.30523253105219622, 0.024888851231432006, 0.00049965741600376489, 0.014692653673163419, 3.1395734571173244]
,
[0.30228106501925794, 0.02294279475480349, 0.00029015539686061685, 0.016315633343221597, 3.0087064225809246]
,
[0.48449572183350859, 0.08057148632400099, 0.014649379545360155, 0.0008072653884964682, 3.8293932305935345]
,
[0.48696620229608523, 0.082309882547938931, 0.015050994484143265, 0.0008004802881729037, 3.8679773897921153]
,
[0.28412339537248588, 0.026648939499942827, 0.0040253434951236042, 0.652089407191448, 3.7009800669447657]
,
[0.28496156479277329, 0.02656759057204762, 0.0040076364850396805, 0.6479146459747818, 3.672807600295908]
,
[0.27750673534987835, 0.024513847513161952, 0.0040536738369991365, 0.6610337972166997, 3.589249226795383]
,
[0.23076358836711391, 0.0081276558884353922, 0.0011346229787721842, 0.004830917874396135, 2.2823193871783753]
,
[0.23009954177415121, 0.0067688972295314211, 0.00050627342206410546, 0.7085308056872038, 2.1131083582556887]
,
[0.74667089537876641, 0.017808021782196932, 0.00058715813729321711, 0.20097746402389358, 1.4352297290097118]
,
[0.46459021914407012, 0.015923283050662724, 0.0096104956720461029, 0.07748745012228087, 1.7457829468097172]
,
[0.46915422400128481, 0.016432642899682152, 0.019842029469490614, 0.07962922414422305, 1.75192535048515]
,
[0.46603526212803831, 0.014906446836800192, 0.034027862564102791, 0.00017277871366247678, 1.709953995454109]
,
[0.74667089537876641, 0.017808021782196932, 0.00058715813729321711, 0.20097746402389358, 1.4352297290097118]
,
[0.74667089537876641, 0.017808021782196932, 0.00058715813729321711, 0.20097746402389358, 1.4352297290097118]
,
[0.79130699331490484, 0.018730652999112182, 0.0025843522081647448, 0.18977700753966478, 1.4182461597025509]
,
[0.78526444941147622, 0.019630664985282237, 0.0014735307445837577, 0.19151016964319956, 1.43434362458419]
,
[1.2360091274851985, 0.11319166323186233, 0.037129035449204553, 0.13274704929414488, 1.7480015935515811]
,
[1.2379748172284306, 0.11372770880048684, 0.03880647583352842, 0.1327272446632812, 1.748796779378963]
,
[1.0687065973690613, 0.06124884507730273, 1.0261941877753638, 0.0006237784339002786, 1.6027241684534652]
,
[1.0719786963564104, 0.066016997091209076, 0.67377325492164508, 0.000951317367746205, 1.6304903448777237]
,
[1.0726105544893461, 0.060421845064782209, 0.68185690192755832, 0.0016887717274899085, 1.5946007712754786]
,
[0.46459021914407012, 0.015923283050662724, 0.0096104956720461029, 0.07748745012228087, 1.7457829468097172]
,
[0.46578048886757484, 0.014968271641939066, 0.034797974069617273, 0.009791711402190251, 1.7124764457384176]
,
[0.47125303755595782, 0.015662651510987502, 0.0092152255656893708, 0.00017202081451855674, 1.723199095190638]
             ]




############################ PREDICTION TEST 1 IMAGE ################
print("TRY IMAGE")
import numpy as np
from sklearn import svm, metrics
X = features
y = target
from sklearn.svm import SVC
C = 1000.0
clf = svm.SVC(kernel='rbf', C=C).fit(X, y)
#svm.SVC(kernel='linear', C=C).fit(X, y) #SVC()
#clf.fit(X, y)
print("predizione")

# fv belongs to class 8, but predict() returns 5
fv = [0.16666666666628771, 5.169878828456423e-26, 2.584939414228212e-22, 1.0, 1.0000000000027285]
print(fv)
print(clf.predict([fv]))




############### METRICS ##########


# The classifier above was fit on the whole dataset.


# Here we evaluate it on all samples after the first 26 (which were also used for training):
import matplotlib.pyplot as plt

expected = y[26:]
predicted = clf.predict(X[26:])
print("expected")
print(len(expected))
print("predicted")
print(len(predicted))

print "Classification report for classifier %s:\n%s\n" % (
    clf, metrics.classification_report(expected, predicted))
print "Confusion matrix:\n%s" % metrics.confusion_matrix(expected, predicted)

1 Answer


You train the model on the full dataset and then compute scores on a subset of that same training set, namely everything except the first 26 samples, which contain the entire set of class-1 samples.
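
You can check this directly with the target list from the posted code:

print(target[:26])            # nineteen 1s followed by seven 2s
print(target[26:].count(1))   # 0 -> class 1 has zero support in the evaluation above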

You cannot evaluate a model this way: you need to shuffle the data randomly and split it into a training set and a test set before fitting the model (otherwise the whole dataset is the training set and you have no held-out test set). If you do this:

import numpy as np
from sklearn import svm, metrics
from sklearn.cross_validation import train_test_split
from sklearn.svm import SVC

X = features
y = target

X_train, X_test, y_train, y_test = train_test_split(X, y,
        test_size=0.25, random_state=42)

C = 1000.0
clf = svm.SVC(kernel='rbf', C=C).fit(X_train, y_train)
y_predicted = clf.predict(X_test)

print "Classification report for classifier %s:\n%s\n" % (
    clf, metrics.classification_report(y_test, y_predicted))
print "Confusion matrix:\n%s" % metrics.confusion_matrix(y_test, y_predicted)

print "Predicting on 1 sample"
print "Input features:"
fv = [0.16666666666628771, 5.169878828456423e-26, 2.584939414228212e-22, 1.0, 1.0000000000027285]
print fv
print "Predicted class index:"
print clf.predict([fv])

you will get the following output:

Classification report for classifier SVC(C=1000.0, cache_size=200, class_weight=None, coef0=0.0, degree=3,
  gamma=0.0, kernel=rbf, max_iter=-1, probability=False, shrinking=True,
  tol=0.001, verbose=False):
             precision    recall  f1-score   support

          1       0.50      0.25      0.33         4
          2       0.75      1.00      0.86         6
          3       1.00      1.00      1.00         2
          4       0.75      1.00      0.86         3
          5       1.00      0.88      0.93         8
          6       1.00      1.00      1.00         5
          7       0.75      0.75      0.75         8
          8       1.00      1.00      1.00         3

avg / total       0.84      0.85      0.83        39


Confusion matrix:
[[1 1 0 0 0 0 2 0]
 [0 6 0 0 0 0 0 0]
 [0 0 2 0 0 0 0 0]
 [0 0 0 3 0 0 0 0]
 [0 0 0 1 7 0 0 0]
 [0 0 0 0 0 5 0 0]
 [1 1 0 0 0 0 6 0]
 [0 0 0 0 0 0 0 3]]
Predicting on 1 sample
Input features:
[0.1666666666662877, 5.169878828456423e-26, 2.584939414228212e-22, 1.0, 1.0000000000027285]
Predicted class index:
[5]

Of course this is a single random train/test split, and because your dataset is very small, the score estimate has high variance. You can estimate the expected mean score for this model class and set of parameters with repeated cross-validation:

from sklearn.cross_validation import ShuffleSplit
from sklearn.cross_validation import cross_val_score
from scipy.stats import sem

params = dict(kernel='rbf', C=1000)
clf = svm.SVC(**params)
X = np.asarray(X)  # `features` above is a plain Python list; ShuffleSplit needs the sample count
cv = ShuffleSplit(X.shape[0], n_iter=50)
cv_scores = cross_val_score(clf, X, y, cv=cv)

and then report the mean score and its variability:

print "Cross Validated test scores for SVC with params {0} on full dataset:".format(params)
print "Mean: {0:.3} +/-{1:.3}".format(np.mean(cv_scores), sem(cv_scores))
print "Standard deviation: {0:.3}".format(np.std(cv_scores))

This outputs:

Cross Validated test scores for SVC with params {'kernel': 'rbf', 'C': 1000} on full dataset:
Mean: 0.834 +/-0.0125
Standard deviation: 0.0872

So you can reasonably expect around 83% prediction accuracy overall (or slightly better, since the CV procedure tends to underestimate it a bit).

If you want to improve significantly on this level of performance, my first recommendation would be to collect more labeled samples to build a larger dataset.

My second recommendation would be to generate more labeled data from what you already have by applying small perturbations to the original images (e.g. small translations, rotations, and a bit of uniform random noise) and then extracting features from these additional samples.
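
For illustration only, here is a minimal perturbation sketch with OpenCV (which the posted code already imports as cv2); the shift, angle and noise ranges are arbitrary assumptions, and you would re-run your feature extraction on each perturbed image and append the result to features and target:

import cv2
import numpy as np

def perturb(image, max_shift=2, max_angle=5.0, noise_level=5):
    # Return a slightly shifted, rotated and noised copy of `image`.
    h, w = image.shape[:2]
    tx, ty = np.random.randint(-max_shift, max_shift + 1, size=2)
    shifted = cv2.warpAffine(image, np.float32([[1, 0, tx], [0, 1, ty]]), (w, h))
    rot = cv2.getRotationMatrix2D((w / 2.0, h / 2.0),
                                  np.random.uniform(-max_angle, max_angle), 1.0)
    rotated = cv2.warpAffine(shifted, rot, (w, h))
    noise = np.random.uniform(-noise_level, noise_level, rotated.shape)
    return np.clip(rotated.astype(np.float64) + noise, 0, 255).astype(np.uint8)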

Edit, to answer the follow-up questions:

I also left out 8/10 image samples because I don't think they belong to any class.

You should probably add an extra class named "other" for all images that do not belong to any of the previous classes.

Should I add a new class for each class and create the new samples with small translations and rotations?

No, the goal is to add more samples to each existing class by building new samples from the existing ones, so as to improve the classification accuracy on the existing classes.

I got this error: TypeError: __init__() got an unexpected keyword argument 'n_iter' at the line cv = ShuffleSplit(X.shape[0], n_iter=50)

n_iter is the new name introduced in scikit-learn 0.13; in 0.12 it was called n_iterations:

http://scikit-learn.org/0.12/modules/generated/sklearn.cross_validation.ShuffleSplit.html
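
A small compatibility sketch (an addition for illustration, not part of the original answer) that runs under either version:

try:
    cv = ShuffleSplit(len(X), n_iter=50)          # scikit-learn 0.13 and later
except TypeError:
    cv = ShuffleSplit(len(X), n_iterations=50)    # scikit-learn 0.12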

answered 2013-02-20 09:51