
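I am working on a project in which I have to extract a user's facial expression, such as sad or happy (one user at a time, from a webcam).

My approach to classifying facial expressions is:

  • Detect the face in the image with OpenCV
  • Get the facial landmarks using ASM and stasm

[Image: facial landmarks]

I am now trying to do the facial expression classification.

Is an SVM a good choice? If so, how do I get started with it?

How would I use these landmarks to train an SVM for each emotion?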


3 Answers


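Yes, SVMs have been shown many times to perform well on this task. There are dozens (if not hundreds) of papers describing such systems.

For example:

Some basic resources on SVMs themselves (book titles, software links, etc.) are available at http://www.support-vector-machines.org/

If you are only interested in using them rather than understanding them, you can pick up one of the basic libraries: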

answered 2013-09-05T17:18:33.717

If you are already using OpenCV, I suggest you use its built-in SVM implementation. Training, saving, and loading in Python look as follows; C++ has a corresponding API that does the same in about the same amount of code. It also has 'train_auto' to search for the best parameters.

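import numpy as np
import cv2

# Note: this uses the OpenCV 2.4.x Python API (cv2.SVM).

# 4 random samples with 5 features each, and 4 random binary labels
samples = np.array(np.random.random((4,5)), dtype = np.float32)
labels = np.array(np.random.randint(0,2,4), dtype = np.float32)

svm = cv2.SVM()
svmparams = dict( kernel_type = cv2.SVM_LINEAR,
                       svm_type = cv2.SVM_C_SVC,
                       C = 1 )

svm.train(samples, labels, params = svmparams)

# predict each training sample back, one row at a time
testresult = np.float32( [svm.predict(s) for s in samples])

print(samples)
print(labels)
print(testresult)

# save the model, then load it into a fresh SVM object
# (load() fills the object it is called on)
svm.save('model.xml')
loaded = cv2.SVM()
loaded.load('model.xml')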

Output (the samples and labels are random, so the values vary from run to run):

#print samples
[[ 0.24686454  0.07454421  0.90043277  0.37529686  0.34437731]
 [ 0.41088378  0.79261768  0.46119651  0.50203663  0.64999193]
 [ 0.11879266  0.6869216   0.4808321   0.6477254   0.16334397]
 [ 0.02145131  0.51843268  0.74307418  0.90667248  0.07163303]]
#print labels
[ 0.  1.  1.  0.]
#print testresult
[ 0.  1.  1.  0.]    

So you provide the n flattened shape models as samples, along with n labels, and you are good to go. You probably do not even need the ASM part: just apply some orientation-sensitive filters such as Sobel or Gabor, concatenate the resulting matrices, flatten them, and feed them directly to the SVM. You can probably get around 70-90% accuracy.
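As a rough illustration of the "filter, concatenate, flatten" idea, here is a minimal sketch that computes horizontal and vertical Sobel responses and joins them into one feature vector. It uses plain NumPy instead of OpenCV's own filtering functions so the arithmetic is visible; the helper names filter3x3 and orientation_features and the synthetic 8x8 "face" are assumptions made for this example. In practice each face crop would first be converted to grayscale and resized to a fixed size so that all feature vectors have the same length.

```python
import numpy as np

# Standard 3x3 Sobel kernels for horizontal and vertical gradients.
SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=np.float32)
SOBEL_Y = SOBEL_X.T


def filter3x3(image, kernel):
    """Correlate a 2-D image with a 3x3 kernel, keeping only the 'valid' region."""
    h, w = image.shape
    out = np.zeros((h - 2, w - 2), dtype=np.float32)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * image[dy:dy + h - 2, dx:dx + w - 2]
    return out


def orientation_features(face):
    """Concatenate the flattened Sobel-x and Sobel-y responses into one vector."""
    face = face.astype(np.float32)
    gx = filter3x3(face, SOBEL_X)
    gy = filter3x3(face, SOBEL_Y)
    return np.concatenate([gx.ravel(), gy.ravel()])


# Hypothetical 8x8 "face crop" with a single vertical edge,
# standing in for a real grayscale crop resized to a fixed size.
face = np.zeros((8, 8), dtype=np.float32)
face[:, 4:] = 1.0

features = orientation_features(face)
print(features.shape)
```

Each such vector becomes one row of the samples matrix passed to the SVM. Gabor filters would slot into the same place as the Sobel kernels, at the cost of more filters and therefore longer feature vectors.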

As someone said, CNNs are an alternative to SVMs. Here are some links that implement LeNet-5. So far, I find SVMs much simpler to get started with.

https://github.com/lisa-lab/DeepLearningTutorials/

http://www.codeproject.com/Articles/16650/Neural-Network-for-Recognition-of-Handwritten-Digi

-edit-

Landmarks are just n (x,y) vectors, right? So why not put them into an array of size 2n and simply feed them directly to the code above?

For example, 3 training samples of 4 landmarks each, (0,0),(10,10),(50,50),(70,70):

samples = [[0,0,10,10,50,50,70,70],
[0,0,10,10,50,50,70,70],
[0,0,10,10,50,50,70,70]]

labels=[0.,1.,2.]

0=happy

1=angry

2=disgust
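Raw pixel coordinates change when the face moves or sits closer to the camera, so one optional preprocessing step (an assumption here, not something the answer prescribes) is to center the landmarks on their centroid and divide by their overall spread before flattening. A minimal sketch in plain Python, where landmarks_to_features is a hypothetical helper name:

```python
import math


def landmarks_to_features(landmarks):
    """Flatten n (x, y) landmarks into 2n values, centered and scale-normalized."""
    n = len(landmarks)
    cx = sum(x for x, _ in landmarks) / n
    cy = sum(y for _, y in landmarks) / n
    centered = [(x - cx, y - cy) for x, y in landmarks]
    # Root-mean-square distance from the centroid serves as the face "size".
    scale = math.sqrt(sum(x * x + y * y for x, y in centered) / n)
    if scale == 0:
        scale = 1.0  # degenerate case: all landmarks coincide
    features = []
    for x, y in centered:
        features.extend([x / scale, y / scale])
    return features


# Label coding taken from the answer above.
EMOTIONS = {0: "happy", 1: "angry", 2: "disgust"}

face = [(0, 0), (10, 10), (50, 50), (70, 70)]
# The same shape, twice as large and moved elsewhere in the frame.
shifted_and_doubled = [(2 * x + 100, 2 * y + 40) for x, y in face]

a = landmarks_to_features(face)
b = landmarks_to_features(shifted_and_doubled)
print(len(a))
print(all(abs(p - q) < 1e-9 for p, q in zip(a, b)))
```

The two inputs produce the same feature vector, so the SVM sees the shape of the face rather than its position or size. This sketch does not correct for head rotation, and real training data needs different landmark positions for different emotions, unlike the three identical sample rows in the toy example above.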

answered 2013-09-05T21:22:39.583

You can check the code to see how this is done with an SVM.

You can find the algorithm explained here.

answered 2013-09-08T15:53:35.097