
I am trying to classify images using Fisher vectors, as described in: Sánchez, J., Perronnin, F., Mensink, T., & Verbeek, J. (2013). Image Classification with the Fisher Vector: Theory and Practice. International Journal of Computer Vision, 105(3), 222–245. http://doi.org/10.1007/s11263-013-0636-x

To try out and evaluate this approach I want to use the OpenIMAJ library, since according to its JavaDoc it uses exactly this method to create Fisher vectors, but I can't get it to work. I have tried creating the SIFT feature vectors with both OpenIMAJ and OpenCV, and with both I get the same error: the EM algorithm was never able to compute a valid likelihood given the initial parameters. Try different initialization parameters (or increasing n_init) or check for degenerate data.

I would be grateful if anyone who has already used this approach could help. I have created a small example that illustrates the problem:

    // load an image
    LocalFeatureList<Keypoint> findFeatures = new DoGSIFTEngine()
            .findFeatures(ImageUtilities
                    .readMBF(
                            new URL(
                                    "http://upload.wikimedia.org/wikipedia/en/2/24/Lenna.png"))
                    .flatten());

    // convert to double array
    double[][] data = new double[findFeatures.size()][findFeatures.get(0)
            .getDimensions()];
    for (int i = 0; i < findFeatures.size(); i++) {
        data[i] = findFeatures.get(i).getFeatureVector().asDoubleVector();
    }

    GaussianMixtureModelEM gaussianMixtureModelEM = new GaussianMixtureModelEM(
            64, CovarianceType.Diagonal);

    // error is thrown here
    MixtureOfGaussians estimate = gaussianMixtureModelEM.estimate(data);

1 Answer


My guess is that the problem is that you're not using enough data. To train a GMM you should use many samples drawn from your whole training corpus, rather than from a single image. Note also that you should apply PCA to reduce the dimensionality of the features before learning the GMM (this isn't strictly required, but it helps performance considerably, as shown in the paper you linked).
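To make the failure mode concrete, here is a minimal, hedged sketch in plain Java (no OpenIMAJ; the class name, method name, and thresholds are my own, not part of any library) of the kind of sanity check that catches the two usual causes of that EM error: too few samples relative to the number of mixture components, and a feature dimension with (near-)zero variance:

```java
// Sanity-check feature data before fitting a diagonal-covariance GMM.
// EM tends to fail or degenerate when a dimension has (near-)zero variance
// or when there are too few samples per mixture component.
public class GmmDataCheck {

    /** Returns true if the data looks safe for a k-component diagonal GMM. */
    static boolean looksTrainable(double[][] data, int k) {
        final int n = data.length;
        final int d = data[0].length;

        // Heuristic: want many samples per component (threshold is arbitrary)
        if (n < 10 * k)
            return false;

        // Reject any dimension whose variance is effectively zero
        for (int j = 0; j < d; j++) {
            double mean = 0;
            for (final double[] row : data)
                mean += row[j];
            mean /= n;

            double var = 0;
            for (final double[] row : data)
                var += (row[j] - mean) * (row[j] - mean);
            var /= n;

            if (var < 1e-10)
                return false;
        }
        return true;
    }
}
```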

Once you've done that, you can use the OpenIMAJ FisherVector class to actually compute the vectors from the SIFT points of each image.
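For intuition about what the encoding computes, here is a hedged, plain-Java sketch of just the first-order (mean-gradient) part of the Fisher encoding for a diagonal-covariance GMM. The class and method names are hypothetical, not OpenIMAJ API, and the real FisherVector constructed with (gmm, true, true) also includes second-order terms and the "improved" power/L2 normalisation from the paper:

```java
// Mean-gradient part of the Fisher encoding for a diagonal-covariance GMM.
public class FisherSketch {

    /** log N(x | mu, diag(var)) for one component. */
    static double logDensity(double[] x, double[] mu, double[] var) {
        double s = 0;
        for (int j = 0; j < x.length; j++) {
            final double d = x[j] - mu[j];
            s += -0.5 * Math.log(2 * Math.PI * var[j]) - 0.5 * d * d / var[j];
        }
        return s;
    }

    /**
     * For each component k and dimension j:
     * (1 / (N * sqrt(w_k))) * sum_t gamma_t(k) * (x_tj - mu_kj) / sigma_kj
     */
    static double[] encode(double[][] X, double[] w, double[][] mu, double[][] var) {
        final int K = w.length, D = mu[0].length, N = X.length;
        final double[] fv = new double[K * D];

        for (final double[] x : X) {
            // Soft assignments (posteriors), via log-sum-exp for stability
            final double[] logp = new double[K];
            double max = Double.NEGATIVE_INFINITY;
            for (int k = 0; k < K; k++) {
                logp[k] = Math.log(w[k]) + logDensity(x, mu[k], var[k]);
                max = Math.max(max, logp[k]);
            }
            double sum = 0;
            for (int k = 0; k < K; k++)
                sum += Math.exp(logp[k] - max);

            // Accumulate the normalised first-order differences
            for (int k = 0; k < K; k++) {
                final double gamma = Math.exp(logp[k] - max) / sum;
                for (int j = 0; j < D; j++)
                    fv[k * D + j] += gamma * (x[j] - mu[k][j]) / Math.sqrt(var[k][j]);
            }
        }

        for (int k = 0; k < K; k++)
            for (int j = 0; j < D; j++)
                fv[k * D + j] /= N * Math.sqrt(w[k]);
        return fv;
    }
}
```

A useful property for checking this: with a single component whose mean equals the sample mean, the encoding is exactly zero.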

As an aside: when you get to doing classification, you'll almost certainly want to use a DenseSIFT variant rather than DoG-SIFT if you want any decent performance.

The following example code builds FisherVectors from the first 100 images of the UKBench dataset:

    //Load features from disk
    final List<MemoryLocalFeatureList<FloatKeypoint>> data = new ArrayList<MemoryLocalFeatureList<FloatKeypoint>>();
    final List<FloatKeypoint> allKeys = new ArrayList<FloatKeypoint>();

    for (int i = 0; i < 100; i++) {
        final MemoryLocalFeatureList<FloatKeypoint> tmp = FloatKeypoint.convert(MemoryLocalFeatureList.read(
                new File(String.format("/Users/jsh2/Data/ukbench/sift/ukbench%05d.jpg", i)), Keypoint.class));
        data.add(tmp);
        allKeys.addAll(tmp);
    }

    //randomise their order
    Collections.shuffle(allKeys);

    //sample 1000 of them to learn the PCA basis with 64 dims
    final double[][] sample128 = new double[1000][];
    for (int i = 0; i < sample128.length; i++) {
        sample128[i] = ArrayUtils.convertToDouble(allKeys.get(i).vector);
    }

    System.out.println("Performing PCA " + sample128.length);
    final ThinSvdPrincipalComponentAnalysis pca = new ThinSvdPrincipalComponentAnalysis(64);
    pca.learnBasis(sample128);

    //project the 1000 training features by the basis (for computing the GMM)
    final double[][] sample64 = pca.project(new Matrix(sample128)).getArray();

    //project all the features by the basis, reducing their dimensionality
    System.out.println("Projecting features");
    for (final MemoryLocalFeatureList<FloatKeypoint> kpl : data) {
        for (final FloatKeypoint kp : kpl) {
            kp.vector = ArrayUtils.convertToFloat(pca.project(ArrayUtils.convertToDouble(kp.vector)));
        }
    }

    //Learn the GMM with 128 components
    System.out.println("Learning GMM " + sample64.length);
    final GaussianMixtureModelEM gmmem = new GaussianMixtureModelEM(128, CovarianceType.Diagonal);
    final MixtureOfGaussians gmm = gmmem.estimate(sample64);

    //build the fisher vector representations
    final FisherVector<float[]> fisher = new FisherVector<float[]>(gmm, true, true);

    int i = 0;
    final double[][] fvs = new double[100][];
    for (final MemoryLocalFeatureList<FloatKeypoint> kpl : data) {
        fvs[i++] = fisher.aggregate(kpl).asDoubleVector();
    }
answered 2015-05-13T07:22:19.937