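I am using deeplearning4j and cannot work out how to get the paragraph text that corresponds to a classified vector from the neural network.
I can only get the classification rate.
Here is my code: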
    public static void main(String[] args) throws Exception {
        ClassPathResource resource = new ClassPathResource("paravec/recortes");
        LabelAwareIterator iterator = new FileLabelAwareIterator.Builder()
                .addSourceFolder(resource.getFile())
                .build();

        TokenizerFactory t = new DefaultTokenizerFactory();
        t.setTokenPreProcessor(new CommonPreprocessor());

        ParagraphVectors paragraphVectors = new ParagraphVectors.Builder()
                .learningRate(0.025).minLearningRate(0.001).batchSize(1000)
                .epochs(10).iterate(iterator).trainWordVectors(true)
                .tokenizerFactory(t).build();
        paragraphVectors.fit();

        ClassPathResource unlabeledResource = new ClassPathResource("paravec/caderno");
        FileLabelAwareIterator unlabeledIterator = new FileLabelAwareIterator.Builder()
                .addSourceFolder(unlabeledResource.getFile())
                .build();

        MeansBuilder meansBuilder = new MeansBuilder(
                (InMemoryLookupTable<VocabWord>) paragraphVectors.getLookupTable(), t);
        LabelSeeker seeker = new LabelSeeker(
                iterator.getLabelsSource().getLabels(),
                (InMemoryLookupTable<VocabWord>) paragraphVectors.getLookupTable());

        while (unlabeledIterator.hasNextDocument()) {
            LabelledDocument document = unlabeledIterator.nextDocument();
            // how to get text paragraph?
            INDArray documentAsCentroid = meansBuilder.documentAsVector(document);
        }
    }
Thanks! Renan.