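I want to compute the cosine similarity between two sentences of different lengths using word2vec with the Google News corpus. The approach below gives me the cosine similarity when both sentences have the same number of words, but it throws an error when the lengths differ.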

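import com.google.common.base.Splitter;
import java.util.Collection;
import org.deeplearning4j.models.embeddings.wordvectors.WordVectors;
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.ops.transforms.Transforms;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestMethod;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class word2vecsentence {
    // Word2VecModel is my own bean that wraps the loaded Google News model.
    @Autowired
    Word2VecModel wordVector;

    @RequestMapping(value = "/sentsimilarity", method = RequestMethod.POST)
    public double cosineSimForSentence(@RequestParam("sent1") String sentence1,
                                       @RequestParam("sent2") String sentence2) {
        // Split each sentence into words on single spaces.
        Collection<String> label1 = Splitter.on(' ').splitToList(sentence1);
        Collection<String> label2 = Splitter.on(' ').splitToList(sentence2);

        WordVectors vector = wordVector.getModel();
        double consin = 0;
        try {
            // Average the word vectors of each sentence into one vector.
            INDArray array1 = vector.getWordVectorsMean(label1);
            System.out.println(array1);
            INDArray array2 = vector.getWordVectorsMean(label2);
            System.out.println(array2);
            // Cosine similarity between the two mean vectors.
            consin = Transforms.cosineSim(array1, array2);
            return consin;
        } catch (Exception e) {
            // On any failure, log the stack trace and return 0.
            e.printStackTrace();
            return consin;
        }
    }
}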
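
For comparison, here is a minimal plain-Java sketch of the mean-vector idea the code above relies on, written without DL4J. Everything in it is an assumption made for illustration: the class `MeanVectorSimilarity`, its helper methods, and the two-dimensional toy embeddings are hypothetical and stand in for the real Google News vectors. It only shows that averaging produces a fixed-size vector per sentence, so sentence length by itself does not have to matter; it does not explain why the DL4J call fails.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class MeanVectorSimilarity {

    // Split on runs of whitespace and drop empty tokens.
    static List<String> tokenize(String sentence) {
        List<String> tokens = new ArrayList<>();
        for (String t : sentence.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Average the vectors of all known tokens; unknown tokens are skipped.
    // Returns null when no token has a vector.
    static double[] meanVector(List<String> tokens, Map<String, double[]> embeddings, int dim) {
        double[] mean = new double[dim];
        int known = 0;
        for (String token : tokens) {
            double[] v = embeddings.get(token);
            if (v == null) {
                continue;
            }
            for (int i = 0; i < dim; i++) {
                mean[i] += v[i];
            }
            known++;
        }
        if (known == 0) {
            return null;
        }
        for (int i = 0; i < dim; i++) {
            mean[i] /= known;
        }
        return mean;
    }

    // Cosine similarity of two equal-length vectors; 0 if either has zero norm.
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            return 0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Hypothetical 2-dimensional embeddings, for illustration only.
    static Map<String, double[]> toyEmbeddings() {
        Map<String, double[]> e = new HashMap<>();
        e.put("king", new double[] {1.0, 0.0});
        e.put("queen", new double[] {0.9, 0.1});
        e.put("apple", new double[] {0.0, 1.0});
        return e;
    }

    public static void main(String[] args) {
        Map<String, double[]> e = toyEmbeddings();
        // Two sentences with different numbers of words.
        double[] a = meanVector(tokenize("king queen"), e, 2);
        double[] b = meanVector(tokenize("king  queen apple unknownword"), e, 2);
        if (a == null || b == null) {
            System.out.println("No known words in one of the sentences");
        } else {
            System.out.println(cosine(a, b));
        }
    }
}
```

Skipping unknown words and checking for a null mean before computing the similarity is one possible design choice here; a real model may handle out-of-vocabulary words differently.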

Can anyone help me solve this problem?
