apache-spark - PySpark has no getVectors method available for Word2VecModel

Question

I am trying to call the getVectors() method from PySpark on Spark 1.2.0, but PySpark reports:

input.cache()
word2vec = Word2Vec()
model = word2vec.fit(input)
vector = model.getVectors()

AttributeError: 'Word2VecModel' object has no attribute 'getVectors'

So is using Scala/Java the only way to access it, or is there something else I can do?

score 0 · Accepted Answer

getVectors was added to PySpark in version 1.4.
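If the same script has to run on both older and newer clusters, one option is to check for the method at runtime instead of assuming a version. The helper below is only a sketch: `word_vectors_or_none` is a hypothetical name, and it relies on nothing beyond the fact that `getVectors` is missing from the model object on older PySpark builds.

```python
def word_vectors_or_none(model):
    """Return model.getVectors() if this PySpark build exposes it, else None.

    `model` is assumed to be a fitted pyspark.mllib.feature.Word2VecModel.
    """
    get_vectors = getattr(model, "getVectors", None)
    if get_vectors is None:
        # Older PySpark (such as the 1.2.0 in the question): the method is absent.
        return None
    # PySpark 1.4 and later: a map-like object of word -> vector.
    return get_vectors()
```

On Spark 1.2.0 this returns None instead of raising AttributeError, so the caller can choose a fallback or upgrade.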

Answered 2020-10-27T22:08:16.520

score 0 · Accepted Answer

I tried to put together a reproducible example using a toy file, running on Spark 1.2.

# cat data.txt
crazy crazy fox jumped
crazy fox jumped
fox is fast
fox is smart
dog is smart

>> lines = sc.textFile('data.txt', 1);
>> lines.collect()
[
 u'crazy crazy fox jumped', 
 u'crazy fox jumped', 
 u'fox is fast', 
 u'fox is smart', 
 u'dog is smart'
]

from pyspark.mllib.feature import Word2Vec       
model = Word2Vec().fit(lines)

Now, if I run dir(model), the output is:

['__class__',
 '__del__',
 '__delattr__',
 '__dict__',
 '__doc__',
 '__format__',
 '__getattribute__',
 '__hash__',
 '__init__',
 '__module__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 '_java_model',
 '_sc',
 'call',
 'findSynonyms',
 'transform']

Only the last 3 of these are public member methods, and getVectors() is not among them.
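Since transform does appear in that list, a possible workaround on 1.2 is to look up each word individually and build the word-to-vector map by hand. This is a rough sketch, not a drop-in replacement for getVectors: `vectors_via_transform` is a hypothetical helper, and it assumes that transform raises an exception for words that did not make it into the model's vocabulary.

```python
def vectors_via_transform(model, words):
    """Build a {word: vector} dict by calling model.transform on each word.

    `model` is assumed to be a fitted Word2VecModel exposing transform(word).
    Words for which transform raises (assumed to mean "not in the vocabulary")
    are skipped.
    """
    vectors = {}
    for word in set(words):
        try:
            vectors[word] = model.transform(word)
        except Exception:
            # Assumption: an error here means the word is out of vocabulary.
            continue
    return vectors
```

With the toy file above, the word list could be collected with something like `lines.flatMap(lambda line: line.split()).distinct().collect()` and passed in as `words`. Each lookup is a separate call into the JVM, so this only suits small vocabularies.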
