I am doing the following:

import spacy

nlp = spacy.load("en")

doc = nlp('Hello Stack Over Flow, my name is Steve')

doc.vector:

In [1]: doc = nlp('Hello Stack Over Flow, my name is Steve')

In [2]: doc.vector
Out[2]: 
array([ 1.67874452e-02,  1.43885329e-01, -1.64147541e-01, -3.52525562e-02,
        1.71078995e-01,  5.81666678e-02,  1.42294103e-02, -1.58536658e-01,
       -1.17119223e-01,  1.00338888e+00, -1.03455082e-01,  5.80027774e-02,
        5.08872233e-02, -2.64734793e-02, -4.76809964e-02, -3.61649990e-02,
       -4.25985567e-02,  4.86545563e-01, -5.22996634e-02,  2.66118869e-02,
       -7.14791119e-02,  2.33504437e-02, -1.01438001e-01,  1.78358995e-03,
        6.41188920e-02, -1.93965547e-02, -1.72182247e-02, -4.99197766e-02,
        3.82994451e-02,  2.89904438e-02,  1.10834874e-01,  1.07230783e-01,
        1.72666041e-03,  9.85269994e-02, -2.64622234e-02,  1.47332232e-02,
        1.49853658e-02, -3.25594470e-02, -2.28943750e-02, -6.28201067e-02,
       -4.13866527e-03,  4.12439965e-02, -1.09200180e-03, -3.77365127e-02,
        3.02788876e-02, -2.47912239e-02, -3.86282206e-02, -8.49756673e-02,
        8.79433304e-02, -7.35666696e-03, -2.35625561e-02,  1.29868105e-01,
       -8.24742168e-02,  3.79751101e-02,  6.52077794e-03,  4.12433175e-03,
       -4.44555469e-03, -8.54532197e-02,  4.30566669e-02, -4.90945578e-02,
        1.08687999e-02, -3.58653292e-02,  3.19277793e-02,  1.70548886e-01,
        7.04367757e-02, -1.03306666e-01, -6.25603348e-02, -4.16669573e-05,
       -9.90156457e-03,  4.87144403e-02, -6.59128875e-02,  2.21944507e-03,
        6.23853356e-02, -1.16886329e-02, -2.20711138e-02,  1.35971338e-01,
        5.85511066e-02, -2.78507806e-02, -4.42699976e-02,  1.22686662e-01,
       -4.96295579e-02,  8.47733300e-03, -1.72136649e-02,  3.73593345e-02,
        1.38313353e-01, -1.81285888e-01,  8.07836726e-02, -1.01186670e-01,
        1.90296680e-01, -8.37400090e-03, -4.79855575e-02,  4.62987460e-02,
        4.97333193e-03,  1.08253332e-02,  1.37178123e-01, -4.36927788e-02,
       -9.02644824e-03,  2.52826661e-02, -2.60283332e-02,  7.33327791e-02,
       -4.21555527e-02, -9.45088938e-02, -2.36399993e-02, -2.59645544e-02,
       -1.17972204e-02, -7.21249953e-02, -1.62978880e-02,  4.46572453e-02,
        8.05888604e-03,  1.73073336e-02, -1.11245394e-01, -1.35631096e-02,
        4.26412188e-02, -1.24742221e-02, -4.93782237e-02, -3.84650044e-02,
        9.32500139e-03, -2.58344412e-02,  5.39288903e-03, -2.51024440e-02,
       -1.68177821e-02,  1.81681886e-02,  6.95144460e-02,  5.96744493e-02,
        1.28178876e-02,  8.18611085e-02,  2.03688871e-02, -1.45592675e-01,
       -2.97091678e-02,  1.67966553e-03,  2.56901123e-02, -1.57507751e-02,
       -3.29821557e-02,  3.69144455e-02,  2.69458871e-02, -7.87097737e-02,
       -3.22544426e-02,  9.35557822e-04,  2.51506642e-02, -1.39920013e-02,
       -5.63631117e-01,  1.28184333e-01,  8.25011209e-02,  4.69026715e-02,
       -2.58401129e-02,  3.11454497e-02,  7.81277791e-02, -1.18433349e-02,
        2.19431128e-02,  2.38199951e-03, -2.19482221e-02,  5.75609989e-02,
        1.32304668e-01,  4.28974479e-02, -1.32128010e-02,  4.54772264e-02,
       -9.00077820e-02, -7.34564438e-02, -8.14672261e-02, -5.10835573e-02,
       -3.27358916e-02,  2.09213328e-02,  5.85612208e-02, -2.49340013e-02,
       -1.03430830e-01, -1.28346771e-01,  4.52880040e-02,  5.96577907e-03,
        1.12773672e-01, -3.90797779e-02, -5.79966642e-02,  4.97789842e-05,
        2.49000057e-03, -2.88800001e-02, -9.96003374e-02,  3.41123343e-02,
       -3.62301096e-02, -7.10571110e-02, -5.67906946e-02,  4.61289100e-03,
        7.72120059e-02, -1.36105552e-01, -6.25717789e-02, -8.04037750e-02,
        2.12122276e-02, -6.30133413e-03, -9.87700000e-02,  6.31399453e-02,
       -8.64481106e-02, -4.26407792e-02, -8.36099982e-02,  1.07030040e-02,
       -1.34339988e-01,  6.82333438e-03,  5.62012270e-02,  6.89233318e-02,
        5.61566688e-02, -9.32652280e-02,  6.18273281e-02,  1.12723336e-01,
       -1.04766667e-01, -2.15716790e-02, -1.15266666e-01,  4.57017794e-02,
        7.47987852e-02, -9.02220607e-04,  7.75654465e-02, -2.66306698e-02,
        1.93627775e-02, -4.89100069e-03, -1.43213451e-01, -6.52845576e-02,
        1.64663326e-02, -5.07618897e-02, -1.49422223e-02,  4.21274304e-02,
        1.06691113e-02, -5.97029589e-02, -1.20738111e-01, -1.61822215e-02,
       -5.95551059e-02,  3.67141105e-02,  2.88833342e-02,  5.24356700e-02,
        7.51844468e-03, -3.79579999e-02,  9.96864438e-02,  1.28289998e-01,
        1.56755541e-02, -1.55926663e-02, -4.89732213e-02,  2.24273317e-02,
       -9.15533304e-03,  7.32631087e-02, -7.48946667e-02, -1.15108885e-01,
       -5.56773357e-02, -8.49866867e-03, -3.00188921e-02,  3.55113335e-02,
       -4.22161110e-02,  7.19971135e-02,  3.67489979e-02, -1.00055551e-02,
        7.52926618e-02, -1.43726662e-01, -4.08722041e-03, -1.49663329e-01,
        1.41400262e-03,  5.52397817e-02,  8.86320025e-02, -7.44862184e-02,
       -3.23222089e-03,  3.30205560e-02,  3.77681069e-02,  6.58650026e-02,
        2.83081792e-02, -3.24210003e-02,  1.93070006e-02,  5.67157790e-02,
        6.17166609e-02,  1.09540010e-02,  4.71896678e-02,  7.68444464e-02,
       -2.51592230e-02, -4.28744499e-03, -2.40004435e-02,  3.28795537e-02,
        1.25606894e-01, -6.05716556e-02,  5.52507788e-02, -2.12161113e-02,
       -8.45399946e-02, -7.95067847e-02, -1.33965556e-02, -5.02544455e-02,
       -3.03339995e-02,  1.19719980e-02,  6.15093298e-02,  1.11455554e-02,
        1.24445252e-01,  5.54273315e-02,  1.28475904e-01, -9.19478834e-02,
       -2.29498874e-02, -4.18815538e-02,  5.02915531e-02, -1.14721097e-02,
        1.06602885e-01, -8.45602229e-02, -4.17976640e-02,  1.39088994e-02,
       -2.19033333e-03,  7.99388885e-02,  1.08606648e-02, -1.27933361e-02,
       -2.84678000e-03, -2.97433343e-02, -8.61347839e-02,  9.06177703e-03],
      dtype=float32)

But when I run the following, I get:

In [3]: for token in doc: print("{} : {}".format(token, token.vector[:3]))
Hello : [0. 0. 0.]
Stack : [0. 0. 0.]
Over : [0. 0. 0.]
Flow : [0. 0. 0.]
, : [-0.082752  0.67204  -0.14987 ]
my : [ 0.08649  0.14503 -0.4902 ]
name : [ 0.23231  -0.024102 -0.83964 ]
is : [-0.084961   0.502      0.0023823]
Steve : [0. 0. 0.]

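Why do I get different representations?

Is the first vector a representation of the whole sentence?

Please explain why doc.vector and the per-token vectors differ.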

1 Answer

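The answer is in the documentation for Doc.vector: "A real-valued meaning representation. Defaults to an average of the token vectors." So yes, doc.vector represents the whole sentence: it is the average over all of its tokens, while token.vector is the vector of one individual token. That is why the two differ.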

Source: https://spacy.io/api/doc#vector
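The averaging can be checked by hand against the numbers in the question. The sketch below uses plain Python rather than spaCy: it copies the first three components of each token vector from the printed output, averages them over all nine tokens (the all-zero ones included), and compares the result with the first three components of doc.vector. The variable names are only illustrative, and the comparison uses a small tolerance because the printed values are rounded.

```python
# First three components of each token vector, copied from the question's output.
token_vectors = [
    ("Hello", [0.0, 0.0, 0.0]),
    ("Stack", [0.0, 0.0, 0.0]),
    ("Over", [0.0, 0.0, 0.0]),
    ("Flow", [0.0, 0.0, 0.0]),
    (",", [-0.082752, 0.67204, -0.14987]),
    ("my", [0.08649, 0.14503, -0.4902]),
    ("name", [0.23231, -0.024102, -0.83964]),
    ("is", [-0.084961, 0.502, 0.0023823]),
    ("Steve", [0.0, 0.0, 0.0]),
]

# First three components of doc.vector, copied from the question's output.
doc_vector_head = [1.67874452e-02, 1.43885329e-01, -1.64147541e-01]

# Component-wise mean over all tokens, zero vectors included.
n_tokens = len(token_vectors)
mean_vector = [
    sum(vec[i] for _, vec in token_vectors) / n_tokens
    for i in range(3)
]

# Compare with a tolerance, since the printed token values are rounded.
matches = all(
    abs(mean - expected) < 1e-5
    for mean, expected in zip(mean_vector, doc_vector_head)
)

print(mean_vector)
print(matches)
```

The division is by all nine tokens, so the tokens printed as [0. 0. 0.] pull the sentence vector toward zero rather than being skipped.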

Hope this helps others as well.

Answered 2019-12-13T23:20:23.707