I am trying to use the LIWC 2015 dictionary in R.

Dummy text for the text analysis:

Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Aenean commodo ligula eget dolor. Aenean massa. Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. Donec quam felis, ultricies nec, pellentesque eu, pretium quis, sem. Nulla consequat massa quis enim. Donec pede justo, fringilla vel, aliquet nec, vulputate eget, arcu. In enim justo, rhoncus ut, imperdiet a, venenatis vitae, justo. Nullam dictum felis eu pede mollis pretium. Integer tincidunt. Cras dapibus. Vivamus elementum semper nisi. Aenean vulputate eleifend tellus. Aenean leo ligula, porttitor eu, consequat vitae, eleifend ac, enim. Aliquam lorem ante, dapibus in, viverra quis, feugiat a, tellus. Phasellus viverra nulla ut metus varius laoreet. Quisque rutrum. Aenean imperdiet. Etiam ultricies nisi vel augue. Curabitur ullamcorper ultricies nisi. Nam eget dui. Etiam rhoncus. Maecenas tempus, tellus eget condimentum rhoncus, sem quam semper libero, sit amet adipiscing sem neque sed ipsum. Nam quam nunc, blandit vel, luctus pulvinar, hendrerit id, lorem. Maecenas nec odio et ante tincidunt tempus. Donec vitae sapien ut libero venenatis faucibus. Nullam quis ante. Etiam sit amet orci eget eros faucibus tincidunt. Duis leo. Sed fringilla mauris sit amet nibh. Donec sodales sagittis magna. Sed consequat, leo eget bibendum sodales, augue velit cursus nunc

I tried this code:

library("LIWCalike")
library("quanteda")

# Run liwcalike() on the test phrases bundled with the package
liwcalike(data_char_testphrases)

# Load the LIWC 2015 dictionary and apply it to the inaugural address corpus
liwc2015dict <- dictionary(file = "~/Dropbox/QUANTESS/dictionaries/LIWC/LIWC2015_English_Flat.dic",
                           format = "LIWC")
inaugLIWCanalysis <- liwcalike(data_corpus_inaugural, liwc2015dict)
inaugLIWCanalysis[1:6, 1:10]

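I expected results like the following, which can be reproduced as a simple example on the official website. LIWC certainly has more variables; these are just a few examples: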

LIWC Dimension               Your Data  Personal Texts  Formal Texts
Self-references (I, me, my)       5.18            11.4           4.2
Social words                      2.59             9.5           8.0
Positive emotions                 2.35             2.7           2.6
Negative emotions                 1.18             2.6           1.6
Overall cognitive words           6.59             7.8           5.4
Articles (a, an, the)             8.71             5.0           7.2
Big words (> 6 letters)          20.24            13.1          19.6

But I got this result instead:

output[, c(1:7, ncol(output)-2)]
#>    docname Segment WC WPS Sixltr   Dic LINGUISTIC PROCESSES.FUNCTION WORDS
#> 1    text1       1  8   3  37.50 37.50                               25.00
#> 2    text2       2  6   5  16.67 50.00                               50.00
#> 3    text3       3  4   2   0.00 25.00                                0.00
#> 4    text4       4 18  12  11.11 61.11                               22.22
#> 5    text5       5  4   1   0.00 25.00                                0.00
#> 6    text6       6  7   3  14.29 28.57                               14.29
#> 7    text7       7  7   3   0.00 42.86                               28.57
#> 8    text8       8  5   4   0.00 80.00                               60.00
#> 9    text9       9  9   2  11.11 11.11                               11.11
#> 10  text10      10  9   2  22.22 22.22                               22.22
#>    Apostro
#> 1        0
#> 2        0
#> 3        0
#> 4        0
#> 5        0
#> 6        0
#> 7        0
#> 8        0
#> 9        0
#> 10       0

How can I get results like those shown in the LIWC trial site example?

1 Answer

See this page for how to get results almost identical to LIWC's: https://koheiw.net/?p=573
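
One way to approximate the percentage-style figures from the LIWC site with quanteda is to count dictionary matches per category and divide by each document's word count. The sketch below is not taken from the linked page. It assumes a recent quanteda (version 3 or later), and `toy_dict` is a hypothetical two-category stand-in for the licensed LIWC 2015 dictionary; the sample sentence is also made up for illustration.

```r
library("quanteda")

# Hypothetical toy dictionary standing in for the licensed LIWC 2015 file.
# With the real file, load it with dictionary(file = ..., format = "LIWC")
# as in the question.
toy_dict <- dictionary(list(
  self     = c("i", "me", "my"),
  articles = c("a", "an", "the")
))

# Made-up sample document
txt <- c(doc1 = "I lost my keys and the dog found me a new set.")

# Tokenise without punctuation so the denominator is a word count
toks <- tokens(txt, remove_punct = TRUE)

# Count how many tokens fall into each dictionary category
# (dfm() lowercases tokens by default, so "I" matches "i")
counts <- dfm_lookup(dfm(toks), dictionary = toy_dict)

# Express each category as a percentage of all words in the document
pct <- 100 * as.matrix(counts) / ntoken(toks)
round(pct, 2)
```

Each row of `pct` is a document and each column a dictionary category, which matches the layout of the "Your Data" column in the table from the LIWC site. Small differences from LIWC itself can still arise from tokenisation choices, so the numbers should be read as an approximation.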

Answered 2017-12-04T11:09:54.603