I created a term-document matrix with the R tm package and exported it to CSV by first converting it to a data frame.

A sample of the term-document matrix:

        1   10  12  14  15  16  17
century 0   4   0   0   1   5   3
pete    0   2   0   6   1   0   0
additive    2   0   0   0   0   0   0
administration  1   5   3   0   3   0   0
administration  1   0   0   0   0   0   5
administrator   0   0   0   0   0   0   0
aeronautical    3   0   0   45  5   0   0
agency  0   0   5   0   0   0   0
amateur 0   0   6   0   0   0   0
anchor  5   0   1   0   0   6   0
basic   0   0   0   0   0   0   0
charles 0   0   6   0   0   0   0
commercial  0   6   0   0   0   4   0
commercial  0   0   0   0   0   2   0
commission  0   0   3   7   2   0   0
committee   0   4   0   0   1   5   3
compelling  0   2   7   6   1   0   0
construction    2   0   0   0   0   0   0
controlled  1   5   6   0   3   0   0
cooperating 1   0   0   0   0   0   5
cost    0   0   0   0   0   0   0
crewmember  3   0   0   45  0   0   0
depressed   0   0   0   0   0   0   0
developer   0   0   8   0   0   0   0
development 5   0   0   0   0   0   0
development 0   0   0   0   0   0   0
direct  0   0   0   0   0   0   0

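How can I convert it into a table like the one below, with one row per title (document) and term, keeping only the terms that actually occur in that title, so that I can analyze it further in tabular form?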

Title   term    freq
1   additive    2
1   administration  1
1   administration  1
1   aeronautical    3
1   anchor  5
1   construction    2
1   controlled  1
1   cooperating 1
1   crewmember  3
1   development 5
10  century 4
10  pete    2
10  administration  5
10  commercial  6
10  committee   4
10  compelling  2
10  controlled  5
12  administration  3
12  agency  5
12  amateur 6
12  anchor  1
12  charles 6
12  commission  3
12  compelling  7
12  controlled  6
12  developer   8
.   ... ..
.   ... ..
.   ... ..
.   ... ..
.   ... ..
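
One possible approach, as a minimal sketch in base R: treat the exported data as a numeric matrix with terms as row names and titles as column names, find the non-zero cells with `which(..., arr.ind = TRUE)`, and build the long table from those positions. The matrix `m` below is a small hand-typed subset of the sample above, and the names `m`, `idx` and `long` are only illustrative. With a real tm object, I would assume `m <- as.matrix(tdm)` gives the same layout.

```r
# Small subset of the sample matrix: terms in rows, titles in columns.
# With a real tm object this would presumably be: m <- as.matrix(tdm)
m <- matrix(
  c(0, 4, 0,
    0, 2, 0,
    2, 0, 0,
    0, 0, 5),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("century", "pete", "additive", "agency"),
                  c("1", "10", "12"))
)

# Row/column positions of every non-zero cell, in column-major order,
# so the result is already grouped by title.
idx <- which(m > 0, arr.ind = TRUE)

# One row per (title, term) pair with its frequency.
long <- data.frame(
  Title = colnames(m)[idx[, "col"]],
  term  = rownames(m)[idx[, "row"]],
  freq  = m[idx],
  row.names = NULL,
  stringsAsFactors = FALSE
)

print(long)
```

The result can then be written out with `write.csv(long, "terms_long.csv", row.names = FALSE)` (the file name is just an example). Because the cells are looked up by position rather than by name, repeated term names such as `administration` in the sample should each keep their own row.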