I am currently using `pywikibot` to get the categories of a given Wikipedia page (e.g. `support-vector machine`), as shown below.
import pywikibot as pw
print([i.title() for i in list(pw.Page(pw.Site('en'), 'support-vector machine').categories())])
The result I get is:
[
'Category:All articles with specifically marked weasel-worded phrases',
'Category:All articles with unsourced statements',
'Category:Articles with specifically marked weasel-worded phrases from May 2018',
'Category:Articles with unsourced statements from June 2013',
'Category:Articles with unsourced statements from March 2017',
'Category:Articles with unsourced statements from March 2018',
'Category:CS1 maint: Uses editors parameter',
'Category:Classification algorithms',
'Category:Statistical classification',
'Category:Support vector machines',
'Category:Wikipedia articles needing clarification from November 2017',
'Category:Wikipedia articles with BNF identifiers',
'Category:Wikipedia articles with GND identifiers',
'Category:Wikipedia articles with LCCN identifiers'
]
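As you can see, the result includes many of Wikipedia's tracking and maintenance categories, for example:
- Category:All articles with specifically marked weasel-worded phrases
- Category:All articles with unsourced statements
- Category:CS1 maint: Uses editors parameter
- etc.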
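However, the only categories I am interested in are:
- Category:Classification algorithms
- Category:Statistical classification
- Category:Support vector machines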
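I would like to know whether there is a way to get all of Wikipedia's tracking or maintenance categories, so that I can remove them from the result and keep only the informative categories.
Alternatively, if there is any other way to eliminate them from the result, please suggest it.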
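One direction that might work, given as a minimal sketch rather than a confirmed solution: English Wikipedia marks most of its tracking and maintenance categories as hidden categories, so filtering on that flag may leave only the informative ones. The sketch assumes pywikibot's `Category.isHiddenCategory()` reports that flag; the helper names `visible_titles` and `informative_categories` are hypothetical and only used for illustration.

```python
def visible_titles(categories):
    """Return the titles of all categories that are not flagged as hidden.

    Assumption: each item behaves like a pywikibot Category, i.e. it offers
    title() and isHiddenCategory().
    """
    return [cat.title() for cat in categories if not cat.isHiddenCategory()]


def informative_categories(page_title, lang="en"):
    """Fetch a page's categories and drop the hidden (maintenance) ones.

    Hypothetical helper: needs pywikibot installed and network access.
    """
    import pywikibot as pw  # imported lazily so visible_titles works offline

    page = pw.Page(pw.Site(lang), page_title)
    return visible_titles(page.categories())


if __name__ == "__main__":
    print(informative_categories("support-vector machine"))
```

Any category that is not marked as hidden on the wiki would still come through, so the output may need a second look.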
I am happy to provide more details if needed.