I want to read a Wikipedia page through a parser, for example JWPL. I am able to do that, but my problem is this:

I want to count the chars between headings and sections, as well as the number of links.

Using JWPL, I can get the list of sections for each link in the list, but I am not able to count the chars.

Overall, my aim is to read a Wikipedia page, convert its data model to my own data model, and write out another file that contains my data model.

My data model is a file that would contain the section names and, for each one, a number: the count of chars between the section heading and the next link or section.
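To make the target data model concrete, here is a minimal sketch in Python rather than JWPL. It assumes the page is already available as raw wikitext, that chars are counted on the wikitext itself (markup included) up to the next heading, and that only internal `[[...]]` links are counted. The name `summarize_sections` and the record keys are made up for illustration.

```python
import re

# Matches a wikitext heading line such as "== History ==" (levels 1 to 6).
HEADING_RE = re.compile(r"^(={1,6})[ \t]*(.+?)[ \t]*\1[ \t]*$", re.MULTILINE)
# Matches an internal link such as [[Target]] or [[Target|label]].
LINK_RE = re.compile(r"\[\[([^\[\]|]+)(?:\|[^\[\]]*)?\]\]")


def summarize_sections(wikitext):
    """Return one record per section: name, char count, link count.

    Assumption: text before the first heading (the lead) is ignored.
    """
    headings = list(HEADING_RE.finditer(wikitext))
    records = []
    for i, match in enumerate(headings):
        start = match.end()
        # A section body runs up to the next heading, or to the end of the page.
        end = headings[i + 1].start() if i + 1 < len(headings) else len(wikitext)
        body = wikitext[start:end].strip()
        records.append({
            "section": match.group(2),
            "chars": len(body),
            "links": len(LINK_RE.findall(body)),
        })
    return records
```

Each record could then be written as one line of the output file, for example as tab-separated values.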

Thanks for your help.

1 Answer

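A better approach is to use the services Wikipedia already provides, which you can interact with through a set of GET requests. Read Wikipedia's metadata page: http://en.wikipedia.org/wiki/Wikipedia:Metadata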

MediaWiki also has some explanation of this kind of interaction: http://www.mediawiki.org/wiki/API:Main_page
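As a rough illustration of such a GET request, the sketch below builds a URL for the MediaWiki `action=parse` module and fetches the raw wikitext and section list as JSON. The endpoint `https://en.wikipedia.org/w/api.php`, the chosen `prop` values, the User-Agent string, and the helper names are assumptions on my part, so check the API documentation before relying on them.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumed endpoint for the English Wikipedia API.
API_URL = "https://en.wikipedia.org/w/api.php"


def build_parse_url(title):
    """Build a GET URL asking for a page's wikitext and its section list."""
    params = {
        "action": "parse",
        "page": title,
        "prop": "wikitext|sections",
        "format": "json",
        "formatversion": "2",
    }
    return API_URL + "?" + urlencode(params)


def fetch_page(title):
    """Fetch the page and return the decoded "parse" object (needs network access)."""
    request = Request(
        build_parse_url(title),
        # Placeholder User-Agent; replace it with one that identifies your tool.
        headers={"User-Agent": "section-counter-example/0.1"},
    )
    with urlopen(request) as response:
        return json.load(response)["parse"]
```

Calling `fetch_page("Alan Turing")` should return a dictionary whose `wikitext` and `sections` entries can then be converted into your own data model.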

Good luck.

Answered 2012-07-12T09:23:36.860