.net - Different results from DBPedia when using WebClient and a browser

Question

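I want to extract some information that exists in DBPedia. So I wrote an application using .NET's System.Net.WebClient that takes a url and returns the content of that url in N-Triples format (plain text).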

The result of extracting the data for the url (with the application) is:

<http://dbpedia.org/resource/AfghanistanCommunications> <http://dbpedia.org/ontology/wikiPageRedirects> <http://dbpedia.org/resource/Communications_in_Afghanistan> .
<http://dbpedia.org/resource/AfghanistanCommunications> <http://www.w3.org/ns/prov#wasDerivedFrom> <http://en.wikipedia.org/wiki/AfghanistanCommunications?oldid=74466499> .
<http://dbpedia.org/resource/AfghanistanCommunications> <http://xmlns.com/foaf/0.1/isPrimaryTopicOf> <http://en.wikipedia.org/wiki/AfghanistanCommunications> .
<http://dbpedia.org/resource/AfghanistanCommunications> <http://www.w3.org/2000/01/rdf-schema#label> "AfghanistanCommunications"@en .

However, when I view the url in a browser, I get content that is completely different from what I extracted.

I inspected the requests with Fiddler and then added this:

webClient.Headers.Add(HttpRequestHeader.UserAgent, "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.0.3705;)");

Does DBPedia detect the application as a bot and return less data than it does to a real browser, or am I missing something else?

score 1 · Accepted Answer

Your application is presumably requesting:

http://dbpedia.org/data/AfghanistanCommunications.ntriples

But your web browser is showing you:

http://dbpedia.org/data/Communications_in_Afghanistan.ntriples

If your web browser visits http://dbpedia.org/resource/AfghanistanCommunications or http://dbpedia.org/page/AfghanistanCommunications, you are redirected to http://dbpedia.org/page/Communications_in_Afghanistan unless a specific format is requested. The redirect exists because Wikipedia has a redirect from http://en.wikipedia.org/wiki/AfghanistanCommunications to http://en.wikipedia.org/wiki/Communications_in_Afghanistan. You can see this in the triples your application receives:

<http://dbpedia.org/ontology/wikiPageRedirects>
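
If the application should end up with the same data the browser shows, one option is to look for that predicate in the downloaded triples and request the redirect target instead. What follows is only a minimal sketch under some assumptions: the class and method names (DbpediaRedirects, FindRedirectTarget, ToDataUrl, DownloadFollowingRedirect) are made up for illustration, the data URL pattern is inferred from the two .ntriples URLs above, and the regular expression only recognizes triples whose subject, predicate, and object are all IRIs. It has not been run against the live service.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any DBpedia or .NET API.
public static class DbpediaRedirects
{
    // Predicate that marks a resource as a redirect (taken from the triples above).
    private const string RedirectPredicate = "<http://dbpedia.org/ontology/wikiPageRedirects>";

    private const string ResourcePrefix = "http://dbpedia.org/resource/";

    // Matches one "<subject> <predicate> <object> ." statement whose three terms are all IRIs.
    // Triples with literal objects (such as the rdfs:label one) are skipped.
    private static readonly Regex IriTriple =
        new Regex(@"<([^>]*)>\s+(<[^>]*>)\s+<([^>]*)>\s*\.");

    // Returns the IRI the resource redirects to, or null if the triples contain no redirect.
    public static string FindRedirectTarget(string nTriples)
    {
        foreach (Match m in IriTriple.Matches(nTriples))
        {
            if (m.Groups[2].Value == RedirectPredicate)
                return m.Groups[3].Value;
        }
        return null;
    }

    // Builds the N-Triples data URL for a resource IRI.
    // Assumption: the pattern http://dbpedia.org/data/<name>.ntriples seen above holds in general.
    public static string ToDataUrl(string resourceIri)
    {
        if (!resourceIri.StartsWith(ResourcePrefix, StringComparison.Ordinal))
            throw new ArgumentException("Not a DBpedia resource IRI: " + resourceIri);
        return "http://dbpedia.org/data/" + resourceIri.Substring(ResourcePrefix.Length) + ".ntriples";
    }

    // Downloads the triples for a resource; if they describe a redirect,
    // downloads the triples of the redirect target instead (one hop only).
    public static string DownloadFollowingRedirect(string resourceIri)
    {
        using (var client = new WebClient())
        {
            string triples = client.DownloadString(ToDataUrl(resourceIri));
            string target = FindRedirectTarget(triples);
            return target == null ? triples : client.DownloadString(ToDataUrl(target));
        }
    }
}
```

With this, DbpediaRedirects.DownloadFollowingRedirect("http://dbpedia.org/resource/AfghanistanCommunications") would first fetch the four triples shown in the question, find the wikiPageRedirects triple, and then request http://dbpedia.org/data/Communications_in_Afghanistan.ntriples. A real N-Triples parser would be more robust than the regular expression used here.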
