I am using the OpenCalais Semantic Web service and receiving an "application/json" response for the content I submit. When I look at the Quotation entity, OpenCalais returns the quote, but the person is identified by a "Linked Data" URI rather than by name. For example, for a person named Tayyip Erdogan:

http://d.opencalais.com/pershash-1/a7077bd6-bcc9-3419-b75e-c44e1b2eb693

I need the name of the person, not the URI. OpenCalais also sends a URI instead of the person's name in the PersonCareer entity. I don't want to fetch the URI's HTML page and extract the person's name from the DOM, as that would slow everything down. Is there a solution?

Description of the Quotation entity: http://www.opencalais.com/documentation/calais-web-service-api/api-metadata/entity-index-and-definitions#Quotation

1 Answer

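It turns out there is a way to access these person URIs other than through HTML: parsing RDF. Any URI that OpenCalais provides for a linked-data resource is also available as RDF. Just change the URI's extension from .html to .rdf (or append .rdf to the bare URI) and you get all the information about that resource in RDF format.

For example, for a person named Tayyip Erdogan: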

http://d.opencalais.com/pershash-1/a7077bd6-bcc9-3419-b75e-c44e1b2eb693.rdf
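The URL rewrite described above could be automated with a small helper. This is only a sketch: `to_rdf_url` is a hypothetical name, and it assumes the URI either has no extension or ends in .html or .rdf.

```php
<?php
// Hypothetical helper: turns an OpenCalais linked-data URI into its .rdf form.
function to_rdf_url(string $uri): string
{
    // Drop a trailing slash, then a trailing .html or .rdf extension if present.
    $base = preg_replace('/\.(html|rdf)$/i', '', rtrim($uri, '/'));

    return $base . '.rdf';
}

echo to_rdf_url('http://d.opencalais.com/pershash-1/a7077bd6-bcc9-3419-b75e-c44e1b2eb693'), "\n";
// prints http://d.opencalais.com/pershash-1/a7077bd6-bcc9-3419-b75e-c44e1b2eb693.rdf
```

Stripping an existing .rdf first keeps the helper safe to call on a URI that has already been converted.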

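The following code uses the file_get_dom library; you can also use any native function to get the file contents. This is just one way I extract the person's name from the RDF content retrieved from the web service. I am sure you can come up with a better solution.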

public function get_persons_from_pershash($url)
{
    // Fetch the RDF for the person URI (the URL should end in .rdf).
    // Errors are suppressed with @; a failed fetch becomes an empty string.
    $person_rdf = (string) @file_get_dom($url);

    if ($person_rdf === '') {
        return '';
    }

    // Locate the <c:name> element; strpos() returns false when a tag is missing.
    $open_tag = '<c:name>';
    $start = strpos($person_rdf, $open_tag);
    if ($start === false) {
        return '';
    }
    $start += strlen($open_tag);

    $end = strpos($person_rdf, '</c:name>', $start);
    if ($end === false) {
        return '';
    }

    // Extract the text between the opening and closing tags.
    return trim(substr($person_rdf, $start, $end - $start));
}
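As an alternative to searching the raw string, the name could be read with PHP's built-in DOM extension. This is a minimal sketch, not the method used above: `extract_person_name` is a hypothetical helper, the sample document is invented for illustration, and its `c` namespace URI is a placeholder rather than the real OpenCalais one. The XPath matches on the element's local name so that no namespace URI has to be assumed.

```php
<?php
// Hypothetical helper: returns the text of the first element whose local name
// is "name" in an RDF/XML string, or '' if the input is empty or malformed.
function extract_person_name(string $rdf_xml): string
{
    $doc = new DOMDocument();
    // Check for '' first (loadXML rejects empty input); @ hides parse warnings.
    if ($rdf_xml === '' || !@$doc->loadXML($rdf_xml)) {
        return '';
    }

    $xpath = new DOMXPath($doc);
    // local-name() ignores the prefix, so <c:name> matches without its namespace URI.
    $nodes = $xpath->query('//*[local-name()="name"]');
    if ($nodes === false || $nodes->length === 0) {
        return '';
    }

    return trim($nodes->item(0)->textContent);
}

// Invented sample; the c: namespace URI below is a placeholder.
$sample = '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"'
    . ' xmlns:c="http://example.org/placeholder#">'
    . '<rdf:Description><c:name> Tayyip Erdogan </c:name></rdf:Description>'
    . '</rdf:RDF>';

echo extract_person_name($sample), "\n"; // prints Tayyip Erdogan
```

Matching on the local name alone would also pick up a `name` element from another namespace, so a real document may call for a stricter query.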

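When you change the URL to .rdf, you will be prompted to save an RDF file.

I wanted to parse it programmatically, so that is what I did!

Hope someone finds this helpful!

Cheers!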

answered 2014-07-27T01:15:31.407