ruby - Problem with RDF::Reader > URI::InvalidError

Question

I have a problem with this code:

require 'rubygems'
require 'rdf'
require 'rdf/raptor'

RDF::Reader.open("http://reegle.info/countries/IN.rdf") do |reader|
  reader.each_statement do |statement|
    puts statement.inspect
  end
end

When I try to open the URL above, I get redirected to a URL that URI.parse obviously doesn't like:

http://sparql.reegle.info?query=CONSTRUCT+{+%3Chttp://reegle.info/countries/IN%3E+?p+?o.+%3Chttp://reegle.info/countries/IN.rdf%3E+foaf:primaryTopic+%3Chttp://reegle.info/countries/IN%3E;+cc:license+%3Chttp://www.nationalarchives.gov.uk/doc/open-government-licence%3E;+cc:attributionName+"REEEP";+cc:attributionURL+%3Chttp://reegle.info/countries/IN%3E.+}+WHERE+{+%3Chttp://reegle.info/countries/IN%3E+?p+?o.}&format=application/rdf%2Bxml

So I get the following error:

URI::InvalidURIError: bad URI(is not URI?)

Any ideas on how to get around this?

Thanks

PS: Doing something like URI.parse(URI.encode([url])) has no effect here.

score 1 · Accepted Answer

URI doesn't like the double quotes or the braces in that URL. You can fix the URI by hand with something like this:

# This auto-populating cache isn't necessary but...
replacements = Hash.new { |h,k| h[k] = URI.encode(k) }
broken_uri.gsub!(/[{}"]/) { replacements[$&] }

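From RFC 1738: Uniform Resource Locators (URL):

Thus, only alphanumerics, the special characters "$-_.+!*'(),", and reserved characters used for their reserved purposes may be used unencoded within a URL.

So I'd say that reegle.info should be encoding more things than they are. OTOH, Ruby's URI class could be a bit more forgiving (Perl's URI class, for example, will accept that URI as input, but it converts the double quotes and braces to their percent-encoded forms on output).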
