
Here is my code:

require "open-uri"

base_url = "http://en.wikipedia.org/wiki"

(1..5).each do |x|
  # sets up the url
  full_url = base_url + "/" + x.to_s
  # reads the url
  read_page = open(full_url).read
  # saves the contents to a file and closes it
  local_file = "my_copy_of-" + x.to_s + ".html"
  file = open(local_file,"w")
  file.write(read_page)
  file.close

  # open a file to store all entries in

  combined_numbers = open("numbers.html", "w")

  entrys = open(local_file, "r")

  combined_numbers.write(entrys.read)

  entrys.close
  combined_numbers.close

end

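As you can see, it fetches the contents of Wikipedia articles 1 through 5 and then tries to combine them into a single file called numbers.html.

It gets the first part right. But when it reaches the second part, numbers.html only ends up with the contents of the fifth article from the loop.

I can't see where I'm going wrong. Any help?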


1 Answer


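You chose the wrong mode when opening your summary file. "w" overwrites an existing file, while "a" appends to it.

So use this to get your code working: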

combined_numbers = open("numbers.html", "a")

Otherwise, on every pass of the loop, the contents of numbers.html are overwritten with the current article.
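
The difference between the two modes can be seen in a small sketch that does not touch the network. The `write_each` helper, the file name mode_demo.txt, and the strings written are made up for illustration; only the "w" and "a" mode strings come from the explanation above.

```ruby
require "tmpdir"

# Hypothetical helper: writes each item to path in its own open/close cycle
# using the given mode, then returns what the file contains afterwards.
def write_each(path, mode, items)
  items.each do |item|
    File.open(path, mode) { |f| f.write(item) }
  end
  File.read(path)
end

Dir.mktmpdir do |dir|
  path = File.join(dir, "mode_demo.txt") # made-up file name

  # "w" truncates the file on every open, so only the last item survives
  overwritten = write_each(path, "w", %w[one two three])

  File.delete(path)

  # "a" keeps what is already there and adds to the end
  appended = write_each(path, "a", %w[one two three])

  puts overwritten # three
  puts appended    # onetwothree
end
```

This mirrors what happens in the loop: each iteration reopens the file, so with "w" every earlier article is discarded before the next one is written.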


Besides, I think you should write the contents of read_page to numbers.html directly instead of reading them back in from the freshly written file:

require "open-uri"

(1..5).each do |x|
  # set up and read url
  # (on Ruby 2.5+ prefer URI.open(url); Kernel#open no longer handles URLs as of Ruby 3.0)
  url = "http://en.wikipedia.org/wiki/#{x}"
  article = open(url).read

  # saves current article to a file
  # (IO.write is not available on 1.8.x; use open there too)
  IO.write("my_copy_of-#{x}.html", article)

  # add current article to summary file
  open("numbers.html", "a") do |f|
    f.write(article)
  end
end
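
One thing to keep in mind with "a": running the script a second time appends to whatever numbers.html already holds from the previous run. A possible variation, sketched here as an assumption and not as part of the code above, is to open the summary file once in "w" mode before the loop and keep the handle open while iterating. The `combine` helper and the `fake_fetch` lambda are hypothetical names; `fake_fetch` stands in for the open-uri read so the sketch runs offline, and the file lives in a temporary directory.

```ruby
require "tmpdir"

# Hypothetical helper: writes the result of fetch.call(id) for every id
# into one summary file that is opened a single time in "w" mode.
def combine(ids, summary_path, fetch)
  File.open(summary_path, "w") do |summary|
    ids.each do |id|
      summary.write(fetch.call(id))
    end
  end
end

# Stand-in for reading the article over HTTP (assumption, no network access)
fake_fetch = ->(id) { "<p>article #{id}</p>\n" }

Dir.mktmpdir do |dir|
  path = File.join(dir, "numbers.html")

  combine(1..5, path, fake_fetch)
  combine(1..5, path, fake_fetch) # a second run starts from an empty file

  puts File.read(path).lines.count # 5
end
```

Because the file is truncated only once per run, all five articles end up in it, and rerunning does not duplicate them.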
answered 2012-02-12T21:50:43.200