ruby - Why am I getting "syntax error, unexpected keyword_end" and "unexpected end-of-input"?

Question

When I run my code, I get really weird error messages:

/Users/Pan/Data/external/filter_url_1008.rb:35: syntax error, unexpected keyword_end
/Users/Pan/Data/external/filter_url_1008.rb:45: syntax error, unexpected end-of-input, expecting keyword_end
filter_file.close
                 ^

I have checked my Ruby code several times but can't figure out what's wrong.

#This script filters out any html files that don't abide by the rule.
require "fileutils"

#path where html files will be read from
source_dir = "/20131008" 

#path where filtered html files will be copied to
dest_dir ="/20131008_filtered"

#file index to be filtered
filter_file = File.open("filtered_index.txt","r")

if !File.exist?(dest_dir) 
    FileUtils.mkdir_p("/dest_dir")
    print(dest_dir + " was created!\n") 
end

#filter rule
blacklist = ["facebook.com", "youtube.com", "twitter.com",
"linkedin.com", "bebo.com", "twitlonger.com", "bing.com", "ebay.com",
"ebayrt.com", "maps.google", "waze.com", "foursquare.com", "adf.ly", 
"twitpic.com","itunes.apple.com","craigslist.org","instagram.com", 
"google.com", "google.co.uk", "google.ie","bullhornreach", 
"pinterest.com", "feedsportal","tumblr.com"]

filter = filter_file.read

#Read from 20131008_filtered.txt and exclude urls that are in the blacklist
filter.each_line do |line|
    $match_count = 0

    blacklist.each do |blacklist_atom|
        if !(line.downcase.include? "blacklist_atom")
            match_count += 1
        end
    end

    if (blacklist.length == match_count)
        filename_cp = line[line.index("20131008/") + 9..line.index(".html") - 1]
        filename = filename_cp.to_s + ".html"
        FileUtils.cp(source_dir + "/" + filename, dest_dir)
    end
end

filter_file.close

score 3 · Accepted Answer

You can't use the ++ operator in Ruby. Use match_count += 1 instead.

Edit

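These aren't "really weird error messages"; they just report a syntax error. The program hasn't even started to be interpreted: this is a check that happens before the run.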
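To see that this really is a parse-time failure, here is a minimal sketch that only compiles a snippet without running it. The helper name `syntax_error_for` and the `BROKEN`/`FIXED` snippets are made up for illustration, and `RubyVM::InstructionSequence` is specific to CRuby (MRI). Because `++` is read as two `+` signs that still need an operand, the parser keeps going and stumbles over the `end` on the next line.

```ruby
# Returns the parser's complaint for +source+, or nil if it parses cleanly.
# Nothing in +source+ is executed; it is only compiled.
def syntax_error_for(source)
  RubyVM::InstructionSequence.compile(source)
  nil
rescue SyntaxError => e
  e.message
end

# Hypothetical snippet using the ++ operator that Ruby does not have.
BROKEN = <<~RUBY
  count = 0
  [1, 2].each do |n|
    count++
  end
RUBY

# The same snippet with the increment written the Ruby way.
FIXED = BROKEN.sub("count++", "count += 1")

puts syntax_error_for(BROKEN).inspect # a syntax error message; wording varies by Ruby version
puts syntax_error_for(FIXED).inspect  # nil
```

Running `ruby -c your_script.rb` performs the same kind of syntax-only check from the command line.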

score 0 · Accepted Answer

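You're doing several things wrong. This isn't an attempt to rewrite your code so that it's bug-free; it's meant to show how to write it in a more maintainable style that is closer to the Ruby way: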

require 'fileutils'

SOURCE_DIR = '/20131008' 
DEST_DIR ='/20131008_filtered'

BLACKLIST = %w[
  adf.ly
  bebo.com
  bing.com
  bullhornreach
  craigslist.org
  ebay.com
  ebayrt.com
  facebook.com
  feedsportal
  foursquare.com
  google.co.uk
  google.com
  google.ie
  instagram.com
  itunes.apple.com
  linkedin.com
  maps.google
  pinterest.com
  tumblr.com
  twitlonger.com
  twitpic.com
  twitter.com
  waze.com
  youtube.com
]

unless File.exist?(DEST_DIR) 
  FileUtils.mkdir_p(DEST_DIR)
  print(DEST_DIR + " was created!\n") 
end

File.foreach("filtered_index.txt") do |line|
  # $match_count = 0
  match_count = 0

  BLACKLIST.each do |blacklist_atom|
    match_count += 1 unless (line.downcase[blacklist_atom])
  end

  if (BLACKLIST.length == match_count)
    FileUtils.cp(
      File.join(
        SOURCE_DIR,
        File.basename(
          line,
          File.extname(line)
        ) + '.html'
      ),
      DEST_DIR
    )
  end
end
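The `File.foreach` call in the rewrite above can be tried in isolation. This is a small sketch, assuming a throwaway index file created with `Tempfile`; the helper name `read_index_lines` and the sample paths are invented for the example.

```ruby
require "tempfile"

# Collects the lines of the file at +path+ without the trailing newline.
# File.foreach opens the file, yields one line at a time, and closes it
# when the block finishes, so the whole file is never held in memory here.
def read_index_lines(path)
  lines = []
  File.foreach(path) { |line| lines << line.chomp }
  lines
end

# Tempfile.create removes the temporary file once the block returns.
Tempfile.create("filtered_index") do |index|
  index.write("crawl/20131008/one.html\ncrawl/20131008/two.html\n")
  index.flush
  p read_index_lines(index.path)
  # ["crawl/20131008/one.html", "crawl/20131008/two.html"]
end
```

The array is only built here so there is something to print; in the real script the per-line work would go inside the block.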

What's wrong:

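Use constants, and move them to the top of the file where they're easy to see and edit.
Sort lists such as the site names, so it's easier to edit or extend the list and to see whether an entry already exists.
Don't open a file, do a bunch of stuff, then read it all into memory, do more stuff, split it, iterate over the lines, do another bunch of stuff, and then close it. Instead, use the smarter file methods, such as File.foreach with a block, which opens the file and closes it automatically. Reading a file completely into memory is a very bad habit because it doesn't scale at all. Imagine what happens if the file is bigger than the memory available to the program.
You're not interpolating the variable into the string, and you're adding an extra leading path separator: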
```
FileUtils.mkdir_p("/dest_dir")
```
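Don't use "$global" variables in a block. That shows a lack of understanding of variable scoping.
The global $match_count is only initialized and never read; the loop increments a separate local variable, match_count, instead.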
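Two of the points above, the quoted variable name and the global/local mix-up, can be sketched as follows. The method names are hypothetical and exist only for this illustration.

```ruby
# A quoted name is plain text; Ruby never looks up the variable inside it.
def path_variants(dest_dir)
  {
    literal: "/dest_dir",                      # the characters "/dest_dir"
    value: dest_dir,                           # the variable's contents
    interpolated: "#{dest_dir} was created!"   # contents embedded in text
  }
end

# Mirrors the question: a global is set, but a local with a similar
# name is incremented. The local starts out as nil, so += fails.
def count_with_mismatched_names
  $match_count = 0
  match_count += 1
rescue NoMethodError => e
  e.class
end

# A plain local declared before the block is visible inside it.
def count_with_local(items)
  match_count = 0
  items.each { match_count += 1 }
  match_count
end

p path_variants("/20131008_filtered")[:literal]  # "/dest_dir"
p count_with_mismatched_names                    # NoMethodError
p count_with_local(%w[a b c])                    # 3
```

So even once the syntax errors are gone, the question's loop would still fail at run time until both names refer to the same variable.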

You can write a "not a substring" check more concisely. Instead of:

if !(line.downcase.include? "blacklist_atom")

use something like:

unless line.downcase[blacklist_atom]
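As a rough illustration of why this works: `String#[]` with a string argument returns the matching substring or `nil`, so it can stand in for `include?` in a condition. Note also that the quoted `"blacklist_atom"` in the original line searches for that literal text rather than the block variable. The sketch below uses a shortened, hypothetical blacklist and a helper called `allowed?`, and swaps the manual counter for `Array#none?`, which is an alternative to the loop in the answer, not what the answer itself does.

```ruby
# Hypothetical, shortened blacklist for illustration.
SAMPLE_BLACKLIST = %w[facebook.com youtube.com twitter.com].freeze

# True when the line mentions none of the blacklisted fragments.
def allowed?(line, blacklist = SAMPLE_BLACKLIST)
  lowered = line.downcase
  blacklist.none? { |atom| lowered.include?(atom) }
end

p "www.facebook.com/page"["facebook.com"]            # "facebook.com" (truthy)
p "www.example.org/page"["facebook.com"]             # nil (falsy)
p allowed?("crawl/20131008/Example.org.html\n")      # true
p allowed?("crawl/20131008/WWW.YouTube.com.html\n")  # false
```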

Take advantage of built-in functionality:
```
filename_cp = line[line.index("20131008/") + 9..line.index(".html") - 1]
```
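Instead of slicing the string by hand, use File.join(...), because it knows which path separator your OS needs. Use File.basename(...), because it uses the same path separator to extract the file name.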
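A small sketch of those calls on a made-up index line. The helper name `source_path_for` and the sample path are assumptions; the `".*"` suffix asks `File.basename` to strip whatever extension is present, and `chomp` removes the newline that each line read from a file carries.

```ruby
# Builds the path of the .html file to copy for one index line.
def source_path_for(line, source_dir)
  name = File.basename(line.chomp, ".*") # file name without directory or extension
  File.join(source_dir, "#{name}.html")
end

p File.basename("crawl/20131008/page42.html", ".html")          # "page42"
p File.extname("page42.html")                                   # ".html"
p source_path_for("crawl/20131008/page42.html\n", "/20131008")  # "/20131008/page42.html"
```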

score 0 · Accepted Answer

Get rid of the then on the if statement line? It may be valid, but it's certainly not common usage.

Ruby doesn't have a ++ operator either.
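On the `then` point, a quick sketch suggests it is indeed valid: `then` is required only when the condition and the body share a line, and is optional (and usually omitted) otherwise. The method names below are made up for the example.

```ruby
# One-line form: `then` separates the condition from the body.
def size_label(n)
  if n > 2 then "big" else "small" end
end

# Multi-line form: the newline does the separating, so `then` is left out.
def size_label_multiline(n)
  if n > 2
    "big"
  else
    "small"
  end
end

p size_label(3)            # "big"
p size_label_multiline(1)  # "small"
```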
