2

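I have a txt file in which each line contains a series of strings. I need to find a given string, move that string to another file, and delete the line from the original file.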

Moving to the other file already works; here is the code.

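# Pick one line at random in a single pass over the file.
def pick_random_line
  chosen_line = nil
  File.foreach("file.txt").each_with_index do |line, number|
    # Replace the current choice with probability 1/(number + 1).
    chosen_line = line if rand < 1.0 / (number + 1)
  end
  chosen_line
end

# Call the method only after it has been defined.
# Note: to_i.to_s keeps just the leading integer of the chosen line.
File.open('file_moved.txt', 'w') { |file| file.puts pick_random_line.to_i.to_s }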

I'm a bit lost on how to delete that line from the original file. What is the way in Ruby to delete the complete line that matches a string?

4

2 Answers

4

What about something like this?

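lines = File.readlines('file.txt')

# Remove one line at a random index; delete_at mutates lines and returns the removed line.
random_line = lines.delete_at(rand(lines.length))

# Write the remaining lines back over the original file.
File.open('file.txt', 'w') do |f|
  f.write(lines.join)
end

# Append the removed line to the other file.
File.open('random.txt', 'a') do |f|
  f.write(random_line)
end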

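Note that readlines reads the whole file into memory, which makes it easy to pick a line uniformly at random and remove it from the list. Your implementation is a form of reservoir sampling: replacing the current choice with probability 1/(number+1) also selects each line with equal probability without knowing the line count in advance, so it is not biased toward the end of the file. It only picks the line, though; it does not remove it.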

As with anything that does manipulation in this way, there is a small chance that the file might be truncated if this program is halted unexpectedly. The usual method to avoid this is to write to a temporary file, then rename when that's successful. A better alternative is to use a database, even an embedded one like SQLite.
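The write-then-rename idea can be sketched roughly as follows. The helper name `write_atomically` and the `.tmp` suffix are assumptions made for illustration, not part of the original answer, and the rename is only guaranteed to replace the target in a single step on POSIX-style file systems.

```ruby
# Hypothetical helper: replace the contents of `path` without leaving a
# half-written file under that name if the program is interrupted.
def write_atomically(path, content)
  temp_path = "#{path}.tmp" # assumed scratch name in the same directory

  File.open(temp_path, 'w') do |f|
    f.write(content)
    f.fsync # ask the OS to flush the data to disk before renaming
  end

  # Swap the finished temporary file into place over the original.
  File.rename(temp_path, path)
end
```

With this in place, the `File.open('file.txt', 'w')` step above could become `write_atomically('file.txt', lines.join)`. Keeping the temporary file in the same directory matters, because a rename across file systems is not a simple swap.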

answered 2013-06-08T19:39:24.827
3

Removing any bytes or substring from a file essentially means you must rewrite the file from that point onwards, at a minimum. Some specialized file systems may be an exception, but most general-purpose file systems won't allow removing bytes from the middle of a file cheaply. Probably the closest you get to the "apply this change: delete these lines" type of control is a version management system like git.

That's really just philosophy as far as your problem is concerned though - if your output must be another text file with the line removed, then you simply generate two files:

  • The new file with extracted data

  • The altered original file with data removed (written back over the top of the original)

There are options for how you deal with the original file:

  • Read all the data in, adjust in memory, and over-write the original. This is simplest, but doesn't scale to large files.

  • Read data line by line, writing each line out immediately to either a temporary altered file, or the new file. At the end of the process, delete the original old file, and move the temporary altered one into its place. This is a little more complex, but can handle larger files.
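The second, line-by-line option might look something like the sketch below. The method name `move_matching_lines`, its parameters, and the `.tmp` scratch file are hypothetical choices for illustration, and matching is done with a plain substring test, which is only one possible reading of "matching a string".

```ruby
# Hypothetical helper: move every line containing `needle` from `source`
# to `destination`, holding only one line in memory at a time.
def move_matching_lines(source, destination, needle)
  temp_path = "#{source}.tmp" # assumed scratch name next to the original
  moved = 0

  File.open(temp_path, 'w') do |kept|
    File.open(destination, 'a') do |extracted|
      File.foreach(source) do |line|
        if line.include?(needle)
          extracted.write(line) # matching line goes to the new file
          moved += 1
        else
          kept.write(line)      # everything else goes to the altered copy
        end
      end
    end
  end

  # Put the altered copy in place of the original.
  File.rename(temp_path, source)
  moved
end
```

A call such as `move_matching_lines('file.txt', 'file_moved.txt', 'some string')` would then return the number of lines moved. Swapping `include?` for a regular expression or an exact comparison is a small change if stricter matching is needed.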

answered 2013-06-08T19:40:02.410