-1

I have a file foo.txt with the following contents:

'w3ll' 'i' '4m' 'n0t' '4sed' 't0' 

'it'

I am trying to extract every word in it that is exactly 2 characters long. That is, the output file should contain only:

4m
t0
it

Here is what I tried:

with open("foo.txt" , 'r') as foo:
    listme = foo.read()

string =  listme.strip().split("'")

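I think this splits the string on the ' character. How do I select only the strings between the apostrophes that are exactly 2 characters long?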
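
For reference, a minimal sketch of how the split approach could be carried through, assuming every word in the file is wrapped in single quotes as in the sample. After splitting on ', the quoted contents land at the odd indices of the resulting list. The helper name two_char_words is hypothetical, and the sample text is held in a string rather than read from foo.txt.

```python
def two_char_words(text):
    """Return the quoted words in text that are exactly two characters long."""
    # Splitting on ' alternates between text outside the quotes (even indices)
    # and text inside the quotes (odd indices), so keep every second piece.
    pieces = text.split("'")
    quoted = pieces[1::2]
    return [word for word in quoted if len(word) == 2]


# Sample contents copied from the question.
sample = "'w3ll' 'i' '4m' 'n0t' '4sed' 't0' \n\n'it'\n"
print(two_char_words(sample))
```

This relies on the quotes being balanced; a stray apostrophe inside a word would shift the odd/even pattern.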

4 Answers

1

This should work:

>>> with open('abc') as f, open('output.txt', 'w') as f2:
...     for line in f:
...         for word in line.split():    # split the line on whitespace
...             word = word.strip("'")   # strip the surrounding `'` from each word
...             if len(word) == 2:       # write the word only if it has exactly 2 characters
...                 f2.write(word + '\n')
...
>>> print(open('output.txt').read())
4m
t0
it

Using regex:

>>> import re
>>> with open('abc') as f, open('output.txt', 'w') as f2:
...     for line in f:
...         words = re.findall(r"'(.{2})'", line)
...         for word in words:
...             f2.write(word + '\n')
...
>>> print(open('output.txt').read())
4m
t0
it
answered 2013-07-01T13:49:34.700
1

Assuming you want to find all words enclosed in '' that are exactly two characters long:

import re
split = re.compile(r"'\w{2}'")

# Open both files in one with statement so the input file is closed as well.
with open("file", "r") as fr, open("file2", "w") as fw:
    for word in split.findall(fr.read()):
        fw.write(word.strip("'") + "\n")
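
The `\w{2}` pattern accepts only letters, digits and underscores between the quotes, whereas the `.{2}` pattern used in the other regex answer accepts any two characters except a newline. A small sketch of the difference, using a made-up input line that is not from the question, and capture groups so the quotes do not need stripping afterwards:

```python
import re

# Input line invented for illustration; it does not come from the question.
text = "'a b' '--' 'ok'"

# Any two characters (except newline) between single quotes.
any_two = re.findall(r"'(.{2})'", text)

# Exactly two word characters (letters, digits, underscore) between single quotes.
word_two = re.findall(r"'(\w{2})'", text)

print(any_two)
print(word_two)
```

Here `.{2}` also picks up `--`, while `\w{2}` returns only `ok`. For the sample file in the question both patterns give the same three words.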
answered 2013-07-01T14:48:41.850
0

Since you are reading quoted words separated by spaces (or commas), you can use the csv module:

import csv

with open('/tmp/2let.txt','r') as fin, open('/tmp/out.txt','w') as fout:
    reader=csv.reader(fin,delimiter=' ',quotechar="'")
    source=(e for line in reader for e in line)             
    for word in source:
        if len(word) == 2:    # exactly two characters, as the question asks
            print(word)
            fout.write(word+'\n')

Contents of 'out.txt':

4m
t0
it
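
To see what the csv reader actually yields for the question's input, here is a small sketch that parses the sample from an in-memory string (via `io.StringIO`, an assumption made for illustration instead of the file paths above). The trailing space on the first line produces an empty final field, and the blank line produces an empty row, so filtering on `len(word) == 2` is what keeps only the two-character words.

```python
import csv
import io

# Sample contents copied from the question, held in memory instead of a file.
sample = "'w3ll' 'i' '4m' 'n0t' '4sed' 't0' \n\n'it'\n"

reader = csv.reader(io.StringIO(sample), delimiter=' ', quotechar="'")
rows = list(reader)

# Flatten the rows and keep only words of exactly two characters.
words = [word for row in rows for word in row if len(word) == 2]

print(rows)
print(words)
```

A looser test such as `len(word) <= 2` would also let through the one-character word `i` and the empty field.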
answered 2013-07-01T15:07:29.863
0
with open("foo.txt", 'r') as infile:
    # Each token still carries its two quotes here, so a length of 4 means a 2-character word.
    words = [word.strip("'") for line in infile for word in line.split() if len(word) == 4]

with open("out", "w") as out:
    out.write('\n'.join(words) + '\n')
answered 2013-07-01T14:05:24.280