I'm trying to extract content from some formatted text. For example:

Input in the file:

i would like to say ("hi")

i am leaving, ("bye")

who is there? ("crazy cat")

I have a ("dirty dog that needs water")



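How can I grab only the strings inside the ("")?

I tried parsing it by spaces, or by strings containing (", but I can't get the strings that contain spaces...

Currently I'm using: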

 cat get_list.txt | tr ' ' '\n'

3 Answers

grep -o -E '\(\".*\"\)' get_list.txt

This should do it if you want to include the (" and the ") in the output.

If you don't want those, then you need the following:

sed 's/^.*(\"\(.*\)\").*$/\1/' get_list.txt

Explanation:

s/       substitute
^.*(\"   all characters from the start of the string until a (" (the " is escaped)
\(.*\)   keep the next bit in a buffer - this is the match I care about
\")      this signals that the bit I'm interested in is over
.*$      then match to the end of the line
/\1/     replace all of that with the bit I was interested in

(Note - I changed the grep and sed command in response to valid comments that a pipe wasn't necessary).
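As a hedged variation on the sed command above (not the answerer's original): the sketch below assumes each line holds at most one ("...") group and that lines without one should be dropped. It uses [^"]* instead of .* so the capture cannot run past the closing quote, and sed -n with the p flag so only lines where the substitution succeeded are printed. The sample text, including the extra non-matching line, is made up for illustration.

```shell
# Sample input modelled on the question, plus one line with no ("...") group.
input='i would like to say ("hi")
i am leaving, ("bye")
this line has no match
I have a ("dirty dog that needs water")'

# -n turns off automatic printing; the trailing p prints a line only when
# the substitution matched, so the non-matching line is skipped.
result=$(printf '%s\n' "$input" | sed -n 's/.*("\([^"]*\)").*/\1/p')
printf '%s\n' "$result"
```

Without -n and p, sed would pass the non-matching line through unchanged.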

answered 2013-03-11T21:17:06.917

Try doing this with regex look-around techniques:

$ grep -oP '\("\K[^"]+(?="\))' file.txt
hi
bye
crazy cat
dirty dog that needs water

Or a more portable solution that still uses look-around:

perl -lne 'print $& if /\("\K[^"]+(?="\))/' file.txt

Or simply:

cut -d'"' -f2 file.txt
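One caveat with the cut approach, sketched here under the assumption that some lines in the file may contain no double quote at all: by default cut prints such lines in full. The -s option suppresses lines that lack the delimiter. The sample text is invented for illustration.

```shell
# Two lines from the question plus one line with no double quote in it.
lines='i would like to say ("hi")
a line without quotes
who is there? ("crazy cat")'

# -s skips lines that do not contain the delimiter; -f2 is the text
# between the first and second double quote.
fields=$(printf '%s\n' "$lines" | cut -s -d'"' -f2)
printf '%s\n' "$fields"
```

This only checks for the quotes, not for the surrounding parentheses, so it fits input where quotes appear solely inside ("...").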
answered 2013-03-11T21:21:07.840

If you only want the text between the double quotes (without the quotes themselves), you can use awk:

awk -F\" '{print $2}' get_list.txt
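A small sketch of the same awk idea with a guard added, assuming the file may contain lines without a quoted string: splitting on " gives at least three fields only when a line has an opening and a closing quote, so the NF >= 3 condition avoids printing empty lines for the rest. The sample text is made up for illustration.

```shell
# Sample lines; the middle one has no quoted string.
sample='I have a ("dirty dog that needs water")
no quotes on this line
i am leaving, ("bye")'

# With " as the field separator, $2 is the text between the first pair of
# quotes. NF >= 3 holds only when the line contains at least two quotes.
quoted=$(printf '%s\n' "$sample" | awk -F'"' 'NF >= 3 { print $2 }')
printf '%s\n' "$quoted"
```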
answered 2013-03-11T21:21:39.723