I am trying to do something very simple, but solving it the way I want would help me with many other commands as well.

I want to read a file line by line in UNIX and run a command on each line, in this case a character count. For an entire file, I would just use:

wc -m

However, I want this per line of input. What is the simplest, shortest way to stream a file line by line for manipulation by UNIX commands? I ask because in this situation I want wc -m per line, but future applications will use completely different commands.

Also, I want to avoid perl and awk! I already know how to do this with those tools, but am looking for alternate methods.

Thanks!

EDIT: Thanks for the link to the other question, but after looking at their 4 answers, I don't see a solution to my exact quandary.

Given the following input:

cat test.txt
    This is the first line.
    This is the second, longer line.
    This is short.
    My Final line that is much longer than the first couple of lines.

I want to plug it through some code that will read it line by line and perform a command on each line, immediately returning the result.

Some code which does wc -m on each line and returns the output:

23
32
14
65

Or some code which does cut -d " " -f 1 on each line and returns the output:

This
This
This
My

Hopefully this makes things a bit clearer. Thanks again for any suggestions!

2 Answers


You can use echo "${#line}" to get the length of a string. Reading the file with a while read ... loop does the rest:

$ cat file
hello
my name
is fedor
qui


$ while read line; do echo "${#line}"; done < file
5
7
8
3
0

In a more readable format:

while read line
do
   echo "${#line}"
done < file
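
As a minimal sketch, the same loop can be wrapped in a function and run on two lines of the sample input from the question. The name line_lengths is hypothetical, and the IFS= and -r additions are assumptions made here so that indentation and backslashes are counted too. Note that ${#line} counts characters according to the current locale.

```shell
# Hypothetical helper: print the length of every line read from stdin.
line_lengths() {
    # IFS= keeps leading/trailing whitespace, -r keeps backslashes, and the
    # "|| [ -n ... ]" part handles a last line with no trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        echo "${#line}"
    done
}

# Two lines taken from the question's test.txt; prints 23, then 14.
printf '%s\n' 'This is the first line.' 'This is short.' | line_lengths
```

No external command is started per line, so this stays fast on large files, but it only covers the length case and not arbitrary commands.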
answered 2013-06-21T15:43:58.963

The best choice for line-by-line processing is a while read loop, although the idiom for preserving each line exactly is:

while IFS= read -r line; do
    # process "$line"
done

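Without IFS=, leading and trailing whitespace is lost. Without read -r, some backslash sequences are interpreted by bash and are not preserved verbatim in the variable.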
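
The difference can be seen with a small sketch. The sample line (two leading spaces and a literal backslash) is made up for illustration, and a quoted here-document is used only to get that text into read unchanged.

```shell
# Plain read: the leading spaces are trimmed and the backslash is consumed.
read line <<'EOF'
  a\tb
EOF
printf '[%s]\n' "$line"        # prints [atb]

# IFS= read -r: the line is stored exactly as written.
IFS= read -r verbatim <<'EOF'
  a\tb
EOF
printf '[%s]\n' "$verbatim"    # prints [  a\tb]
```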

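I think your quandary can be restated as:

I have a line of text. How do I treat it as a file?

bash has 2 features that answer this:

  1. For commands that can read from standard input, such as wc, use a here-string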

    wc -m <<< "$line"
    
  2. For commands that require a file (I can't think of one offhand), use process substitution

    wc -m <(echo "$line")
    

Example:

$ line="foo bar baz"
$ wc -m <<<"$line"
12
$ wc -m <(echo "$line")
12 /dev/fd/63

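P.S. I noticed that the character count includes the implicit trailing newline. To drop it, use printf with a format string that contains no newline: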

$ wc -m <(printf %s "$line")
11 /dev/fd/63
$ wc -m < <(printf %s "$line")
11
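
Putting the loop and the printf trick together gives one possible answer to the original question. This is only a sketch: per_line is a hypothetical helper name, and it feeds each line through a pipe from printf rather than a here-string so that the trailing newline is not counted.

```shell
# Hypothetical helper: run the given command once per input line, passing
# that line (without its trailing newline) on the command's standard input.
per_line() {
    while IFS= read -r line || [ -n "$line" ]; do
        printf %s "$line" | "$@"
    done
}

# Character count per line for two lines of the question's test.txt.
# Expect 23 and 14; some wc implementations pad the number with spaces.
printf '%s\n' 'This is the first line.' 'This is short.' | per_line wc -m
```

Any other filter can be substituted, for example per_line cut -d ' ' -f 1 < test.txt. Commands that expect newline-terminated input may behave better with printf '%s\n' inside the helper. Because one process is started per line, this is convenient rather than fast.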
answered 2013-06-21T18:18:19.100