bash - cat the first lines of a file and get the position

Question

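I have a very large file that starts with n lines of text (n < 1000), followed by one empty line and then a large amount of untyped binary data.

I want to extract the first n lines of text and then somehow get the exact byte offset at which the binary data starts.

Extracting the first lines is easy, but how can I get the offset? bash is not aware of the encoding, so simply counting characters is meaningless.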

score 5 · Accepted Answer

grep has a -b option that prints the byte offset of each match.

Example:

$ hexdump -C foo 
00000000  66 6f 6f 0a 0a 62 61 72  0a                       |foo..bar.|
00000009
$ grep -b "^$" foo 
4:
$ hexdump -s 5 -C foo
00000005  62 61 72 0a                                       |bar.|
00000009

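In the last step I use 5 instead of 4 to skip the newline of the empty line.

This also works when the file contains umlauts (äöü), because the offset is counted in bytes, not characters.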
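The steps above can be combined into a small script. This is only a sketch under a few assumptions: the sample file and variable names are made up for illustration, and the options -a (treat binary input as text), -m 1 (stop at the first match) and head -c / tail -c are taken from GNU and BSD tools rather than guaranteed by POSIX. The -a and -m 1 options are added here because the real file contains binary data, which may itself contain empty "lines" or make grep report only that a binary file matches.

```bash
# Hypothetical sample file: a text header, one empty line, then the payload.
sample=$(mktemp)
payload_file=$(mktemp)
printf 'foo\n\nbar\n' > "$sample"

# Byte offset of the first empty line.
# -a: treat binary data as text, -b: print byte offset, -m 1: stop at first match.
offset=$(LC_ALL=C grep -a -b -m 1 '^$' "$sample" | cut -d: -f1)

# The header is everything before the empty line.
header=$(head -c "$offset" "$sample")

# The payload starts one byte later (skipping the empty line's newline).
payload_start=$((offset + 1))

# tail -c +N counts from 1, so add one more to the 0-based offset.
# Redirect to a file: shell variables cannot hold NUL bytes.
tail -c "+$((payload_start + 1))" "$sample" > "$payload_file"
payload_size=$(( $(wc -c < "$payload_file") ))

printf 'offset=%s payload_start=%s payload_size=%s\n' \
    "$offset" "$payload_start" "$payload_size"

rm -f "$sample" "$payload_file"
```

For the sample file, the empty line sits at byte offset 4, so the payload begins at offset 5, matching the hexdump output shown earlier.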

score 3 · Accepted Answer

Use grep to find the line number of the empty line:

grep -n "^$" your_file | tr -d ':'

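Optionally append tail -n 1 if you want the last empty line (that is, if the top of the file can contain empty lines before the binary data starts).

Use head to get the top of the file: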

head -n "$num" your_file
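Put together, the line-number approach might look like the sketch below. The sample file and variable names are hypothetical. It adds -a and -m 1 (GNU/BSD grep options) on the assumption that the binary part could otherwise confuse grep or contribute extra matches, and it subtracts 1 so that the empty separator line itself is left out of the header.

```bash
# Hypothetical sample file: two header lines, an empty line, then a payload.
sample=$(mktemp)
printf 'line one\nline two\n\npayload\n' > "$sample"

# Line number of the first empty line.
# -a: treat binary data as text, -n: print line number, -m 1: stop at first match.
num=$(LC_ALL=C grep -a -n -m 1 '^$' "$sample" | cut -d: -f1)

# Everything before that line is the text header.
header=$(head -n "$((num - 1))" "$sample")

printf '%s\n' "$header"

rm -f "$sample"
```

Note that this yields a line count, not a byte offset, so it covers only the first half of the question.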

score 1 · Accepted Answer

You may want to use a tool such as hexdump or od, rather than bash itself, to retrieve the binary offset. Here is a reference.
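As one possible illustration, od can be used to inspect the bytes at a candidate offset and confirm that it really is where the payload begins. This is a sketch only: the sample file is made up, and the offset is assumed to be known already (5 for this sample).

```bash
# Hypothetical sample file: "foo", an empty line, then "bar".
sample=$(mktemp)
printf 'foo\n\nbar\n' > "$sample"

# Assumed to be known already: the offset of the first payload byte.
offset=5

# -A n: no address column, -t x1: one hex byte per item,
# -j: skip this many bytes, -N: dump at most this many bytes.
bytes=$(od -A n -t x1 -j "$offset" -N 3 "$sample" | tr -d ' \n')

printf 'bytes at offset %s: %s\n' "$offset" "$bytes"

rm -f "$sample"
```

Here the three bytes are 62 61 72, the ASCII codes for "bar".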

score 1 · Accepted Answer

Perl can tell you where you are in the file:

pos=$( perl -le '
    open $fh, "<", $ARGV[0]; 
    $/ = "";  # read the file in "paragraphs" 
    $first_paragraph = <$fh>; 
    print tell($fh)
' filename )

By the way, I tried to turn this into a one-liner:

pos=$( perl -00 -lne 'if ($. == 2) {print tell(___what?___); exit}' filename )

What is the "current filehandle" variable? I could not find it in the documentation.
