bash - Count the number of fields stored in a variable

Question

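I am working on a basic file carver and am currently stuck on calculating the byte position of a file.

I have worked out that I need a piece of code that performs the following steps:

Find $searchQuery in the variable
Delete the rest of the string after $searchQuery
Count the number of fields that are now in the variable
Subtract 2 from this count to account for the hex offset and $searchQuery itself
Then multiply the result by 2 to get the correct number of bytes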

An example of this would be:

Find "ffd8" in "00052a0: b4f1 559c ffd8 ffe0 0010 4a46 4946 0001"
The variable is updated to "00052a0: b4f1 559c ffd8"
$fieldCount is assigned the value "4"
fieldCount=$((fieldCount-2))
byteCount=$((fieldCount*2))

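I have a basic idea of how to do everything except count the number of fields in a variable. For example, how do I count how many fields come before $searchQuery in the variable? Likewise, once I have removed the unnecessary part of the string, how do I count the number of fields?

After finding $searchString with grep, I don't know how to proceed. My current code looks like this: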

#!/bin/bash
#***************************************************************
#Name:          fileCarver.sh
#Purpose:       Extracts files hidden within other files
#Author:        
#Date Written:      12/01/2013
#Last Updated:      12/01/2013
#***************************************************************

clear

#Request user input
printf "Please enter the input file name: "
read -r inputFile
printf "Please enter the search string: "
read -r searchString

#Search for the required string
searchFunction()
{
    #Search for required string and remove unnecessary characters
    startHexOffset=$(xxd "$1" | grep -- "$2" | cut -d":" -f 1)
    #Convert the Hex Offset to Decimal
    startDecOffset=$(echo "ibase=16;${startHexOffset^^}" | bc)
}

searchFunction "$inputFile" "$searchString"


exit 0

Thanks for your help!

score 0 · Accepted Answer

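You may find this easier if you convert the file to hex in a simpler format. For example, you can use the command

hexdump -v -e '/1 "%02x "' "$FILE"

to print the file with every byte converted to three characters: two hex digits and a space.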

You can then find all instances of ffd8, each prefixed with its byte offset:

hexdump -v -e '/1 "%02x "' "$FILE" | grep -Fbo 'ff d8 '

(The byte offset needs to be divided by 3, since each byte takes up three characters of output.)
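To make the arithmetic concrete, here is a small sketch. It feeds grep a hand-written string (`hexText`) standing in for real hexdump output, so the variable names and sample bytes are assumptions of mine and not part of the original command. It still needs GNU grep for -b.

```bash
# Stand-in for hexdump output: the bytes b4 f1 55 9c ff d8 ff e0 as text.
hexText='b4 f1 55 9c ff d8 ff e0 '

# With -o, GNU grep's -b reports where the match starts within the hex text.
textOffset=$(printf '%s' "$hexText" | grep -Fbo 'ff d8 ' | head -n1 | cut -f1 -d:)

# Three text characters per byte, so divide to get the 0-based byte offset.
byteOffset=$(( textOffset / 3 ))

# tail -c+N numbers bytes from 1, so add 1 before handing the value to tail.
tailStart=$(( byteOffset + 1 ))

echo "$textOffset $byteOffset $tailStart"   # prints: 12 4 5
```

Four bytes precede the marker, which matches the byte count worked out by hand in the question.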

So you can stream the whole file starting from the first instance of ffd8 with:

tail -c+$((
  $(hexdump -v -e '/1 "%02x "' "$FILE" | grep -Fbo 'ff d8 ' | head -n1 | cut -f1 -d:)
  / 3 + 1)) "$FILE"

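(That assumes whatever you use to display the file knows to stop when it reaches the end of the image. But you could equally find the last end marker.)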
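If you do want to cut at the last end marker, a rough sketch might look like the following. It assumes the embedded file is a JPEG, whose end-of-image marker is ff d9. As before, `hexText` is a hand-written stand-in for hexdump output, and `startByte`, `endByte` and `length` are names I made up for illustration.

```bash
# Stand-in for hexdump output: one junk byte, a tiny "image", one junk byte.
hexText='00 ff d8 ff e0 aa ff d9 00 '

# First start marker (head -n1) and last end marker (tail -n1), as 0-based bytes.
startByte=$(( $(printf '%s' "$hexText" | grep -Fbo 'ff d8 ' | head -n1 | cut -f1 -d:) / 3 ))
endByte=$(( $(printf '%s' "$hexText" | grep -Fbo 'ff d9 ' | tail -n1 | cut -f1 -d:) / 3 ))

# The end marker is 2 bytes long and belongs to the image, hence the + 2.
length=$(( endByte + 2 - startByte ))

echo "$startByte $endByte $length"   # prints: 1 6 7

# With a real file, the carve itself could then be something like:
#   tail -c+$((startByte + 1)) "$FILE" | head -c "$length"
```

Taking the last ff d9 is only a heuristic: if the file contains several images, or the byte pair shows up by chance after the image, the cut will land in the wrong place.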

This depends on GNU grep; standard POSIX grep lacks the -b option. However, it can be done with awk:

tail -c+$(
    hexdump -v -e '/1 "%02x\n"' "$FILE" |
    awk '/d8/&&p=="ff"{print NR-1;exit}{p=$1}'
  ) "$FILE"
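To see why the awk version needs neither the division by 3 nor the + 1, here is the same awk program run on a few hand-typed lines in place of real hexdump output (the sample bytes and the `startByte` name are mine):

```bash
# One byte per line, as hexdump -e '/1 "%02x\n"' would print b4 f1 55 9c ff d8.
startByte=$(printf '%s\n' b4 f1 55 9c ff d8 |
    awk '/d8/ && p == "ff" { print NR - 1; exit } { p = $1 }')

echo "$startByte"   # prints: 5
```

The d8 is on line 6, so NR - 1 is the line holding the ff. With one byte per line, that line number is already the 1-based byte number that tail -c+N expects.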

Explanation of the options:

tail    -c+N    file starting at byte number N (first byte is number 1)

hexdump -v      do not compress repeated lines on output
        -e 'FORMAT'  use indicated format for output:
            /1       each format consumes 1 byte
            "%02x "  output two hex digits, including leading 0, using lower case,
                     followed by a space.

grep    -F      pattern is just plain characters, not a regular expression
        -b      print the (0-based) byte offset of the... 
        -o      ... match instead of the line containing the match

cut     -f1     output the first field of each line
        -d:     fields are separated by :

score 0 · Accepted Answer

Try:

echo "00052a0: b4f1 559c ffd8 ffe0 0010 4a46 4946 0001"| awk '
{
for (a=1;a<=NF; a++) {
    if ($a == "ffd8") {
        # awk strings are 1-based: keep everything up to the end of the match
        print substr($0,1,index($0,$a)+length($a)-1)
        break
        }
    }
}'

Output: 00052a0: b4f1 559c ffd8
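The same loop can also give you the byte count directly, because the index of the matching field is exactly the field count you would get after trimming the line. A minimal sketch, with a made-up helper name `bytes_before_marker`, assuming the search string lines up with one of xxd's 4-digit groups:

```bash
# Hypothetical helper: print how many bytes precede the first field equal
# to $2 on the xxd-style line $1. Prints nothing if no field matches.
bytes_before_marker() {
    printf '%s\n' "$1" | awk -v query="$2" '
    {
        for (i = 1; i <= NF; i++) {
            # Appending "" forces a string comparison, so a group such as
            # 0e10 is never compared as a number.
            if (($i "") == (query "")) {
                # i fields remain after trimming; drop the offset column and
                # the match itself, then count 2 bytes per remaining group.
                print (i - 2) * 2
                exit
            }
        }
    }'
}

bytes_before_marker "00052a0: b4f1 559c ffd8 ffe0 0010 4a46 4946 0001" "ffd8"   # prints: 4
```

This will not find a marker that straddles two groups (for example "..ff d8..") or two lines of xxd output, which is one reason the byte-per-field hexdump format is easier to work with.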
