arrays - Check whether every element of an array is present in a string in bash, ignoring certain characters and order

Question

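Online I found answers for checking whether any array element is present in a string. But I want to find out whether every element of the array is present in the string.

For example, str1 = "This_is_a_big_sentence"

Initially str2 looks like

str2 = "Sentence_This_big"

Now I want to check whether the string str1 contains "sentence", "this" and "big" (all 3, ignoring order and case).

So I used arr=(${str2//_/ }). How do I proceed from here? I know the comm command finds intersections, but it needs sorted lists, and I also need to ignore the _ underscores.

I get my str2 by using a command that extracts the extension of files of a particular type: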

    for i in `ls snooze.*`; do echo $i | cut -d "." -f2; done
    # Up to here I get str2 and need to check it as described above. I'm not sure how to do this; I tried turning str2 into an array, and now I just need to check whether all elements of my array occur in str1 (ignoring case and order).

Any help would be much appreciated. I did try using this link

score 2 · Accepted Answer

Now I want to check whether string a contains "sentence", "this" and "big" (all 3, ignoring order and case)

Here is one approach:

#!/bin/bash
str1="This_is_a_big_sentence"
str2="Sentence_This_big"
if ! grep -qvwFf <(sed 's/_/\n/g' <<<${str1,,}) <(sed 's/_/\n/g' <<<${str2,,})
then
    echo "All words present"
else
    echo "Some words missing"
fi

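How it works

${str1,,} returns the string str1 with all uppercase letters replaced by lowercase ones.
sed 's/_/\n/g' <<<${str1,,} returns the string str1, converted to lowercase and with underscores replaced by newlines, so that each word is on its own line.
<(sed 's/_/\n/g' <<<${str1,,}) returns a file-like object containing all the words in str1, each lowercase and on a separate line.

The creation of file-like objects is called process substitution. In this case, it lets us treat the output of a shell command as a file to read.
<(sed 's/_/\n/g' <<<${str2,,}) does the same for str2.
Assuming file1 and file2 each have one word per line, grep -vwFf file1 file2 removes from file2 every word that occurs in file1. If no words are left, every word in file2 appears in file1.

With the -q option added, grep produces no output but sets an exit code that we can use in the if statement.

In the actual command, file1 and file2 are replaced by our file-like objects.

The remaining grep options can be understood as follows:
- -w tells grep to match whole words only.
- -F tells grep to look for fixed strings rather than regular expressions.
- -f tells grep to read the patterns to match from the file (or file-like object) that follows.
- -v tells grep to drop matching lines (the default is to keep them).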
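The steps above can be wrapped in a small function so the check is reusable. This is only a sketch: the name all_words_present is made up for illustration, it assumes bash 4 or later (for the ${var,,} expansion), and it uses tr instead of sed to split on underscores.

```shell
#!/bin/bash
# Hypothetical helper (name is an assumption, not from the answer):
# succeeds if every _-separated word of $2 also occurs in $1, ignoring case.
all_words_present() {
    local haystack=${1,,} needles=${2,,}   # lowercase both (bash 4+)
    # First file: words of the haystack, used as fixed-string patterns (-F -f).
    # Second file: words to look for. -v keeps only the words NOT found,
    # -q suppresses output, and "!" inverts the exit status.
    ! grep -qvwFf <(tr '_' '\n' <<<"$haystack") <(tr '_' '\n' <<<"$needles")
}

all_words_present "This_is_a_big_sentence" "Sentence_This_big" && echo "All words present"
all_words_present "This_is_a_big_sentence" "Sentence_This_small" || echo "Some words missing"
```

Run as written, the first call prints "All words present" and the second prints "Some words missing".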

score 0 · Accepted Answer

Here is one approach.

if [ "$(echo "This_BIG_senTence" | grep -ioE 'this|big|sentence' | wc -l)" == "3" ]; then echo "matched"; fi

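How this works. The grep option -i makes grep case-insensitive, -E enables extended regular expressions, and -o prints each match on its own line. Since the matches are now separated by line, wc -l counts them. Because we have 3 conditions, we check whether the count equals 3. Grep returns the lines where a match occurred, so if you are working with a single string, the example above returns one match per condition, 3 in this case, so there will be no problem.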
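One thing to watch with line counting is that a repeated word is counted once per occurrence, so a string such as big_BIG_big would also yield 3. A minimal sketch of a variant that counts distinct matches instead; the function name count_distinct_matches is hypothetical, and the word list is hard-coded as in the one-liner above.

```shell
#!/bin/bash
# Hypothetical helper: count how many DISTINCT words from the fixed list
# occur in $1, ignoring case.
count_distinct_matches() {
    echo "$1" \
        | grep -ioE 'this|big|sentence' \
        | tr '[:upper:]' '[:lower:]' \
        | sort -u \
        | wc -l
}

# -eq compares numerically, so padding in the wc output does not matter.
if [ "$(count_distinct_matches "This_BIG_senTence")" -eq 3 ]; then echo "matched"; fi
if [ "$(count_distinct_matches "big_BIG_big")" -eq 3 ]; then echo "matched"; else echo "not matched"; fi
```

The first check prints "matched" and the second prints "not matched". Like the original, this matches substrings, so "bigger" would still count as "big".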

Note that you can also build a chain of greps and check whether the result is empty.

if [ -n "$(echo "This_BIG_SenTence" | grep -i this | grep -i big | grep -i sentence)" ]; then echo matched; else echo not_matched; fi
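When the words live in an array, as in the question, the chain can be written as a loop. A sketch under that assumption; contains_all is a made-up name, and -F is added so the words are treated as literal strings rather than regular expressions.

```shell
#!/bin/bash
# Hypothetical helper: $1 is the string to search, the remaining
# arguments are the words that must all occur in it (case-insensitive).
contains_all() {
    local haystack=$1 word
    shift
    for word in "$@"; do
        # -q: no output, -i: ignore case, -F: literal string;
        # "--" protects against words that start with a dash.
        grep -qiF -- "$word" <<<"$haystack" || return 1
    done
    return 0
}

words=(this big sentence)
if contains_all "This_BIG_SenTence" "${words[@]}"; then echo matched; else echo not_matched; fi
```

This prints "matched". The loop stops at the first missing word, which is the same short-circuit behaviour the grep chain has.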

score 0 · Accepted Answer

Here is an awk solution that checks whether all the words of one string are present in another string:

str1="This_is_a_big_sentence"
str2="Sentence_This_big"

awk -v RS=_ 'FNR==NR{a[tolower($1)]; next} {delete a[tolower($1)]} END{print (length(a)) ? "Not all words" : "All words"}' <(echo "$str2") <(echo "$str1")

With indentation:

awk -v RS=_ 'FNR==NR {
   a[tolower($1)]; 
   next
}
{ delete a[tolower($1)] }
END {
   print (length(a)) ? "Not all words" : "All words"
}' <(echo "$str2") <(echo "$str1")

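Explanation:

-v RS=_ - use _ as the record separator
FNR==NR - run this block for str2
a[tolower($1)]; next - populate array a with each lowercase word as a key
{delete a[tolower($1)]} - for each word in str1, delete the corresponding key from array a
END - if the length of array a is still not 0, some words are left over.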
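If the result is needed in an if statement rather than as printed text, the same awk logic can report through its exit status. This is a sketch, not the answerer's code: awk_all_words and the array name need are illustrative, and the remaining keys are counted with a loop because length() on an array is not available in every awk.

```shell
#!/bin/bash
# Hypothetical wrapper: exit status 0 if every _-separated word of $2
# occurs in $1 (case-insensitive), 1 otherwise.
awk_all_words() {
    awk -v RS=_ '
        FNR == NR { need[tolower($1)]; next }   # first input: required words
        { delete need[tolower($1)] }            # second input: tick them off
        END {
            n = 0
            for (k in need) n++                 # count words never ticked off
            exit (n > 0)
        }
    ' <(echo "$2") <(echo "$1")
}

str1="This_is_a_big_sentence"
str2="Sentence_This_big"
if awk_all_words "$str1" "$str2"; then echo "All words"; else echo "Not all words"; fi
```

With the sample strings this prints "All words".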

score 0 · Accepted Answer

Here is another solution:

#!/bin/bash
str1="This_is_a_big_sentence"
str2="sentence_This_big"
var=0
var2=0

while read in
do
        if [ -n "$(echo "$str1" | grep -ioE "$in")" ]
        then
                var=$((var+1))
        fi
        var2=$((var2+1))
done < <(echo $str2 | sed -e 's/\(.*\)/\L\1/' -e 's/_/\n/g')

if [[ $var -eq $var2 && $var -ne 0 ]]
then
        echo "matched"
else
        echo "not matched"
fi

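What this script does is lowercase all of str2: sed -e 's/\(.*\)/\L\1/' replaces every character with its lowercase form (\L is a GNU sed extension). The second sed expression, sed -e 's/_/\n/g', is another substitution that replaces each underscore _ with a newline \n.

The individual words are then fed into a while loop, which checks str1 against each input word. On every match we increment var, and on every iteration of the while loop we increment var2. If var == var2, all the words of str2 were found in str1. Hope that helps.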
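The same loop can also be written without grep or sed. The sketch below is a variation rather than the answer's method: it assumes bash 4 or later for ${var,,}, splits str2 with read -a, stops at the first missing word instead of comparing two counters, and the names words and result are illustrative.

```shell
#!/bin/bash
str1="This_is_a_big_sentence"
str2="sentence_This_big"

# Split the lowercased str2 on underscores into an array.
IFS=_ read -r -a words <<<"${str2,,}"

result="matched"
for word in "${words[@]}"; do
    # Substring test against the lowercased str1; quoting "$word"
    # makes the pattern match it literally.
    if [[ ${str1,,} != *"$word"* ]]; then
        result="not matched"
        break
    fi
done
echo "$result"
```

With the sample strings this prints "matched".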

score 0 · Accepted Answer

Now I see what you mean. Try this:

#!/bin/bash

# add 4 non-matching examples
> snooze.foo_bar
> snooze.bar_go
> snooze.go_foo
> snooze.no_match

# add 3 matching examples
> snooze.foo_bar_go
> snooze.goXX_XXfoo_XXbarXX
> snooze.bar_go_foo_Ok

str1=("foo" "bar" "go")
for i in `ls snooze.*`; do
    str2=${i#snooze.}
    j=0
    found=1
    while [[ $j -lt ${#str1[@]} ]]; do
       if ! echo $str2 | eval grep \${str1[$j]} >& /dev/null; then
           found=0
           break
       fi
       ((j++))
    done
    if [[ $found -ne 0 ]]; then
        echo Match found: $str2
    fi
done

This script prints:

Match found: bar_go_foo_Ok
Match found: foo_bar_go
Match found: goXX_XXfoo_XXbarXX

Alternatively, the if..grep line above can be replaced with

if [[ ! $str2 =~  `eval echo \${str1[$j]}` ]]; then

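which uses bash's regular expression matching.

Note: I did not pay much attention to special characters in the search strings, such as "\" or " " (space), which may cause problems.

--- Some explanation ---

In the if .. grep line, $j is first expanded to the running index, which goes from 0 to the number of elements in $str1 minus 1. Then eval re-evaluates the whole grep command, which causes ${str1[jjj]} to be expanded again (here jjj is the already-expanded index).

The strategy is to set found=1 (found by default); then, as soon as any grep fails, we set found to 0 and break out of the inner j-loop.

Everything else should be straightforward.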
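For comparison, here is a sketch of the same search without eval and without parsing ls output, since "${str1[$j]}" can be expanded directly. It is an assumption-laden rewrite, not the answerer's code: match_extensions is a made-up name, the words are passed as arguments, and a literal substring test replaces grep.

```shell
#!/bin/bash
# Hypothetical helper: for each snooze.* file in directory $1, print its
# extension if it contains all of the remaining arguments as substrings.
match_extensions() {
    local dir=$1 f ext word found
    shift
    for f in "$dir"/snooze.*; do
        [[ -e $f ]] || continue            # the glob matched nothing
        ext=${f##*/snooze.}                # strip the path and "snooze."
        found=1
        for word in "$@"; do
            # Plain expansion of each word; no eval is needed.
            if [[ $ext != *"$word"* ]]; then found=0; break; fi
        done
        if (( found )); then echo "Match found: $ext"; fi
    done
    return 0
}

tmp=$(mktemp -d)
touch "$tmp"/snooze.foo_bar "$tmp"/snooze.foo_bar_go "$tmp"/snooze.no_match
match_extensions "$tmp" foo bar go
rm -r "$tmp"
```

With these three sample files, only "Match found: foo_bar_go" is printed.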
