bytearray - 数组搜索代码挑战

Question

这是我的（代码高尔夫）挑战：获取两个字节数组并确定第二个数组是否是第一个数组的子字符串。如果是，则输出第二个数组的内容出现在第一个数组中的索引。如果在第一个数组中没有找到第二个数组，则输出 -1。

示例输入：{ 63, 101, 245, 215, 0 } { 245, 215 }

预期产出：2

示例输入 2：{ 24, 55, 74, 3, 1 } { 24, 56, 74 }

预期输出 2：-1

编辑：有人指出 bool 是多余的，所以你的函数所要做的就是返回一个表示值索引的 int 或 -1 如果没有找到。

score 12 · Accepted Answer

12

常见的口齿不清：

(defun golf-code (master-seq sub-seq)
  （搜索子序列主序列））

于 2009-07-03T13:41:46.357 回答

score 8 · Accepted Answer

Ĵ

37 个字符，提供比请求更多的功能：它返回所有匹配索引的列表。

I.@(([-:#@[{.>@])"_ 0(<@}."0 _~i.@#))

用法：

   注意。给这个函数一个名字
   i =: I.@(([-:#@[{.>@])"_ 0(<@}."0 _~i.@#))
   注意。测试#1 
   245 215 i 63 101 245 215 0
2
   注意。测试 #2 - 无结果
   24 56 74 i 24 55 74 3 1

   注意。测试#3：在多个位置匹配
   1 1 i 1 1 1 2 1 1 3
0 1 4
   注意。测试#4：只有精确的子串匹配
   1 2 i 0 1 2 3 1 0 2 1 2 0
1 7

NB. list[0 to end], list[1 to end], list[2 to end], ...
<@}."0 _~i.@#

NB. Does the LHS completely match the RHS (truncated to match LHS)?
[-:#@[{.>@]

NB. boolean list of match/no match
([-:#@[{.>@])"_ 0(<@}."0 _~i.@#)

NB. indices of *true* elements
I.@(([-:#@[{.>@])"_ 0(<@}."0 _~i.@#))

score 7 · Accepted Answer

PostScript， ~~149~~ ~~146~~ ~~170~~ ~~166~~ ~~167~~ 159 个字符（在“工作”部分）：

% define data
/A [63 101 245 215 0] def
/S [245 215] def

% do the work
/d{def}def/i{ifelse}d/l S length 1 sub d/p l d[/C{dup[eq{pop -1}{dup S p
get eq{pop p 0 eq{]length}{/p p 1 sub d C}i}{p l eq{pop}if/p l d C}i}i}d
A aload pop C

% The stack now contains -1 or the position

请注意，如果子数组包含多次，则会找到子数组的最后一次出现。

修订记录：

替换falseby[[ne和trueby[[eq以保存三个字符
删除了一个错误，如果的最后一个元素S在A. 不幸的是，这个错误修正有 24 个字符。
使错误修正更便宜，节省了四个字符
必须再次插入空格，因为破折号是名称中的合法字符。这个语法错误没有被捕获，因为测试用例没有达到这一点。
停止返回布尔值，因为 OP 不再需要它们。节省 8 个字符。

解释版：

不幸的是，SO 语法高亮不知道 PostScript，所以可读性仍然有限。

/A [63 101 245 215 0] def
/S [245 215 ] def

/Slast S length 1 sub def % save the index of the last element of S,
                          % i.e. length-1
/Spos Slast def % our current position in S; this will vary
[ % put a mark on the bottom of the stack, we need this later.

/check % This function recursively removes values from the stack
       % and compares them to the values in S
{
  dup [ 
  eq
  { % we found the mark on the bottom, i.e. we have no match
    pop -1 % remove the mark and push the results
  }
  { % we're not at the mark yet
    dup % save the top value (part of the bugfix)
    S Spos get
    eq 
    {  % the top element of the stack is equal to S[Spos]
       pop % remove the saved value, we don't need it
       Spos 0
       eq 
       { % we are at the beginning of S, so the whole thing matched.
         ] length % Construct an array from the remaining values
                  % on the stack. This is the part of A before the match,
                  % so its length is equal to the position of the match.
                  % Hence we push the result and we're done.
       }
       { % we're not at the beginning of S yet, so we have to keep comparing
         /Spos Spos 1 sub def % decrease Spos
         check % recurse
       }
       ifelse
    }
    { % the top element of the stack is different from S[Spos]
      Spos Slast eq {pop} if % leave the saved top value on the stack
                             % unless we're at the end of S, because in
                             % this case, we have to compare it to the
                             % last element of S (rest of the bugfix)
      /Spos Slast def % go back to the end of S
      check % recurse
    }
    ifelse
 }
 ifelse
}
def % end of the definition of check

A aload % put the contents of A onto the stack; this will also push A again,
        % so we have to ...
pop % ...remove it again
check % And here we go!

score 6 · Accepted Answer

C99

#include <string.h>

void find_stuff(void const * const array1, const size_t array1length, /* Length in bytes, not elements */
                void const * const array2, const size_t array2length, /* Length in bytes, not elements */
                char * bReturnBool,
                int * bReturnIndex)
{
    void * found = memmem(array1, array1length, array2, array2length);
    *bReturnBool = found != NULL;
    *bReturnIndex = *bReturnBool ? found - array1 : -1;
}

简而言之，有点混乱：

#include <string.h>
#define f(a,b,c,d,e,f) { void * g = memmem(a, b, c, d); f = (e = !!g) ? g - a : -1; }

score 5 · Accepted Answer

Python 2 & 3, 73 68 58 个字符

基于~~Nikhil Chelliah的回答~~ kaiser.se的回答：

>>> t=lambda l,s:''.join(map(chr,l)).find(''.join(map(chr,s)))
>>> t([63, 101, 245, 215, 0], [245, 215])
2
>>> t([24, 55, 74, 3, 1], [24, 56, 74])
-1

Python 3, 41 36 个字符

部分感谢gnibbler：

>>> t=lambda l,s:bytes(l).find(bytes(s))
>>> t([63, 101, 245, 215, 0], [245, 215])
2
>>> t([24, 55, 74, 3, 1], [24, 56, 74])
-1

Haskell，68 64 个字符

OP 指定的参数顺序：

import List;t l s=maybe(-1)id$findIndex id$map(isPrefixOf s)$tails l

正如ehemient指出的那样，我们可以切换参数并将代码减少四个字符：

import List;t s=maybe(-1)id.findIndex id.map(isPrefixOf s).tails

score 4 · Accepted Answer

在 Python 中：

def test(large, small):
    for i in range(len(large)):
        if large[i:i+len(small)] == small:
            return i
    return -1

但是由于人们想要简洁，而不是优雅：

def f(l,s):
 for i in range(len(l)):
  if l[i:i+len(s)]==s:return i
 return -1

这是 75 个字符，包括空格。

score 4 · Accepted Answer

Ruby，使用 Array#pack（41 个字符正文）：

def bytearray_search(a,b)
  (i=b.pack('C*').index(b.pack('C*')))?i:-1
end

Perl（36 个字符正文，不包括参数处理）：

sub bytearray_search {
  ($a,$b) = @_;
  index(pack('C*',@$a),pack('C*',@$b))
}

score 3 · Accepted Answer

Python中的另一个：

def subarray(large, small):
    strsmall = ' '.join([str(c).zfill(3) for c in small])
    strlarge = ' '.join([str(c).zfill(3) for c in large])
    pos = strlarge.find(strsmall)
    return  ((pos>=0), pos//4)

score 3 · Accepted Answer

我觉得我在作弊，但是使用 Perl 可以满足 OP 的要求：

sub byte_substr {
    use bytes;
    index shift,shift
}

通常index()在 Perl 中处理具有字符语义的字符串，但是“使用字节”杂注使它使用字节段来代替。从手册页：

当“使用字节”生效时，编码暂时被忽略，每个字符串被视为一系列字节。

score 3 · Accepted Answer

红宝石 1.9 (44B)

_=->a,b{[*a.each_cons(b.size)].index(b)||-1}

p _[[63, 101, 245, 215, 0], [245, 215]]
p _[[24, 55, 74, 3, 1], [24, 56, 74]]

戈鲁比 (29B)

_=->a,b{a.e_(b.sz).dx(b)||-1}

score 2 · Accepted Answer

Python：84 个字符

def f(a,b):
 l=[a[i:i+len(b)]for i in range(len(a))]
 return b in l and l.index(b)or-1

Prolog：84 个字符（说“不”而不是返回 -1）：

s(X,[]).
s([H|T],[H|U]):-s(T,U).
f(X,Y,0):-s(X,Y).
f([_|T],Y,N):-f(T,Y,M),N is M+1.

score 2 · Accepted Answer

64个字符的Python oneliner函数定义

def f(l,s): return ''.join(map(chr,l)).find(''.join(map(chr,s)))

由于我们显式传递了一个字节数组，我们可以将其转换为 Python 的本机字节数组str并使用str.find

score 2 · Accepted Answer

Python3 36 字节

基于 Stephan202

>>> t=lambda l,s:bytes(l).find(bytes(s))
... 
>>> t([63, 101, 245, 215, 0], [245, 215])
2
>>> t([24, 55, 74, 3, 1], [24, 56, 74])
-1

score 1 · Accepted Answer

在 Python 中：

def SearchArray(input, search):
found = -1
for i in range(0, len(input) - len(search)):
    for j in range(0, len(search)):
        if input[i+j] == search[j]:
            found = i
        else:
            found = -1
            break
if  found >= 0:
    return True, found
else:
    return False, -1

去测试

print SearchArray([ 63, 101, 245, 215, 0 ], [ 245, 215 ])
print SearchArray([ 24, 55, 74, 3, 1 ], [ 24, 56, 74 ])

哪个打印：

(True, 2)
(False, -1)

请注意，有一个更短的解决方案，但它使用的 Python 语言功能并不是真正可移植的。

score 1 · Accepted Answer

在 C# 中：

private object[] test(byte[] a1, byte[] a2)
{
    string s1 = System.Text.Encoding.ASCII.GetString(a1);
    string s2 = System.Text.Encoding.ASCII.GetString(a2);
    int pos = s1.IndexOf(s2, StringComparison.Ordinal);
    return new object[] { (pos >= 0), pos };
}

使用示例：

byte[] a1 = new byte[] { 24, 55, 74, 3, 1 };
byte[] a2 = new byte[] { 24, 56, 74 };
object[] result = test(a1, a2);
Console.WriteLine("{0}, {1}", result[0], result[1]); // prints "False, -1"

score 1 · Accepted Answer

public class SubArrayMatch
{
    private bool _IsMatch;
    private int _ReturnIndex = -1;
    private List<byte> _Input;
    private List<byte> _SubArray;
    private bool _Terminate = false;
#region "Public Properties"
    public List<byte> Input {
        set { _Input = value; }
    }

    public List<byte> SubArray {
        set { _SubArray = value; }
    }

    public bool IsMatch {
        get { return _IsMatch; }
    }

    public int ReturnIndex {
        get { return _ReturnIndex; }
    }
#endregion
#region "Constructor"
    public SubArrayMatch(List<byte> parmInput, List<byte> parmSubArray)
    {
        this.Input = parmInput;
        this.SubArray = parmSubArray;
    }
#endregion
#region "Main Method"
    public void MatchSubArry()
    {
        int _MaxIndex;
        int _Index = -1;
        _MaxIndex = _Input.Count - 1;

        _IsMatch = false;

        foreach (byte itm in _Input) {
            _Index += 1;

            if (_Terminate == false) {
                if (SubMatch(_Index, _MaxIndex) == true) {
                    _ReturnIndex = _Index;
                    _IsMatch = true;
                    return;
                }
            }
            else {
                return;
            }
        }
    }

    private bool SubMatch(int BaseIndex, int MaxIndex)
    {
        int _MaxSubIndex;
        byte _cmpByte;
        int _itr = -1;

        _MaxSubIndex = _SubArray.Count - 1;
        _MaxSubIndex += 1;

        if (_MaxSubIndex > MaxIndex) {
            _Terminate = true;
            return false;
        }

        foreach (byte itm in _SubArray) {
            _itr += 1;

            _cmpByte = _Input(BaseIndex + _itr);

            if (!itm == _cmpByte) {
                return false;
            }
        }

        return true;
    }
#endregion

}

作者：Anhar Hussain Miah 编辑：Anhar.Miah @: 03/07/2009

score 1 · Accepted Answer

红宝石。不完全是世界上最短的，但很酷，因为它是 Array 的扩展。

class Array
  def contains other=[]
    index = 0
    begin
      matched = 0
      ndx = index
      while other[matched] == self[ndx]
        return index if (matched+1) == other.length
        matched += 1
        ndx += 1
      end
    end until (index+=1) == length
    -1
  end
end

puts [ 63, 101, 245, 215, 0 ].contains [245, 215]
# 2
puts [ 24, 55, 74, 3, 1 ].contains [24, 56, 74 ]
# -1

score 1 · Accepted Answer

PHP

在 105...

function a_m($h,$n){$m=strstr(join(",",$h),join(",",$n));return$m?(count($h)-substr_count($m,",")-1):-1;}

或更明确地说，

function array_match($haystack,$needle){
  $match = strstr (join(",",$haystack), join(",",$needle));
  return $match?(count($haystack)-substr_count($match,",")-1):-1;
}

score 1 · Accepted Answer

GNU C：

int memfind(const char * haystack, size_t haystack_size, const char * needle,
    size_t needle_size)
{
    const char * match = memmem(haystack, hasystack_size, needle, needle_size);
    return match ? match - haystack : -1;
}

ANSI C，没有库：

int memfind(const char * haystack, size_t haystack_size, const char * needle,
    size_t needle_size)
{
    size_t pos = 0;
    for(; pos < haystack_size; ++pos)
    {
        size_t i = 0;
        while(pos + i < haystack_size && i < needle_size &&
            haystack[pos + i] == needle[i]) ++i;

        if(i == needle_size) return pos;
    }

    return -1;
}

score 1 · Accepted Answer

C#，名为“a”和“b”的列表：

Enumerable.Range(-1, a.Count).Where(n => n == -1 
    || a.Skip(n).Take(b.Count).SequenceEqual(b)).Take(2).Last();

如果您不关心返回第一个实例，您可以这样做：

Enumerable.Range(-1, a.Count).Last(n => n == -1 
    || a.Skip(n).Take(b.Count).SequenceEqual(b));

score 1 · Accepted Answer

int m(byte[]a,int i,int y,byte[]b,int j,int z){return i<y?j<z?a[i]==b[j++]?m(a,++i,y,b,j,z):m(a,0,y,b,j,z):-1:j-y;}

Java，116 个字符。加入了一些额外的功能。好的，所以将开始条件和数组长度推送到调用者中是一个难题。像这样称呼它：

m(byte[] substring, int substart, int sublength, byte[] bigstring, int bigstart, int biglength)

score 0 · Accepted Answer

由于Fredrik已经使用 STRING 转换方式发布了代码。这是使用 C# 完成的另一种方法。

jwoolard打败了我，顺便说一句。我也使用了与他相同的算法。这是我们在大学时必须使用 C++ 解决的问题之一。

public static bool Contains(byte[] parent, byte[] child, out int index)
{
    index = -1;

    for (int i = 0; i < parent.Length - child.Length; i++)
    {
        for (int j = 0; j < child.Length; j++)
        {
            if (parent[i + j] == child[j])
                index = i;
            else
            {
                index = -1;
                break;
            }
        }
    }

    return (index >= 0);
}

score 0 · Accepted Answer

这是使用字符串比较的 C# 版本。它工作正常，但对我来说有点hacky。

int FindSubArray(byte[] super, byte[] sub)
{
    int i = BitConverter.ToString(super).IndexOf(BitConverter.ToString(sub));
    return i < 0 ? i : i / 3;
}

// 106 characters
int F(byte[]x,byte[]y){int i=BitConverter.ToString(x)
.IndexOf(BitConverter.ToString(y));return i<0?i:i/3;}

这是一个稍长的版本，它对每个单独的数组元素进行真正的比较。

int FindSubArray(byte[] super, byte[] sub)
{
    int i, j;
    for (i = super.Length - sub.Length; i >= 0; i--)
    {
        for (j = 0; j < sub.Length && super[i + j] == sub[j]; j++);
        if (j >= sub.Length) break;
    }
    return i;
}

// 135 characters
int F(byte[]x,byte[]y){int i,j;for(i=x.Length-y.Length;i>=0;i--){for
(j=0;j<y.Length&&x[i+j]==y[j];j++);if(j>=y.Length)break;}return i;}

score 0 · Accepted Answer

语言版本 v1

(defun byte-array-subseqp (subarr arr)
  (let ((found (loop 
                  for start from 0 to (- (length arr) (length subarr))
                  when (loop 
                          for item across subarr
                          for index from start below (length arr)
                          for same = (= item (aref arr index))
                          while same
                          finally (return same))
                  do (return start))))
    (values (when found t) ; "real" boolean
            (or found -1))))

Lisp v2（注意，subseq 创建一个副本

(defun byte-array-subseqp (subarr arr)
  (let* ((alength (length arr))
         (slength (length subarr))
         (found (loop 
                   for start from 0 to (- alength slength)
                   when (equalp subarr (subseq arr start (+ start slength)))
                   do (return start))))
    (values (when found t)
            (or found -1))))

score 0 · Accepted Answer

C＃：

public static object[] isSubArray(byte[] arr1, byte[] arr2) {
  int o = arr1.TakeWhile((x, i) => !arr1.Skip(i).Take(arr2.Length).SequenceEqual(arr2)).Count();
  return new object[] { o < arr1.Length, (o < arr1.Length) ? o : -1 };
}

score 0 · Accepted Answer

在红宝石中：

def subset_match(array_one, array_two)
  answer = [false, -1]
  0.upto(array_one.length - 1) do |line|
    right_hand = []
    line.upto(line + array_two.length - 1) do |inner|
      right_hand << array_one[inner]
    end
    if right_hand == array_two then answer = [true, line] end
  end
  return answer
end

示例：irb(main):151:0> subset_match([24, 55, 74, 3, 1], [24, 56, 74]) => [false, -1]

irb(main):152:0> subset_match([63, 101, 245, 215, 0], [245, 215]) => [true, 2]

score 0 · Accepted Answer

C#，适用于任何具有相等运算符的类型：

first
  .Select((index, item) => 
    first
     .Skip(index)
     .Take(second.Count())
     .SequenceEqual(second) 
    ? index : -1)
  .FirstOrDefault(i => i >= 0)
  .Select(i => i => 0 ? 
     new { Found = true, Index = i } 
    : 
     new { Found = false, Index - 1 });

score 0 · Accepted Answer

(defun golf-code (master-seq sub-seq)
  (let ((x (search sub-seq master-seq)))
    (values (not (null x)) (or x -1))))

score 0 · Accepted Answer

Haskell（114 个字符）：

import Data.List
import Data.Maybe
g a b | elem b $ subsequences a = fromJust $ elemIndex (head b) a | otherwise = -1

score 0 · Accepted Answer

Ruby，看到Lar的代码我感到很惭愧

def contains(a1, a2)
  0.upto(a1.length-a2.length) { |i| return i if a1[i, a2.length] == a2 }
  -1
end

score 0 · Accepted Answer

PHP 空气编码 285 个字符

function f($a,$b){
  if ( count($a) < 1 ) return -1;
  if ( count($b) < 1 ) return -1;
  if ( count($a) < count($b) ) return -1;

  $x = array_shift($a);
  $z = array_shift($b);

  if ($x != $z){
    $r = f( $x, array_unshift($z, $b) );
    return (-1 == $r) ? -1 : 1 + $r;
  }

  $r = f($a, $b);
  return (-1 == $r) ? -1 : 0;

}

score 0 · Accepted Answer

Golfscript - 35 个字节

如果我们计算与 J 相同，则实际函数只有32 个字节

{:b;:a,,{.a@>b,<b={}{;}if}%-1or}:f;

#Test cases (same as J)               output
[63 110 245 215 0] [245 215] f p   #  [2]
[22 55 74 3 1] [24 56 74] f p      #  -1
[1 1 1 2 1 1 3] [1 1]f p           #  [0 1 4]
[0 1 2 3 1 0 2 1 2 0] [1 2] f p    #  [1 7]

score 0 · Accepted Answer

奇怪，还没有人发布javascript..

解决方案1：

r=s=b.length;s>=0&&(r=a.indexOf(b[0]));for(x=0;x<s;)b[x]!=a[r+x++]&&(r=-1);

function f(a, b) {
    r = s = b.length;
    if (s >= 0) r = a.indexOf(b[0]);
    for (x = 0; x < s;)
    if (b[x] != a[r + x++]) r = -1;
    return r;
}

解决方案2：

r=m=-1;b.map(function(d){n=a.indexOf(d);r==m&&(c=r=n);if(n==m||c++!=n)r=m});

function f2(a, b) {
    r = m = -1;
    b.map(function (i) {
        n = a.indexOf(i);
        if (r == m) c = r = n;
        if (n == m || c++ != n) r = m;
    });
    return r;
}

score -2 · Accepted Answer

C＃

对于列表：

public static int IndexOf<T>( List<T> list1, List<T> list2 )
{
    return !list2.Except( list1 ).Any() ? list1.IndexOf( list2[0] ) : -1;
}

对于数组：

public static int IndexOf<T>( T[] arr1, T[] arr2 )
{
    return !arr2.Except( arr1 ).Any() ? Array.IndexOf( arr1, arr2[0] ) : -1;
}

bytearray - 数组搜索代码挑战

34 回答 34

Ĵ

Python 2 & 3, 73 68 58 个字符

Python 3, 41 36 个字符

Haskell，68 64 个字符

Golfscript - 35 个字节

Related

Reference