1

我想建立一个算法,它能够给出两个数组中元素的重叠索引。

例如我有两个字符串数组

Array1: {"a","c","h", "d","a","o","m" }

Array2: { "d","a","o","m" ,"c","e","o","m","c","z","e","l" ,"p","v","c","z","c"}

它应该以这种方式返回 array1 和 array2 的索引

x1,y1={3,6} *x2,y2={0,3}*

这意味着数组中字符串的序列应该匹配,只要字符串值匹配,我们就需要这样做,并确保每条记录在它们之前的记录匹配中都是唯一的。

等待答案和回复,如果有问题,请告诉我。

举个例子,我在数据库中有一个表,我在其中插入记录,并且我要插入的每条记录都应该是非常独特的。所以在插入时,我们确实有一个需要批量插入的记录数组。因此,如果我说,我确实有数据库表中的记录格式为:

1 哈希 1 哈希 2 哈希 3 哈希 4 哈希 5

我想在数据库中插入一堆记录,格式为: hash3 hash4 hash5 hash6 hash7 hash8

那么表格的结果应该看起来像这样 colum1 hash1 hash2 hash3 hash4 hash5 hash6 hash7 hash8

但是,如果需要插入数据库的数组的格式为 hash2 hash3 hash4 hash6

那么它应该被视为一个全新的条目,并将作为一个整体插入到数据库中。

希望这次我清楚阐述我的问题

4

1 回答 1

2

The problem you're referring to is called the Longest Common Substring problem (not to be confused with another problem about strings and an LCS acronym - longest common subsequence). As usual, the best explanation is in Wikipedia: http://en.wikipedia.org/wiki/Longest_common_substring_problem :)

In short - if the strings are really short and you're a very lazy programmer, the fastest way of doing that is to check all substrings of arr2 against arr1. This will take about n^2*m time (if arr2 if of length n, and arr1 is of length m) - which is a lot of time for long strings.

If your strings are longer / you are less lazy, the best algorithm using suffix trees will give you linear running time.

于 2012-05-01T14:52:00.950 回答