I have the following code, which works:

string[] userSelect = new string[] {"the", "sled", "had", "not", "moved", ";", "the", "driver", "was", "surprised", "."};
string[] original = new string[] {"the", "driver", "was", "surprised", ",", "too", ";", "the", "sled", "had", "not", "moved", "."};

var matches = 
    (from l in userSelect.Select((s, i) => new { s, i })
     join r in original.Select((s, i) => new { s, i }) 
     on l.s equals r.s 
     group l by r.i - l.i into g
     from m in g.Select((l, j) => new { l.i, j = l.i - j, k = g.Key })
     group m by new { m.j, m.k } into h
     select h.Select(t => t.i).ToArray())
    .ToArray();

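// Remove overlaps: drop any match fully contained in an earlier match.
// The indexed Where overload replaces the mutable `take` counter.
var filtered = matches
    .Where((m, idx) => !matches.Take(idx)
        .Any(n => m.All(i => n.Contains(i))))
    .ToArray();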

With the code above, I get this result:

{{0,1,2,3,4}, {6,7,8,9}, {5,6}, {10}}

Note the overlap at index 6. It occurs because both {"the", "driver", "was", "surprised"} and {";", "the"} appear in the original sentence.

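For cases like this, I need a secondary filter. It should find every index value that overlaps between arrays and extract it into its own array, so that no index appears in more than one array. The output should separate the overlapping parts, like this: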

{{0,1,2,3,4}, {7,8,9}, {10}, {6}, {5}}
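One possible reading of this requirement, as a rough sketch rather than a definitive answer: count how many arrays contain each index, keep the unshared indices with their own array, and emit the shared indices once as separate arrays. `OverlapSplitter`, `SplitOverlaps`, and `Runs` are hypothetical names, not part of the code above. The sketch also assumes that consecutive indices should stay together as one run, which the question does not state.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OverlapSplitter
{
    // Hypothetical helper: splits groups so that no index appears in two arrays.
    public static int[][] SplitOverlaps(int[][] groups)
    {
        // Count how many groups contain each index.
        var counts = groups.SelectMany(g => g.Distinct())
                           .GroupBy(i => i)
                           .ToDictionary(c => c.Key, c => c.Count());

        // Indices owned by exactly one group stay with that group,
        // split into runs of consecutive values.
        var unique = groups.SelectMany(g => Runs(g.Where(i => counts[i] == 1)));

        // Indices shared by several groups are emitted once, as their own runs.
        // Assumption: adjacent shared indices are merged into a single run.
        var shared = Runs(counts.Where(c => c.Value > 1).Select(c => c.Key));

        return unique.Concat(shared).ToArray();
    }

    // Sorts the indices and yields each maximal run of consecutive values.
    private static IEnumerable<int[]> Runs(IEnumerable<int> indices)
    {
        var run = new List<int>();
        foreach (var i in indices.OrderBy(x => x))
        {
            if (run.Count > 0 && i != run[run.Count - 1] + 1)
            {
                yield return run.ToArray();
                run.Clear();
            }
            run.Add(i);
        }
        if (run.Count > 0) yield return run.ToArray();
    }
}

public static class Demo
{
    public static void Main()
    {
        var filtered = new[]
        {
            new[] { 0, 1, 2, 3, 4 }, new[] { 6, 7, 8, 9 }, new[] { 5, 6 }, new[] { 10 }
        };
        var result = OverlapSplitter.SplitOverlaps(filtered);
        // Prints: {0,1,2,3,4} {7,8,9} {5} {10} {6}
        Console.WriteLine(string.Join(" ",
            result.Select(r => "{" + string.Join(",", r) + "}")));
    }
}
```

This produces the same set of arrays as the desired output, but in a different order: the unshared pieces come first, followed by the shared ones. If the order matters, the result would need an extra sort.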