1

Does anyone know how I would find and replace text in a string? Basically I have two strings:

string firstS = "/9j/4AAQSkZJRgABAQEAYABgAAD/2wBDABQODxIPDRQSERIXFhQYHzMhHxwcHz8tLyUzSkFOTUlBSEZSXHZkUldvWEZIZoxob3p9hIWET2ORm4+AmnaBhH//2wBDARYXFx8bHzwhITx/VEhUf39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f3//";

string secondS = "abcdefg2wBDABQODxIPDRQSERIXFh/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/abcdefg";

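I want to search firstS for any sequence of characters that also appears in secondS, and then replace that sequence. The replacement should be the number of characters replaced, in square brackets:

[number of characters replaced]

For example, because firstS and secondS both contain "2wBDABQODxIPDRQSERIXFh" and "/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/", those sequences need to be replaced. So firstS becomes: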

string firstS = "/9j/4AAQSkZJRgABAQEAYABgAAD/[22]QYHzMhHxwcHz8tLyUzSkFOTUlBSEZSXHZkUldvWEZIZoxob3p9hIWET2ORm4+AmnaBhH//2wBDARYXFx8bHzwhITx/VEhUf39[61]f3//";

I hope that makes sense. I think I could do this with Regex, but I don't like how inefficient that would be. Does anyone know another, faster way?

4 Answers

3

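Does anyone know another, faster way?

Yes, this problem actually has a proper name: it is called the longest common substring problem, and it has a reasonably fast solution. The standard dynamic-programming approach runs in O(n·m) time for strings of lengths n and m.

Here is an implementation on ideone. It finds and replaces all common substrings that are ten characters or longer.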

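// Adapted from the dynamic-programming algorithm in the Wikipedia article
// on the longest common substring problem.
// Returns one longest common substring of s and t (empty if there is none).
private static string FindLcs(string s, string t) {
    // L[i,j] = length of the common suffix ending at s[i] and t[j]
    var L = new int[s.Length, t.Length];
    var z = 0;    // length of the longest match found so far
    var end = -1; // index in s where that match ends
    for (var i = 0 ; i != s.Length ; i++) {
        for (var j = 0 ; j != t.Length ; j++) {
            if (s[i] != t[j]) continue; // array elements default to 0
            L[i,j] = (i == 0 || j == 0) ? 1 : L[i-1,j-1] + 1;
            // Keep only the first longest match. Appending every match of
            // length z to a buffer would concatenate ties into a string
            // that never occurs in s, and CutLcs would then loop forever.
            if (L[i,j] > z) {
                z = L[i,j];
                end = i;
            }
        }
    }
    return z == 0 ? string.Empty : s.Substring(end - z + 1, z);
}
// With the LCS in hand, building the answer is easy
public static string CutLcs(string s, string t) {
    for (;;) {
        var lcs = FindLcs(s, t);
        if (lcs.Length < 10) break; // ignore matches shorter than ten characters
        s = s.Replace(lcs, string.Format("[{0}]", lcs.Length));
    }
    return s;
}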
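Since the question is about speed, here is a sketch of the same idea that keeps only two rows of the table instead of the full s.Length × t.Length array. The class name LcsSketch, the minLength parameter and the sample strings are made up for illustration; the time complexity is unchanged, only the memory use drops.

```csharp
using System;

public static class LcsSketch
{
    // Returns one longest common substring of s and t, or "" if they share nothing.
    public static string FindLcs(string s, string t)
    {
        var prev = new int[t.Length + 1]; // previous row, shifted by one column
        var curr = new int[t.Length + 1]; // row being filled; index 0 always stays 0
        int best = 0, end = 0;            // best length and exclusive end index in s
        for (int i = 0; i < s.Length; i++)
        {
            for (int j = 0; j < t.Length; j++)
            {
                // Extend the match that ended at s[i-1], t[j-1], or reset to 0.
                curr[j + 1] = s[i] == t[j] ? prev[j] + 1 : 0;
                if (curr[j + 1] > best) { best = curr[j + 1]; end = i + 1; }
            }
            var tmp = prev; prev = curr; curr = tmp; // reuse the two buffers
        }
        return s.Substring(end - best, best);
    }

    // Repeatedly replaces the longest common substring with "[length]".
    // minLength must be at least 4 so that every replacement shortens s,
    // which guarantees that the loop terminates.
    public static string CutLcs(string s, string t, int minLength)
    {
        if (minLength < 4) throw new ArgumentOutOfRangeException(nameof(minLength));
        while (true)
        {
            string lcs = FindLcs(s, t);
            if (lcs.Length < minLength) break;
            s = s.Replace(lcs, "[" + lcs.Length + "]");
        }
        return s;
    }

    public static void Main()
    {
        // Hypothetical sample input, not the strings from the question.
        Console.WriteLine(CutLcs("xxHELLOWORLD12yy", "aaHELLOWORLD12bb", 10)); // xx[12]yy
    }
}
```

Each pass still costs O(n·m), so for very long inputs a suffix-based structure may be worth considering instead.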
answered 2012-08-29T00:51:44.577
0

I had a similar problem, but for word occurrences! So I hope this helps. I used a SortedDictionary, which is backed by a binary search tree.

/* Application counts the number of occurrences of each word in a string
   and stores them in a generic sorted dictionary. */
using System;
using System.Text.RegularExpressions;
using System.Collections.Generic;

public class SortedDictionaryTest
{
   public static void Main( string[] args )
   {
      // create sorted dictionary
      SortedDictionary< string, int > dictionary = CollectWords();

      // display sorted dictionary content
      DisplayDictionary( dictionary );
   } 

   // create sorted dictionary 
   private static SortedDictionary< string, int > CollectWords()
   {
      // create a new sorted dictionary
      SortedDictionary< string, int > dictionary =
         new SortedDictionary< string, int >();

      Console.WriteLine( "Enter a string: " ); // prompt for user input
      string input = Console.ReadLine(); 

      // split input text into tokens
      string[] words = Regex.Split( input, @"\s+" );

      // processing input words
      foreach ( var word in words )
      {
         string wordKey = word.ToLower(); // get word in lowercase

         // if the dictionary contains the word
         if ( dictionary.ContainsKey( wordKey ) )
         {
            ++dictionary[ wordKey ];
         } 
         else
            // add new word with a count of 1 to the dictionary
            dictionary.Add( wordKey, 1 );
      } 

      return dictionary;
   } 

   // display dictionary content
   private static void DisplayDictionary< K, V >(
      SortedDictionary< K, V > dictionary )
   {
      Console.WriteLine( "\nSorted dictionary contains:\n{0,-12}{1,-12}",
         "Key:", "Value:" );

      /* generate output for each key in the sorted dictionary
        by iterating through the Keys property with a foreach statement*/
      foreach ( K key in dictionary.Keys )
         Console.WriteLine( "{0,-12}{1,-12}", key, dictionary[ key ] ); // "{0,- 12}" would throw a FormatException

      Console.WriteLine( "\nsize: {0}", dictionary.Count );
   } 
} 
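The counting loop above looks each word up twice (ContainsKey, then the indexer). A more compact sketch of the same idea uses TryGetValue. The class name WordCountSketch and the sample sentence are made up for illustration, and unlike the program above it takes the text as a parameter instead of reading it from the console.

```csharp
using System;
using System.Collections.Generic;

public static class WordCountSketch
{
    // Counts case-insensitive word occurrences; keys come back in sorted order.
    public static SortedDictionary<string, int> CountWords(string input)
    {
        var counts = new SortedDictionary<string, int>(StringComparer.Ordinal);
        // A null separator array splits on whitespace; empty tokens are dropped.
        var words = input.ToLowerInvariant()
            .Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
        foreach (var word in words)
        {
            int current;
            counts.TryGetValue(word, out current); // current is 0 when the key is absent
            counts[word] = current + 1;
        }
        return counts;
    }

    public static void Main()
    {
        foreach (var pair in CountWords("the cat and The dog"))
            Console.WriteLine("{0,-12}{1,-12}", pair.Key, pair.Value);
    }
}
```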
answered 2012-08-29T01:28:56.063
0

This may be slow, but if you are willing to take on some technical debt and need something right now for prototyping, you can use LINQ.

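// Requires: using System.Linq;
string firstS = "123abc";
string secondS = "456cdeabc123";
int minLength = 3;

// For each minLength-character window of firstS that also occurs in secondS,
// produce a copy of secondS with that window replaced by "[minLength]".
var result =
    from start in Enumerable.Range(0, firstS.Length)
    where firstS.Length - start >= minLength
    let subStr = firstS.Substring(start, minLength)
    where secondS.Contains(subStr)
    select secondS.Replace(subStr, "[" + subStr.Length + "]");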

The result is

 456cdeabc[3] 
 456cde[3]123 
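For reference, the same query can be wrapped in a complete program using method syntax. This is only a sketch: the names LinqSketch and ReplaceWindows are made up, and the Distinct call is an addition so that a window repeated in the first string does not produce the same row twice.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LinqSketch
{
    // For every distinct window of `length` characters in `source` that also
    // occurs in `target`, returns `target` with that window replaced by "[length]".
    public static List<string> ReplaceWindows(string source, string target, int length)
    {
        if (length <= 0) throw new ArgumentOutOfRangeException(nameof(length));
        return Enumerable.Range(0, Math.Max(0, source.Length - length + 1))
            .Select(start => source.Substring(start, length))
            .Distinct()
            .Where(window => target.Contains(window))
            .Select(window => target.Replace(window, "[" + length + "]"))
            .ToList();
    }

    public static void Main()
    {
        foreach (var line in ReplaceWindows("123abc", "456cdeabc123", 3))
            Console.WriteLine(line);
    }
}
```

Like the query above, this only considers windows of exactly one length and applies each replacement to a fresh copy of the target, so it is a prototype rather than a full answer to the question.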
answered 2012-08-29T03:36:26.690