
I have a C program that performs string compression successfully, using a brute-force approach. For example, if the input is aabccccdddddddddddaa, the output should be a2b1c3d11a2.

I solved this in C by taking each character, counting the number of its consecutive occurrences, and then printing that character followed by its count.

I am trying to convert this to C#. I suspect it should be easy in C# because of its many built-in string and char methods.

Is there a way to do this in C# in very few lines, using lambda expressions or the built-in methods of the string or char types?

My C code is:

    char *encode(char *src)
    {
        int recurringLen;
        char count[MAX_RLEN];
        char *dest;
        int i, j = 0, k;
        int len = strlen(src);

        // _itoa_s(34,c,10);
        /* If all characters in the source string are different,
           then the size of the destination string is twice that of the input string.
           For example, if src is "abcd", then dest would be "a1b1c1d1".
           For other inputs, the size would be less than twice.
           Test for scenarios such as abababababababababababa, because the output here is a16b11,
           and aabbbcccd.
         */
        dest = (char *)malloc(sizeof(char) * (len * 2 + 1));

        /* Traverse the input string one character at a time */
        for (i = 0; i < len; i++)
        {
            /* Copy the first occurrence of the new character */
            dest[j++] = src[i];

            /* Count the number of occurrences of the new character */
            recurringLen = 1;
            while (i + 1 < len && src[i] == src[i + 1])
            {
                recurringLen++;
                i++;
            }

            /* Store recurringLen in the character array count[];
               sprintf_s takes the buffer size as its second argument */
            sprintf_s(count, sizeof(count), "%d", recurringLen);

            /* Copy count[] to the destination */
            for (k = 0; *(count + k); k++, j++)
            {
                dest[j] = count[k];
            }
        }

        /* Terminate the destination string */
        dest[j] = '\0';
        return dest;
    }
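
The sizing comment in the code above says that `len*2 + 1` bytes are always enough. A minimal sketch to check that reasoning, assuming two hypothetical helpers, `digit_count` and `encoded_length` (neither is part of the original code), that compute the exact output size without allocating anything:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: number of decimal digits in n. */
static size_t digit_count(size_t n)
{
    size_t digits = 0;
    do {
        digits++;
        n /= 10;
    } while (n > 0);
    return digits;
}

/* Hypothetical helper: exact length (excluding the terminator) of the
   run-length encoding of src, i.e. one character plus its decimal count
   for every run of equal characters. */
static size_t encoded_length(const char *src)
{
    assert(src != NULL);
    size_t len = strlen(src);
    size_t total = 0;
    size_t i = 0;

    while (i < len) {
        size_t run = 1;
        while (i + run < len && src[i + run] == src[i])
            run++;
        total += 1 + digit_count(run);  /* character + digits of its count */
        i += run;
    }
    return total;
}
```

For "abcd" this gives 8, exactly twice the input length, which is the worst case because every run of length 1 costs two bytes. Longer runs cost proportionally less: "aabccccdddddddddddaa" (20 characters) needs only 11.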

2 Answers


You can do this in a LINQ-like way by writing an extension method, GroupSeqsBy:

string input = "aabccccdddddddddddaa";
var s = String.Join("", input.GroupSeqsBy(c => c)
                             .Select(g => g.Key.ToString() + g.Value.Count()));

// Groups consecutive items that share a key into (key, items) pairs.
// As an extension method, this must be declared inside a static class.
public static IEnumerable<KeyValuePair<S, IEnumerable<T>>> GroupSeqsBy<T, S>(this IEnumerable<T> list, Func<T, S> keySelector)
{
    List<T> retList = new List<T>();
    S prev = keySelector(list.FirstOrDefault());
    foreach (T item in list)
    {
        if (keySelector(item).Equals(prev))
            retList.Add(item);  // same key: extend the current run
        else
        {
            // Key changed: emit the finished run and start a new one.
            yield return new KeyValuePair<S, IEnumerable<T>>(prev, retList);
            prev = keySelector(item);
            retList = new List<T>();
            retList.Add(item);
        }
    }
    // Emit the final run, if there is one.
    if (retList.Count > 0)
        yield return new KeyValuePair<S, IEnumerable<T>>(prev, retList);
}
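
For readers following the C version from the question, the same two-step idea (split the input into runs first, then format each run) can be sketched in C. The `Run` struct and the `split_runs` function are hypothetical names introduced for illustration; they appear in neither the question nor this answer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical type: one run of consecutive equal characters. */
typedef struct {
    char   key;    /* the repeated character */
    size_t count;  /* how many times it repeats in a row */
} Run;

/* Hypothetical helper: splits src into runs of consecutive equal characters.
   Returns a malloc'd array (caller frees) and stores its length in *n_runs.
   Returns NULL if allocation fails; an empty input yields zero runs. */
static Run *split_runs(const char *src, size_t *n_runs)
{
    assert(src != NULL && n_runs != NULL);
    size_t len = strlen(src);
    /* At most one run per input character; +1 avoids a zero-sized malloc. */
    Run *runs = malloc((len + 1) * sizeof *runs);
    size_t n = 0;

    *n_runs = 0;
    if (runs == NULL)
        return NULL;

    for (size_t i = 0; i < len; i++) {
        if (n > 0 && runs[n - 1].key == src[i]) {
            runs[n - 1].count++;        /* same key: extend the current run */
        } else {
            runs[n].key = src[i];       /* key changed: start a new run */
            runs[n].count = 1;
            n++;
        }
    }
    *n_runs = n;
    return runs;
}
```

Each `Run` plays the role of one key/value pair yielded by GroupSeqsBy; printing `key` followed by `count` for every element reproduces the compressed string.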
Answered 2013-09-08T11:42:59.487

You can do this with a regular expression (assuming your example has a typo, where c3 should be c4):

static readonly Regex re = new Regex( @"(.)\1*", RegexOptions.Compiled );               
static void Main()
{
    string result = re.Replace( "aabccccdddddddddddaa", match => match.Value[0] + match.Length.ToString() );                        
    Console.WriteLine( result );
}

The output is:

a2b1c4d11a2

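Basically, we search for any character followed by zero or more repetitions of itself, then replace each match with that character followed by the length of the matched string.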

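Specifically:

  • . matches any character (except \n).
  • (.) the parentheses form a capturing group.
  • \1 is a numbered backreference to that group; it reuses the character that was already matched.
  • * is a repetition operator meaning "match 0 or more times". We could also use {0,}.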
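
To make the matching behaviour concrete, here is a hand-rolled C sketch that mimics what `(.)\1*` plus the replacement does, without using any regex library. `regex_like_compress` is a hypothetical name, and the sketch assumes the default .NET behaviour described in the list above, in which `.` does not match `\n`, so newlines are copied through unchanged.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper mimicking Regex.Replace with the pattern (.)\1*:
   each run of a repeated character becomes the character followed by the
   run length. '\n' is never matched by '.', so it is copied as-is.
   Returns a malloc'd string (caller frees), or NULL on allocation failure. */
static char *regex_like_compress(const char *src)
{
    assert(src != NULL);
    size_t len = strlen(src);
    /* Each run of n characters needs at most 2*n bytes, plus the terminator. */
    size_t cap = 2 * len + 1;
    char *dest = malloc(cap);
    size_t i = 0, j = 0;

    if (dest == NULL)
        return NULL;

    while (i < len) {
        if (src[i] == '\n') {           /* '.' does not match a newline */
            dest[j++] = src[i++];
            continue;
        }
        size_t run = 1;                 /* (.) consumes one character */
        while (i + run < len && src[i + run] == src[i])
            run++;                      /* \1* consumes its repeats */
        dest[j++] = src[i];             /* match.Value[0] */
        j += (size_t)snprintf(dest + j, cap - j, "%zu", run);  /* match.Length */
        i += run;
    }
    dest[j] = '\0';
    return dest;
}
```

One consequence worth noting: unlike the C code in the question, which encodes every character, this behaviour leaves line breaks in the input untouched, so "aa\nb" becomes "a2\nb1".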

Some useful links: Grouping | Repetition | C# Regex | C# match delegate | C# Regex quick reference | C# backreferences

If you want this to be an extension method on string (I'm not sure whether that is a requirement), then:

public static class StringExtensions
{ 
    static readonly Regex re = new Regex( @"(.)\1*", RegexOptions.Compiled );                
    public static string Compress(this string theString)
    {
        return re.Replace( theString, match => match.Value[0] + match.Length.ToString() );             
    }
}

Use it as follows:

string theString = "aabccccdddddddddddaa";
string result = theString.Compress();
Answered 2013-09-08T13:58:59.553