4

I have the following string in C#.

"aaa,bbbb.ccc|dddd:eee"

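Then I split it with new char[] {',','.','|',':'}. How would I rejoin this string using the same characters, in the same order as before, so that the result ends up laid out exactly as it was?

Example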

string s = "aaa,bbbb.ccc|dddd:eee";
string[] s2 = s.Split(new char[] {',','.','|',':'});
// now s2 = {"aaa", "bbbb", "ccc", "dddd", "eee"}
// let's assume I did some operation, and
// now s2 = {"xxx", "yyy", "zzz", "1111", "222"}

s = s2.MagicJoin(~~~~~~);  // I need this

// now s = "xxx,yyy.zzz|1111:222";

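EDIT

The char[] in the sample above is just a sample; in the real world the delimiters will not appear in the same order, and may not even all appear at the same time.

EDIT

Just a thought: what about using Regex.Split? First split by the char[] to get one string[], then split by everything that is not in the char[] to get another string[], and then put them back together. It might work, but I don't know how to code it.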
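
The idea in the second edit could be sketched roughly like this. It is a slight variation, not exactly what is described above: the pieces come from Regex.Split, but the delimiters are collected with Regex.Matches instead of a second split, which avoids empty entries at the ends. The names SplitRejoinSketch and Rejoin are made up for illustration, and the delimiter set is hard-coded to the four characters from the question.

```csharp
using System;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;

public static class SplitRejoinSketch
{
    // Assumption: the delimiter set is fixed to the question's four characters.
    private const string DelimiterPattern = "[,.|:]";

    public static string Rejoin(string original, Func<string[], string[]> mutate)
    {
        // Pieces between delimiters (empty pieces are kept).
        string[] pieces = Regex.Split(original, DelimiterPattern);

        // Delimiters in the order they occur in the original string.
        string[] delimiters = Regex.Matches(original, DelimiterPattern)
            .Cast<Match>()
            .Select(m => m.Value)
            .ToArray();

        string[] changed = mutate(pieces);
        if (changed.Length != pieces.Length)
            throw new ArgumentException("mutate must return the same number of pieces.");

        // Interleave: piece, delimiter, piece, delimiter, ..., last piece.
        var sb = new StringBuilder();
        for (int i = 0; i < changed.Length; i++)
        {
            sb.Append(changed[i]);
            if (i < delimiters.Length)
                sb.Append(delimiters[i]);
        }
        return sb.ToString();
    }
}
```

Called as SplitRejoinSketch.Rejoin("aaa,bbbb.ccc|dddd:eee", p => new[] { "xxx", "yyy", "zzz", "1111", "222" }), this should give back "xxx,yyy.zzz|1111:222".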

4 Answers

3

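Here you go - this works for any combination of delimiters in any order, and also allows for the case where a delimiter isn't actually found in the string. It took me a while to come up with this and, having posted it, it looks more complex than any of the other answers!

Ah well, I'll leave it here anyway.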

public static string SplitAndReJoin(string str, char[] delimiters, 
  Func<string[], string[]> mutator)
{
  //first thing to know is which of the delimiters are 
  //actually in the string, and in what order
  //Using ToArray() here to get the total count of found delimiters
  var delimitersInOrder = (from ci in
                            (from c in delimiters
                             from i in FindIndexesOfAll(str, c)
                             select new { c, i })
                          orderby ci.i
                          select ci.c).ToArray();
  if (delimitersInOrder.Length == 0)
    return str;

  //now split and mutate the string
  string[] strings = str.Split(delimiters);
  strings = mutator(strings);
  //now build a format string
  //note - this operation is much more complicated if you wish to use 
  //StringSplitOptions.RemoveEmptyEntries
  string formatStr = string.Join("",
    delimitersInOrder.Select((c, i) => string.Format("{{{0}}}", i)
      + c));
  //deals with the 'perfect' split - i.e. there's always two values
  //either side of a delimiter
  if (strings.Length > delimitersInOrder.Length)
    formatStr += string.Format("{{{0}}}", strings.Length - 1);

  return string.Format(formatStr, strings);
}

public static IEnumerable<int> FindIndexesOfAll(string str, char c)
{
  int startIndex = 0;
  int lastIndex = -1;

  while(true)
  {
    lastIndex = str.IndexOf(c, startIndex);
    if (lastIndex != -1)
    {
      yield return lastIndex;
      startIndex = lastIndex + 1;
    }
    else
      yield break;
  }
}

Here is a test you can use to verify it:

[TestMethod]
public void TestSplitAndReJoin()
{
  //note - mutator does nothing
  Assert.AreEqual("a,b", SplitAndReJoin("a,b", ",".ToCharArray(), s => s));
  //insert a 'z' in front of every sub string.
  Assert.AreEqual("zaaa,zbbbb.zccc|zdddd:zeee", SplitAndReJoin("aaa,bbbb.ccc|dddd:eee",
    ",.|:".ToCharArray(), s => s.Select(ss => "z" + ss).ToArray()));
  //re-ordering of delimiters + mutate
  Assert.AreEqual("zaaa,zbbbb.zccc|zdddd:zeee", SplitAndReJoin("aaa,bbbb.ccc|dddd:eee",
    ":|.,".ToCharArray(), s => s.Select(ss => "z" + ss).ToArray()));
  //now how about leading or trailing results?
  Assert.AreEqual("a,", SplitAndReJoin("a,", ",".ToCharArray(), s => s));
  Assert.AreEqual(",b", SplitAndReJoin(",b", ",".ToCharArray(), s => s));
}

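Note that I've assumed you need to be able to do something with the elements of the array, manipulating the individual strings before joining them back together - otherwise presumably you would just keep the original string!

The method builds a dynamic format string. No guarantees about efficiency here :)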
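
If the dynamic format string is a concern, much the same result can probably be had by walking the string once and appending to a StringBuilder. This is a rough sketch, not a drop-in replacement: RejoinSketch and SplitAndReJoinDirect are hypothetical names, and it assumes the mutator returns as many strings as it was given. A side effect is that delimiters such as '{' or '}' need no escaping, since no format string is involved.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class RejoinSketch
{
    public static string SplitAndReJoinDirect(string str, char[] delimiters,
        Func<string[], string[]> mutator)
    {
        var pieces = new List<string>();
        var found = new List<char>();   // delimiters in the order they occur
        int start = 0;

        // Single pass: record each piece and the delimiter that ended it.
        for (int i = 0; i < str.Length; i++)
        {
            if (Array.IndexOf(delimiters, str[i]) >= 0)
            {
                pieces.Add(str.Substring(start, i - start));
                found.Add(str[i]);
                start = i + 1;
            }
        }
        pieces.Add(str.Substring(start)); // final piece, possibly empty

        string[] mutated = mutator(pieces.ToArray());
        if (mutated.Length != pieces.Count)
            throw new ArgumentException("mutator must keep the number of pieces.");

        // Interleave the mutated pieces with the recorded delimiters.
        var sb = new StringBuilder();
        for (int i = 0; i < mutated.Length; i++)
        {
            sb.Append(mutated[i]);
            if (i < found.Count)
                sb.Append(found[i]);
        }
        return sb.ToString();
    }
}
```

It should pass the same assertions as the test above, for example SplitAndReJoinDirect("a,", ",".ToCharArray(), s => s) returning "a,".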

answered 2012-05-09T23:09:05.663
3

Here is MagicSplit:

public IEnumerable<Tuple<string, char?>> MagicSplit(string input, char[] split)
{
    var buffer = new StringBuilder();
    foreach (var c in input)
    {
        if (split.Contains(c))
        {
            var result = buffer.ToString();
            buffer.Clear();
            yield return Tuple.Create(result, (char?)c);
        }
        else
        {
            buffer.Append(c);
        }
    }
    // the last piece has no delimiter after it; null keeps the join from appending a stray character
    yield return Tuple.Create(buffer.ToString(), (char?)null);
}

And two flavors of MagicJoin:

public string MagicJoin(IEnumerable<Tuple<string, char?>> split)
{
    // ToString() on a null char? yields an empty string
    return split.Aggregate(new StringBuilder(), (sb, tup) => sb.Append(tup.Item1).Append(tup.Item2.ToString())).ToString();
}

public string MagicJoin(IEnumerable<string> strings, IEnumerable<char?> chars)
{
    return strings.Zip(chars, (s, c) => s + c.ToString()).Aggregate(new StringBuilder(), (sb, s) => sb.Append(s)).ToString();
}

Usage:

var s = "aaa,bbbb.ccc|dddd:eee";

// simple
var split = MagicSplit(s, new char[] {',','.','|',':'}).ToArray();
var joined = MagicJoin(split);

// if you want to change the strings
var strings = split.Select(tup => tup.Item1).ToArray();
var chars = split.Select(tup => tup.Item2).ToArray();
strings[0] = "test";
var joined2 = MagicJoin(strings, chars);
answered 2012-05-09T23:35:52.750
3

This might be easier to do with the Regex class:

input = Regex.Replace(input, @"[^,.|:]+", DoSomething);

where DoSomething is a method or lambda that transforms the item in question, e.g.:

string DoSomething(Match m)
{
    return m.Value.ToUpper();
}

For this example, the output string for "aaa,bbbb.ccc|dddd:eee" would be "AAA,BBBB.CCC|DDDD:EEE".

If you use a lambda, you can very easily keep state around, like this:

int i = 0;
Console.WriteLine(Regex.Replace("aaa,bbbb.ccc|dddd:eee", @"[^,.|:]+",
    _ => (++i).ToString()));

Output:

1,2.3|4:5

It just depends on what kind of transformation you are applying to the items.
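
For the scenario in the question, where the new values already sit in an array, the stateful lambda could simply index into that array. A minimal sketch, assuming the hypothetical names RegexReplaceSketch and ReplacePieces and the question's four delimiters; when there are more pieces than replacements, the extra pieces are left unchanged:

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexReplaceSketch
{
    public static string ReplacePieces(string input, string[] replacements)
    {
        int i = 0;
        // Each run of non-delimiter characters is swapped for the next replacement.
        return Regex.Replace(input, @"[^,.|:]+",
            m => i < replacements.Length ? replacements[i++] : m.Value);
    }
}
```

With the input "aaa,bbbb.ccc|dddd:eee" and the replacements { "xxx", "yyy", "zzz", "1111", "222" }, this should produce "xxx,yyy.zzz|1111:222". Note that the + in the pattern means empty pieces (two delimiters in a row, or a delimiter at either end) are never matched, so they do not consume a replacement.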

answered 2012-05-09T23:36:54.497
1

How about this?


var x = "aaa,bbbb.ccc|dddd:eee";

var matches = Regex.Matches(x, "(?<Value>[^\\.,|\\:]+)(?<Separator>[\\.,|\\:]?)");

var result = new StringBuilder();

foreach (Match match in matches)
{
    result.AppendFormat("{0}{1}", match.Groups["Value"], match.Groups["Separator"]);
}

Console.WriteLine(result.ToString());
Console.ReadLine();

Or, if you like LINQ (which I do):


var x = "aaa,bbbb.ccc|dddd:eee";
var matches = Regex.Matches(x, "(?<Value>[^\\.,|\\:]+)(?<Separator>[\\.,|\\:]?)");
var reassembly = matches.Cast<Match>().Aggregate(new StringBuilder(), (a, v) => a.AppendFormat("{0}{1}", v.Groups["Value"], v.Groups["Separator"])).ToString();
Console.WriteLine(reassembly);
Console.ReadLine();

Needless to say, you can do something with the parts before reassembling them, which I assume is the point of this exercise.
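
As a rough illustration of doing something with the parts first, the loop can pass each Value through a transform before appending it. MatchesSketch and Transform are made-up names, and the pattern is the one above with the redundant escapes dropped:

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

public static class MatchesSketch
{
    public static string Transform(string input, Func<string, string> transform)
    {
        var matches = Regex.Matches(input, @"(?<Value>[^.,|:]+)(?<Separator>[.,|:]?)");
        var result = new StringBuilder();
        foreach (Match match in matches)
        {
            // Transform the value, then put its original separator back.
            result.Append(transform(match.Groups["Value"].Value))
                  .Append(match.Groups["Separator"].Value);
        }
        return result.ToString();
    }
}
```

Transform("aaa,bbbb.ccc|dddd:eee", v => v.ToUpper()) should give "AAA,BBBB.CCC|DDDD:EEE". One caveat with this pattern: because Value must be at least one character long, a separator that follows an empty piece is dropped, so ",b" comes back as "b" and "a,,b" as "a,b".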

answered 2012-05-09T23:39:31.040