2

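We have tried several solutions that rely on XML parsers. All of them failed, because our strings are not always 100% valid XML. That is our problem.

Our strings look like this: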

var a = "this is a testxxx of my data yxxx and of these xxx parts yxxx";
var b = "hello testxxx world yxxx ";

The results we need are:

"this is a testxxx3yxxx and of these xxx1yxxx";
"hello testxxx1yxxx ";

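The key point is that we want to do something with the data between xxx and yxxx. In the example above, I need a function that counts the words and replaces that part of the string with the word count.

Is there a way to process string a and apply a function that changes the data between xxx and yxxx? Any function will do for now, because we just want to understand how to write the code.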


6 Answers

3

If the delimiters are always xxx and yxxx, you can use a regular expression, as suggested.

var stringBuilder = new StringBuilder();
Regex regex = new Regex("xxx(.*?)yxxx");
var matches = regex.Matches(a);

foreach (Match match in matches)
{
    // Group 1 holds the text between "xxx" and "yxxx"
    var value = match.Groups[1].Value;

    // do something to value and then append it to string builder
    // (note: only the delimited segments are appended here, not the text around them)

    stringBuilder.Append(string.Format("{0}{1}{2}", "xxx", value, "yxxx"));
}

I think this is the most basic approach.

answered 2012-10-05T08:07:05.800
3

You can use the Split method:

// For well-formed input, splitting on both delimiters leaves the text
// between "xxx" and "yxxx" at the odd indices.
var parts = a.Split(new[] {"xxx", "yxxx"}, StringSplitOptions.None)
             .Select((s, index) =>
                 index % 2 == 1
                     ? string.Format("{0}{1}{2}", "xxx", s + "1", "yxxx") // process s here
                     : s);

var result = string.Join("", parts);
answered 2012-10-05T08:24:27.137
1

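The IndexOf() function returns the index of the first occurrence of a given substring.

(My indices may be slightly off, but) I suggest doing something like this: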

var searchme = "this is a testxxx of my data yxxx and there are many of these xxx parts yxxx";

var startindex = searchme.IndexOf("xxx");
var endindex = searchme.IndexOf("yxxx") + 4; // added 4 so endindex points just past the end of "yxxx"

var stringpiece = searchme.Substring(startindex, endindex - startindex);

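You can repeat this as long as startindex != -1.

As I said, the indices may be slightly off and you may need to add +1 or -1 somewhere, but this should get you most of the way there (I think).

Here is a small sample program that counts characters instead of words. You should only need to change the processor function.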

var a = "this is a testxxx of my data yxxx and there are many of these xxx parts yxxx";
a = ProcessString(a, CountChars);


string CountChars(string a)
{
    return a.Length.ToString();
}

string ProcessString(string a, Func<string, string> processor)
{
    int idx_start, idx_end = -4;
    while ((idx_start = a.IndexOf("xxx", idx_end + 4)) >= 0)
    {
        idx_end = a.IndexOf("yxxx", idx_start + 3);
        if (idx_end < 0)
            break;

        var string_in_between = a.Substring(idx_start + 3, idx_end - idx_start - 3);
        var newString = processor(string_in_between);

        a = a.Substring(0, idx_start + 3) + newString + a.Substring(idx_end, a.Length - idx_end);

        idx_end -= string_in_between.Length - newString.Length;
    }
    return a;
}
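
The question asks for a word count rather than a character count. As a minimal sketch, a processor like the one below could be passed to ProcessString in place of CountChars. The name CountWords and the decision to treat any run of whitespace as a word separator are assumptions for illustration, not part of the original answer.

```csharp
using System;

public static class WordCountDemo
{
    // Hypothetical processor: counts whitespace-separated words.
    public static string CountWords(string part)
    {
        // A null separator array makes Split use whitespace as the delimiter;
        // RemoveEmptyEntries ignores leading, trailing, and repeated spaces.
        var words = part.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
        return words.Length.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(CountWords(" of my data ")); // 3
        Console.WriteLine(CountWords(" parts "));      // 1
    }
}
```

With this in place, the call would become a = ProcessString(a, WordCountDemo.CountWords);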
answered 2012-10-05T08:05:16.013
1

Regex.Replace replaces every match with the text of your choice, like this:

Regex rgx = new Regex("xxx.+?yxxx"); // lazy quantifier, so each xxx ... yxxx pair is matched on its own
string cleaned = rgx.Replace(a, "replacementtext");
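
If the replacement has to be computed from the matched text, Regex.Replace also accepts a MatchEvaluator callback. The following is a sketch along those lines: it uses a lazy capture group and a hypothetical CountWords helper to produce the word-count output from the question. The class and method names are illustrative only.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexReplaceDemo
{
    // Lazy group, so each xxx ... yxxx pair is matched separately.
    private static readonly Regex Pattern =
        new Regex("xxx(.*?)yxxx", RegexOptions.Singleline);

    // Hypothetical helper: counts whitespace-separated words.
    private static int CountWords(string s)
    {
        return s.Split((char[])null, StringSplitOptions.RemoveEmptyEntries).Length;
    }

    public static string ReplaceWithWordCount(string input)
    {
        // The evaluator runs once per match; the delimiters are written
        // back around the computed value.
        return Pattern.Replace(input,
            m => "xxx" + CountWords(m.Groups[1].Value) + "yxxx");
    }

    public static void Main()
    {
        var a = "this is a testxxx of my data yxxx and of these xxx parts yxxx";
        Console.WriteLine(ReplaceWithWordCount(a));
        // this is a testxxx3yxxx and of these xxx1yxxx
    }
}
```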
answered 2012-10-05T08:12:11.933
1

This code processes each section delimited by "xxx". It keeps the "xxx" delimiters. If you do not want to keep them, remove the two lines that read "result.Append(separator);".

Given:

"this is a testxxx of my data yxxx and there are many of these xxx parts yxxx"

It prints:

"this is a testxxx>> of my data y<<xxx and there are many of these xxx>> parts y<<xxx"

I assume this is the kind of thing you want. Add your own processing to "processPart()".

using System;
using System.Text;

namespace ConsoleApplication1
{
    internal class Program
    {
        private static void Main(string[] args)
        {
            string text = "this is a testxxx of my data yxxx and there are many of these xxx parts yxxx";
            string separator = "xxx";
            var result = new StringBuilder();

            int index = 0;

            while (true)
            {
                int start = text.IndexOf(separator, index);

                if (start < 0)
                {
                    result.Append(text.Substring(index));
                    break;
                }

                result.Append(text.Substring(index, start - index));

                int end = text.IndexOf(separator, start + separator.Length);

                if (end < 0)
                {
                    throw new InvalidOperationException("Unbalanced separators.");
                }

                start += separator.Length;

                result.Append(separator);
                result.Append(processPart(text.Substring(start, end-start)));
                result.Append(separator);

                index = end + separator.Length;
            }

            Console.WriteLine(result);
        }

        private static string processPart(string part)
        {
            return ">>" + part + "<<";
        }
    }
}

[Edit] Here is the code modified to use two different delimiters:

using System;
using System.Text;

namespace ConsoleApplication1
{
    internal class Program
    {
        private static void Main(string[] args)
        {
            string text = "this is a test<pre> of my data y</pre> and there are many of these <pre> parts y</pre>";
            string separator1 = "<pre>";
            string separator2 = "</pre>";
            var result = new StringBuilder();

            int index = 0;

            while (true)
            {
                int start = text.IndexOf(separator1, index);

                if (start < 0)
                {
                    result.Append(text.Substring(index));
                    break;
                }

                result.Append(text.Substring(index, start - index));

                int end = text.IndexOf(separator2, start + separator1.Length);

                if (end < 0)
                {
                    throw new InvalidOperationException("Unbalanced separators.");
                }

                start += separator1.Length;

                result.Append(separator1);
                result.Append(processPart(text.Substring(start, end-start)));
                result.Append(separator2);

                index = end + separator2.Length;
            }

            Console.WriteLine(result);
        }

        private static string processPart(string part)
        {
            return "|" + part + "|";
        }
    }
}
answered 2012-10-05T08:19:26.790
1

I would use regex groups.

Here is my solution for getting the parts of the string:

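private static IEnumerable<string> GetParts( string searchFor, string begin, string end ) {
    // searchFor is the text to search in; the delimiters are escaped so they are matched literally
    string exp = string.Format("{0}(?<searchedPart>.+?){1}", Regex.Escape(begin), Regex.Escape(end));
    Regex regex = new Regex(exp);
    MatchCollection matchCollection = regex.Matches(searchFor);
    foreach (Match match in matchCollection) {
        // One match per begin/end pair; the named group holds the text in between
        Group @group = match.Groups["searchedPart"];
        yield return @group.ToString();
    }
}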

You can use it like this to get the parts:

string a = "this is a testxxx of my data yxxx and there are many of these xxx parts yxxx";

IEnumerable<string> parts = GetParts(a, "xxx", "yxxx");

To replace the parts in the original string, you can use the regex Group to determine each part's start position and length (@group.Index, @group.Length).
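
The Index/Length idea could be sketched as follows. This is an illustration only: ReplaceParts and its transform parameter are hypothetical names, and the delimiters are passed through Regex.Escape on the assumption that they should be matched literally.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

public static class GroupReplaceDemo
{
    public static string ReplaceParts(string input, string begin, string end,
                                      Func<string, string> transform)
    {
        var regex = new Regex(
            Regex.Escape(begin) + "(?<searchedPart>.+?)" + Regex.Escape(end));
        var result = new StringBuilder();
        int position = 0; // first character of input that has not been copied yet

        foreach (Match match in regex.Matches(input))
        {
            Group part = match.Groups["searchedPart"];
            // Copy everything before the captured part unchanged
            // (this includes the begin delimiter).
            result.Append(input, position, part.Index - position);
            result.Append(transform(part.Value));
            position = part.Index + part.Length;
        }

        // Copy the remainder, starting at the last end delimiter.
        result.Append(input, position, input.Length - position);
        return result.ToString();
    }

    public static void Main()
    {
        var a = "this is a testxxx of my data yxxx and there are many of these xxx parts yxxx";
        Console.WriteLine(ReplaceParts(a, "xxx", "yxxx", s => s.Trim().ToUpperInvariant()));
    }
}
```

Building the result in a StringBuilder avoids having to adjust the match positions after each replacement, since all indices refer to the unmodified input.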

answered 2012-10-05T08:30:15.510