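Rather than replacing the line breaks with placeholder text and then running a regular expression over the result, why not clean up your data before you insert the <br /> tags? Don't forget to sanitize the input with HttpUtility.HtmlEncode first.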
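As a rough illustration of that order of operations (encode first, then add the tags), here is a minimal sketch. The class and method names are made up for this example, and it uses System.Net.WebUtility.HtmlEncode as a stand-in for HttpUtility.HtmlEncode so that a console project does not need a reference to System.Web; either should behave the same for this input.

```csharp
using System;
using System.Net;

static class LineBreakDemo
{
    // Hypothetical helper, not part of the original answer.
    public static string ToHtml(string input)
    {
        // Encode BEFORE inserting markup, otherwise the <br /> tags
        // themselves would be escaped. WebUtility.HtmlEncode is an
        // assumption here, standing in for HttpUtility.HtmlEncode.
        string encoded = WebUtility.HtmlEncode((input ?? String.Empty).Trim());

        // Split on either line-break character and drop the empty pieces.
        string[] lines = encoded.Split(
            new[] { '\r', '\n' },
            StringSplitOptions.RemoveEmptyEntries);

        return String.Join("<br />", lines);
    }

    static void Main()
    {
        Console.WriteLine(ToHtml("<b>bold</b>\r\nsecond line"));
        // prints: &lt;b&gt;bold&lt;/b&gt;<br />second line
    }
}
```

If the encoding ran after the join instead, the inserted tags would come out as &lt;br /&gt; and show up as literal text on the page.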
To try to handle several short lines in a row, here is my best attempt:
    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;

    class Program {
        static void Main() {
            // Arbitrary cutoff: lines of this length or shorter get joined together.
            const int Cutoff = 6;
            string input =
                "\r\n\r\n\n\r\r\r\n\nthisisatest\r\nstring\r\nwith\nsome\r\n" +
                "unsanatized\r\nbreaks\r\nand\ra\nsh\nor\nt\r\n\na\na\na\na" +
                "\na\na\na\na\na\na\na\na\na\na\na\na\na\na\na\na\na";
            // Don't forget to HtmlEncode it.
            input = (input ?? String.Empty).Trim();
            // Holds the run of short lines currently being joined.
            StringBuilder temp = new StringBuilder();
            List<string> result = new List<string>();
            // Split on either line-break character and drop the empty pieces.
            var items = input.Split(
                    new[] { '\r', '\n' },
                    StringSplitOptions.RemoveEmptyEntries)
                .Select(i => new { i.Length, Value = i });
            foreach (var item in items) {
                if (item.Length > Cutoff) {
                    // A long line stands alone; flush any pending short lines first.
                    if (temp.Length > 0) {
                        result.Add(temp.ToString());
                        temp.Clear();
                    }
                    result.Add(item.Value);
                    continue;
                }
                // A short line is appended to the current run, separated by a space.
                if (temp.Length > 0) { temp.Append(" "); }
                temp.Append(item.Value);
            }
            // Flush whatever short lines are left over.
            if (temp.Length > 0) {
                result.Add(temp.ToString());
            }
            Console.WriteLine(String.Join("<br />", result));
        }
    }
This produces the following output (a single line):
    thisisatest<br />string with some<br />unsanatized<br />breaks and a sh or t a a a a a a a a a a a a a a a a a a a a a
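I'm sure you had already come up with this solution yourself, but unfortunately what you are asking for isn't very straightforward.
For those interested, here is my first attempt: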
    using System;
    using System.Text.RegularExpressions;

    class Program {
        static void Main() {
            string input = "\r\n\r\n\n\r\r\r\n\nthisisatest\r\nstring\r\nwith\nsome" +
                "\r\nunsanatized\r\nbreaks\r\n\r\n";
            // Trim the ends and drop every carriage return so only \n remains.
            input = (input ?? String.Empty).Trim().Replace("\r", String.Empty);
            // Collapse each run of newlines into a single <br /> tag.
            string output = Regex.Replace(
                input,
                @"\n+",
                "<br />",
                RegexOptions.Multiline);
            Console.WriteLine(output);
        }
    }
This produces the following output:
    thisisatest<br />string<br />with<br />some<br />unsanatized<br />breaks
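One thing to watch in this first attempt: because every "\r" is removed before the regex runs, a lone "\r" used as a line break (as in the "and\ra" part of the longer test string) would glue the two words together instead of separating them. A possible variation, sketched here with a made-up class name and not taken from the original answer, is to let a single character class match both characters:

```csharp
using System;
using System.Text.RegularExpressions;

static class RegexBreakDemo
{
    // Hypothetical helper: collapses any run of \r and/or \n into one tag.
    public static string ToHtml(string input)
    {
        string trimmed = (input ?? String.Empty).Trim();
        // [\r\n]+ matches "\r\n", a lone "\r", a lone "\n", or any mix of them.
        return Regex.Replace(trimmed, @"[\r\n]+", "<br />");
    }

    static void Main()
    {
        Console.WriteLine(ToHtml("and\ra\r\n\r\nb\n"));
        // prints: and<br />a<br />b
    }
}
```

As with the other versions, the input should still be HTML-encoded before the tags are inserted.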