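I have an XML string that contains a number of HTML tags, for example <p>, <a>, <img>, <link>, and so on.
Now I want to create a generic function to which I pass a list of HTML tags (or just a single tag) that should be excluded from the given XML string. The function should return the whole string without the excluded tags.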
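    // Arrays cannot be declared const in C#, so static readonly is used; tag names are given without angle brackets.
    public static readonly String[] htmlTags = new String[] { "p", "a", "img" };
    string result = strString.ExcludeHTMLTags(htmlTags); // I will write the String extension myself, that is not an issue; please suggest how to exclude the tags from the existing string.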
Edit:
I am trying the following code:
    /// <summary>
    /// Remove HTML tags from string using char array.
    /// </summary>
    public static string StripTagsCharArray(string source, String[] htmlTags)
    {
        char[] array = new char[source.Length];
        int arrayIndex = 0;
        bool inside = false;
        for (int i = 0; i < source.Length; i++)
        {
            foreach (String htmlTag in htmlTags)
            {
                char let = source[i];
                String tag = "<" + htmlTag; // How do I handle this? 'let' is a single character but 'tag' is a string.
                if (let == tag) // does not compile: a char cannot be compared with a string
                {
                    inside = true;
                    continue;
                }
                if (let == '>')
                {
                    inside = false;
                    continue;
                }
                if (!inside)
                {
                    array[arrayIndex] = let;
                    arrayIndex++;
                }
            }
        }
        return new string(array, 0, arrayIndex);
    }
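One way the character-scanning idea could be made to work is to stop comparing a single char against a tag string and instead, whenever a '<' is seen, check whether the text that follows is one of the listed tag names. The sketch below is only an illustration under some assumptions: the class name TagStripper and the helper IsListedTag are hypothetical, tag names are passed without angle brackets (e.g. "a"), and a '>' inside an attribute value is not handled. It removes only the tags themselves and keeps the text between them.

```csharp
using System;
using System.Text;

public static class TagStripper
{
    // Removes opening and closing tags whose names appear in htmlTags; inner text is kept.
    public static string StripTags(string source, string[] htmlTags)
    {
        var result = new StringBuilder(source.Length);
        int i = 0;
        while (i < source.Length)
        {
            if (source[i] == '<' && IsListedTag(source, i, htmlTags))
            {
                int close = source.IndexOf('>', i);
                if (close >= 0)
                {
                    i = close + 1; // skip the whole tag, including its attributes
                    continue;
                }
                // No closing '>' found: fall through and treat '<' as ordinary text.
            }
            result.Append(source[i]);
            i++;
        }
        return result.ToString();
    }

    // Returns true when the '<' at index lt starts a tag (or closing tag) named in htmlTags.
    private static bool IsListedTag(string source, int lt, string[] htmlTags)
    {
        int start = lt + 1;
        if (start < source.Length && source[start] == '/') start++; // closing tag
        foreach (string tag in htmlTags)
        {
            if (tag.Length == 0) continue;
            int end = start + tag.Length;
            if (end > source.Length) continue;
            if (string.Compare(source, start, tag, 0, tag.Length,
                               StringComparison.OrdinalIgnoreCase) != 0) continue;
            // The name must end here, so that "a" does not match "<abbr>".
            if (end == source.Length || source[end] == '>' || source[end] == '/'
                || char.IsWhiteSpace(source[end]))
                return true;
        }
        return false;
    }
}
```

For instance, TagStripper.StripTags("<p>Hello <a href=\"x\">link</a> <b>bold</b></p>", new[] { "p", "a" }) should leave the unlisted <b> element in place and return "Hello link <b>bold</b>".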
Edit 2: using a regular expression
    // Requires: using System.Text.RegularExpressions;
    String[] htmlTags = new String[] { "a", "img", "p" };
    private const string STR_RemoveHtmlTagRegex = "</?{0}[^<]*?>";
    public static string RemoveHtmlTag(String input, String[] htmlTags)
    {
        // Start from the input so that an empty tag list returns the string unchanged.
        String strResult = input;
        foreach (String htmlTag in htmlTags)
        {
            Regex reg = new Regex(String.Format(STR_RemoveHtmlTagRegex, htmlTag.Trim()), RegexOptions.IgnoreCase);
            strResult = reg.Replace(strResult, String.Empty);
        }
        return strResult;
    }
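The problem now is that it does not remove the tag's content. For example, if the input contains a tag wrapping the text "Testing" (say "<a>Testing</a>"), it returns "Testing". I also want to remove the whole tag together with its content.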