java - Java indexOf does not find the correct data

Question

I have a Java application that needs to parse HTML elements out of an HTML page. My simple HTML test setup looks like this:

<!DOCTYPE html>
<html>
<head>
<style type='text/css'>
  div {width:100%;height:100px;background-color:blue;}
</style>
</head>
<body>
  <div></div>
</body>
</html>

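My code is set up to search the document for this string: "<style"

It then searches for the closing angle bracket, ">" (the "end carrot" in my code), because the user may have typed the tag in their HTML file in any of these forms: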

<style type="text/css">

or

<style type = "text/css" >

or

<style type = 'text/css' >

or 

<style type='text/css'>

etc..

So my approach is to find the "<style" tag and everything up to its closing angle bracket,

then find the closing style tag:

</style>

and then grab everything between the two.

Here are my files and their code:

************strings.xml************

String txt_style_opentag = "<style"
String txt_end_carrot = ">"
String txt_style_closetag = "</style>"

***********************************





************Parser.java************
public static String getStyle(Context context, String text) {
    String style = "";

    String openTag = context.getString(R.string.txt_style_opentag);
    String closeTag = context.getString(R.string.txt_style_closetag);
    String endCarrot = context.getString(R.string.txt_end_carrot);

    int openPos1 = text.indexOf(openTag);
    int openPos = text.indexOf(endCarrot, openPos1);
    int closePos = text.indexOf(closeTag, openPos1);

    if (openPos != -1 && closePos != -1)
        style = text.substring(openPos + openTag.length(), closePos).trim();

    if (style != null && style.length() > 0 && style.charAt(0) == '\n')     // first \n remove
        style = style.substring(1, style.length());

    if (style != null && style.length() > 0 && style.charAt(style.length() - 1) == '\n')    // last \n remove
        style = style.substring(0, style.length() - 1);

    return style;
}
********************************************************

My result is close, but not correct. It looks like this:

{width:100%;height:100px;background-color:blue;}

Notice that it is missing the "div" part. It should look like this:

div {width:100%;height:100px;background-color:blue;}

What am I doing wrong here? Can anyone help?

score 1 · Accepted Answer

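openPos is the index of the ">" that ends the opening tag. You take the substring starting from there but add the length of the opening tag ("<style", 6 characters) instead of the length of endCarrot (1 character), which moves the start of the substring past where you want it. You want to do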

style = text.substring(openPos + endCarrot.length(), closePos).trim();
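To see the corrected offset in context, here is a minimal, self-contained sketch of the same approach. It is not the asker's exact code: the class name StyleExtractor and the constants are hypothetical, the Android Context and string resources are replaced by plain constants so it runs anywhere, and the search for the closing tag starts after the ">" rather than at the opening tag.

```java
public class StyleExtractor {
    // Hypothetical stand-ins for the values kept in strings.xml in the question.
    private static final String OPEN_TAG = "<style";
    private static final String END_BRACKET = ">";
    private static final String CLOSE_TAG = "</style>";

    /** Returns the text between the opening style tag and the closing tag, or "" if absent. */
    public static String getStyle(String text) {
        int tagStart = text.indexOf(OPEN_TAG);
        if (tagStart == -1) {
            return "";
        }
        // Index of the '>' that ends the opening tag, whatever attributes it carries.
        int tagEnd = text.indexOf(END_BRACKET, tagStart);
        if (tagEnd == -1) {
            return "";
        }
        int closePos = text.indexOf(CLOSE_TAG, tagEnd);
        if (closePos == -1) {
            return "";
        }
        // Skip only the '>' itself (length 1), not the length of "<style".
        return text.substring(tagEnd + END_BRACKET.length(), closePos).trim();
    }

    public static void main(String[] args) {
        String html = "<html><head>\n<style type = 'text/css' >\n"
                + "  div {width:100%;}\n</style>\n</head></html>";
        System.out.println(getStyle(html));
    }
}
```

Because trim() already strips leading and trailing whitespace, including newlines, the two extra checks for '\n' in the original method should not be needed in this version.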

score 0 · Accepted Answer

Of course... right after asking for help, I finally figured it out. The following code should be changed

From:

style = text.substring(openPos + openTag.length(), closePos).trim();

To:

style = text.substring(openPos + endCarrot.length(), closePos).trim();

Sorry for the post, and thanks for the suggestions.
