I have HTML like this:

<div class="address">
    <strong>Max Mustermann  </strong>             
    <br>Secondstreet 12          
    <br>1234 New York     
    <br>                      
    <br>                     
    <br>                     
</div>

This is my code:

    html = html.replace("<br>", "br34k");
    Document doc = Jsoup.parse(html);

    Elements divs = doc.select("div.address");

    StringBuilder divResult = new StringBuilder();
    for (Element div : divs) {
        divResult.append(div.text());
    }
    String result = divResult.toString();

    result = result.replace("br34k", System.getProperty("line.separator"));

    System.out.println(result);

With this, the output looks like:

06-18 20:00:30.290: I/System.out(623): Cafe Palio 
06-18 20:00:30.290: I/System.out(623): Marktplatz 1 
06-18 20:00:30.290: I/System.out(623): 79312 Emmendingen 
06-18 20:00:30.290: I/System.out(623):  
06-18 20:00:30.290: I/System.out(623):  
06-18 20:00:30.300: I/System.out(623): Domino Stüble 
06-18 20:00:30.300: I/System.out(623): Markgrafenstr. 57 
06-18 20:00:30.300: I/System.out(623): 79312 Emmendingen 
06-18 20:00:30.300: I/System.out(623):  
06-18 20:00:30.300: I/System.out(623):  
06-18 20:00:30.300: I/System.out(623): Pizza Boxx 
06-18 20:00:30.300: I/System.out(623): Am Elzdamm 66 
06-18 20:00:30.300: I/System.out(623): 79312 Emmendingen 

But what I need is one string per address, without the name, for example:

Marktplatz 1 79312 Emmendingen

Markgrafenstr. 57 79312 Emmendingen

and so on.

2 Answers

This would be easiest if your HTML were marked up properly:

<div class="address">
    <strong id="name">Max Mustermann  </strong>             
    <span id="address-part-one">Secondstreet 12</span>          
    <span id="address-part-two">1234 New York</span>                         
</div>

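The br tags are not needed here; the line layout should be done with CSS. Retrieve the content of each address element separately, then concatenate the pieces.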

answered 2013-06-18T12:10:06.157

You can optimize the code around the String handling if required:

    // Pattern.quote (java.util.regex.Pattern) makes replaceFirst match the name literally
    Document document = Jsoup.parse(content);
    String text = document.select(".address").text();
    String title = document.select(".address strong").text();
    String output = text.replaceFirst(Pattern.quote(title), "").trim();
    System.out.println(output);

Updated the answer to reflect the updated question.

This code works if you have multiple <div>s with class="address":

    Elements elements = document.select(".address");
    for (Element element : elements) {
        String text = element.text();                    // name plus address, whitespace-normalized
        String title = element.select("strong").text();  // the name only
        // Quote the name so replaceFirst treats it as literal text, not as a regex
        String output = text.replaceFirst(Pattern.quote(title), "").trim();
        System.out.println(output);
    }
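
A note on the name-stripping step: String.replaceFirst interprets its first argument as a regular expression, so a name containing characters such as ( or + is only matched literally if it is quoted. The following is a minimal sketch of that step in isolation, using only the standard library. The class AddressText, the helper stripName, and the sample name with parentheses are made up for illustration and do not come from the question.

```java
import java.util.regex.Pattern;

public class AddressText {
    // Hypothetical helper: removes the first literal occurrence of name
    // from fullText and trims the surrounding whitespace.
    static String stripName(String fullText, String name) {
        // Pattern.quote turns the name into a literal pattern, so regex
        // metacharacters in it (parentheses, dots, plus signs) are not interpreted.
        return fullText.replaceFirst(Pattern.quote(name), "").trim();
    }

    public static void main(String[] args) {
        // Made-up input: a name that contains regex metacharacters.
        String text = "Pizza Boxx (City) Am Elzdamm 66 79312 Emmendingen";
        System.out.println(stripName(text, "Pizza Boxx (City)"));
    }
}
```

Without the quoting, the parentheses in the sample name would be read as a regex group, the pattern would not match the text, and the name would stay in the output.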
answered 2013-06-18T15:46:28.707