I'm a junior programmer and I'm trying to scrape data from this source.

This is the specific part I want to capture:

<ul class="ingredient-wrap">

            <li id="liIngredient" data-ingredientid="3914" data-grams="907.2">
                <label>
                    <span class="checkbox-formatted"><input id="cbxIngredient" type="checkbox" name="ctl00$CenterColumnPlaceHolder$recipeTest$recipe$ingredients$rptIngredientsCol1$ctl01$cbxIngredient" /></span>
                    <p class="fl-ing" itemprop="ingredients">
                        <span id="lblIngAmount" class="ingredient-amount">2 pounds</span>
                        <span id="lblIngName" class="ingredient-name">ground beef chuck</span>

                    </p>
                </label>
            </li>

            <li id="liIngredient" data-ingredientid="5838" data-grams="454">
                <label>
                    <span class="checkbox-formatted"><input id="cbxIngredient" type="checkbox" name="ctl00$CenterColumnPlaceHolder$recipeTest$recipe$ingredients$rptIngredientsCol1$ctl02$cbxIngredient" /></span>
                    <p class="fl-ing" itemprop="ingredients">
                        <span id="lblIngAmount" class="ingredient-amount">1 pound</span>
                        <span id="lblIngName" class="ingredient-name">bulk Italian sausage</span>

                    </p>
                </label>
            </li>

            <li id="liIngredient" data-ingredientid="10429" data-grams="1278">
                <label>
                    <span class="checkbox-formatted"><input id="cbxIngredient" type="checkbox" name="ctl00$CenterColumnPlaceHolder$recipeTest$recipe$ingredients$rptIngredientsCol1$ctl03$cbxIngredient" /></span>
                    <p class="fl-ing" itemprop="ingredients">
                        <span id="lblIngAmount" class="ingredient-amount">3 (15 ounce) cans</span>
                        <span id="lblIngName" class="ingredient-name">chili beans, drained</span>

                    </p>
                </label>
            </li>

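Each li contains two groups of words, for example "3 (15 ounce) cans" and "chili beans, drained". I'm trying to use a foreach loop to get both groups of words from each li, then combine them and save the result to the database.

Here is my code: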

foreach($html->find(".ingredient-wrap", 0)->children as $e){
              $ingredients = $e->plaintext;
              echo trim($ingredients);
              $hostname = 'localhost';
              $username = '********';
              $password = '*******';
              $conn = new PDO("mysql:host=$hostname;dbname=*********", $username, $password);
              $sql = ("INSERT INTO ingredients (recipe_id, ingredientname) VALUES (?, ?)");
              $q = $conn->prepare($sql);
              $q->execute(array($recipe_id,$ingredients));
          }

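The problem is that, after inserting into the database, the value of every ingredient name is ..., and even if you echo it out with echo $ingredients."<br/>" you see a list of the combined words followed by a space.

Thanks for any help! If you have questions or need more clarification, I'm here to respond.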

2 Answers

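Getting a "list" is normal. You are using ->plaintext to fetch the ingredients. In essence, you are running strip_tags() on the HTML that contains the ingredients, which leaves only bare text. You should loop over each ingredient tag separately.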
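As a rough illustration of looping over each ingredient tag separately, here is a minimal sketch. It swaps in PHP's built-in DOMDocument and DOMXPath instead of the Simple HTML DOM parser used in the question, so that it runs without external files. The sample markup is shortened from the question, and the variable names ($amount, $name, $ingredients) are illustrative only.

```php
<?php
// Sample markup shortened from the question; in practice this would be the fetched page.
$html = <<<'HTML'
<ul class="ingredient-wrap">
  <li><p class="fl-ing">
    <span class="ingredient-amount">2 pounds</span>
    <span class="ingredient-name">ground beef chuck</span>
  </p></li>
  <li><p class="fl-ing">
    <span class="ingredient-amount">3 (15 ounce) cans</span>
    <span class="ingredient-name">chili beans, drained</span>
  </p></li>
</ul>
HTML;

$doc = new DOMDocument();
// Collect parser warnings internally instead of printing them (real pages are rarely clean).
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$ingredients = [];

// Visit each <li> on its own, then read the two spans inside it separately.
foreach ($xpath->query('//ul[@class="ingredient-wrap"]/li') as $li) {
    $amount = $xpath->query('.//span[@class="ingredient-amount"]', $li)->item(0);
    $name   = $xpath->query('.//span[@class="ingredient-name"]', $li)->item(0);
    if ($amount === null || $name === null) {
        continue; // skip list items that lack either part
    }
    // Join the two parts with a single space, one string per ingredient.
    $ingredients[] = trim($amount->textContent) . ' ' . trim($name->textContent);
}

print_r($ingredients);
```

With the Simple HTML DOM parser from the question, the same idea would presumably be to call find() on each li for ".ingredient-amount" and ".ingredient-name" and read ->plaintext from each result. Each combined string can then go through the prepared INSERT from the question; creating the PDO connection and the prepared statement once, before the loop, avoids reconnecting for every ingredient.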

answered 2012-12-24T03:51:12.027

You could try a regex...

preg_match_all('/<span id="lblIngAmount" class="ingredient-amount">(.*)<\/span>\s+<span id="lblIngName" class="ingredient-name">(.*)<\/span>/', $ingredients, $matches, PREG_PATTERN_ORDER);

This returns:

Array
(
[0] => Array
    (
        [0] => <span id="lblIngAmount" class="ingredient-amount">2 pounds</span>
                    <span id="lblIngName" class="ingredient-name">ground beef chuck</span>
        [1] => <span id="lblIngAmount" class="ingredient-amount">1 pound</span>
                    <span id="lblIngName" class="ingredient-name">bulk Italian sausage</span>
        [2] => <span id="lblIngAmount" class="ingredient-amount">3 (15 ounce) cans</span>
                    <span id="lblIngName" class="ingredient-name">chili beans, drained</span>
    )

[1] => Array
    (
        [0] => 2 pounds
        [1] => 1 pound
        [2] => 3 (15 ounce) cans
    )

[2] => Array
    (
        [0] => ground beef chuck
        [1] => bulk Italian sausage
        [2] => chili beans, drained
    )

)

So:

echo $matches[2][0].": ".$matches[1][0];

will give:

ground beef chuck: 2 pounds
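
To go from a single pair to every ingredient, the matches can be walked in a loop. The sketch below is one possible way, assuming the regex is applied to the raw HTML rather than to the tag-stripped text. It uses the PREG_SET_ORDER flag so that each match carries its own amount and name, and it switches the captures to lazy (.*?), which is a small change from the pattern above. The sample input and the $combined array are illustrative.

```php
<?php
// Raw HTML (not tag-stripped text), shortened from the question's markup.
$html = <<<'HTML'
<span id="lblIngAmount" class="ingredient-amount">2 pounds</span>
<span id="lblIngName" class="ingredient-name">ground beef chuck</span>
<span id="lblIngAmount" class="ingredient-amount">1 pound</span>
<span id="lblIngName" class="ingredient-name">bulk Italian sausage</span>
HTML;

$pattern = '/<span id="lblIngAmount" class="ingredient-amount">(.*?)<\/span>\s+'
         . '<span id="lblIngName" class="ingredient-name">(.*?)<\/span>/';

// PREG_SET_ORDER groups captures per match: $m[1] is the amount, $m[2] is the name.
preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

$combined = [];
foreach ($matches as $m) {
    $combined[] = trim($m[1]) . ' ' . trim($m[2]);
}

print_r($combined);
```

A pattern like this depends on the exact attribute order and spacing in the markup, so it is likely to break if the page layout changes.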
answered 2012-12-24T03:53:30.930