php - Using regular expressions to turn wildcard data into a multidimensional array

Question

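I'm fetching the contents of an HTML file, and I want to extract specific information from its HTML elements into a multidimensional array. The problem is that I don't have much experience with regular expressions.

The list contains many artists, and each one is formatted like this: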

<li class="artist"><a href="*I NEED THIS PATH*">*AND THIS TEXT*</a></li>

Here is what I have so far:

$contents = file_get_contents('somefile.txt'); 
$artists = preg_split('/^<li class="artist"><a href="(.*)">(.*)<\/a><\/li>$/', $contents);
$artistInfo = array();

foreach( $artists as $artist ) :

    preg_match('/href="(.*)">/', $element, $matchPath); // link paths
    preg_match('/">(.*)<\/a><\/li>/', $element, $matchName); // artist names

    array_push( $artistInfo, array( $matchName, $matchPath ) ); // put info into array

endforeach;

print_r($artistInfo);

preg_split isn't working the way I hoped, which throws everything else off, and I also don't know whether my preg_match expressions are correct. Please help!

score 2 · Accepted Answer

Don't use regular expressions for this. DOMDocument is your friend:

$artistInfo = array();
$dom = new DOMDocument;
$dom->loadHTML( file_get_contents('somefile.txt') );

$xPath = new DOMXpath($dom);

foreach ( $xPath->query('//li[@class="artist"]/a') as $anchor ) {
    $artistInfo[] = array(
        $anchor->textContent,
        $anchor->getAttribute('href')
    );
}

See it in action here: http://codepad.viper-7.com/NziHBo
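For anyone who wants to try this without a file on disk, here is a minimal self-contained sketch of the same approach. The sample markup, the artist names and paths, and the 'name'/'path' array keys are assumptions made up for illustration; they are not from the question. It also wraps the parse in libxml_use_internal_errors(), since loadHTML() tends to emit warnings on imperfect real-world HTML.

```php
<?php
// Hypothetical sample markup standing in for the contents of somefile.txt.
$html = <<<HTML
<ul>
<li class="artist"><a href="/artists/alpha">Alpha</a></li>
<li class="artist"><a href="/artists/beta">Beta &amp; Co.</a></li>
<li class="other"><a href="/ignored">Ignored</a></li>
</ul>
HTML;

// Collect parser warnings internally instead of printing them.
$previous = libxml_use_internal_errors(true);

$dom = new DOMDocument;
$dom->loadHTML($html);

// Discard the collected warnings and restore the previous setting.
libxml_clear_errors();
libxml_use_internal_errors($previous);

$xPath = new DOMXPath($dom);
$artistInfo = array();

// Select only the anchors that sit directly inside <li class="artist">.
foreach ($xPath->query('//li[@class="artist"]/a') as $anchor) {
    $artistInfo[] = array(
        'name' => trim($anchor->textContent),     // link text, entities decoded
        'path' => $anchor->getAttribute('href'),  // link target
    );
}

print_r($artistInfo);
```

Named keys are only a readability choice; the positional arrays in the answer above work just as well. Note that the `@class="artist"` test matches the attribute value exactly, so an element with several classes (for example `class="artist featured"`) would need a different predicate.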
