I want to get my HTML data as an array or in XML format so that I can easily save it to a database. Here is what I have so far:
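$url = "http://www.example.com/";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10);
$html = curl_exec($ch);
// Release the handle whether or not the request succeeded
curl_close($ch);
if ($html !== false && $html !== '') {
    // Parse the HTML into a DOMDocument; collect markup warnings instead of silencing them with @
    libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->recover = true;
    $dom->strictErrorChecking = false;
    $dom->loadHTML($html);
    libxml_clear_errors();
    // All <div> elements in the page
    $divs = $dom->getElementsByTagName('div');
} else {
    echo "The website could not be reached.";
}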
What should I do to get the HTML as an array or as XML? The HTML that comes back looks like this:
<div>
<ul>
<li>Product Name</li>
<li>Category</li>
<li>Subcategory</li>
<li>Product Price</li>
<li>Product Company</li>
</ul>
</div>
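
One possible direction, as a minimal sketch rather than a definitive solution: run an XPath query over the parsed document to collect the text of each `<li>` into a PHP array, then build an XML string from that array with a second DOMDocument. The helper names `extractListItems` and `rowsToXml`, and the output element names `products`, `product` and `field`, are hypothetical and chosen only for illustration. The sketch assumes every product is a `<ul>` directly inside a `<div>`, as in the sample above.

```php
<?php
// Sample markup standing in for the fetched page (assumption: real pages follow this shape).
$html = <<<HTML
<div>
<ul>
<li>Product Name</li>
<li>Category</li>
<li>Subcategory</li>
<li>Product Price</li>
<li>Product Company</li>
</ul>
</div>
HTML;

// Hypothetical helper: returns one array of <li> texts per <ul> found directly inside a <div>.
function extractListItems(string $html): array
{
    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $rows = [];
    foreach ($xpath->query('//div/ul') as $ul) {
        $row = [];
        // Query relative to the current <ul> so items from other lists are not mixed in.
        foreach ($xpath->query('./li', $ul) as $li) {
            $row[] = trim($li->textContent);
        }
        $rows[] = $row;
    }
    return $rows;
}

// Hypothetical helper: serialises the rows as XML; element names are illustrative only.
function rowsToXml(array $rows): string
{
    $doc = new DOMDocument('1.0', 'UTF-8');
    $doc->formatOutput = true;
    $root = $doc->appendChild($doc->createElement('products'));
    foreach ($rows as $row) {
        $product = $root->appendChild($doc->createElement('product'));
        foreach ($row as $value) {
            $field = $product->appendChild($doc->createElement('field'));
            // A text node escapes characters such as & and < automatically.
            $field->appendChild($doc->createTextNode($value));
        }
    }
    return $doc->saveXML();
}

$rows = extractListItems($html);
print_r($rows);          // nested array: one inner array per <ul>
echo rowsToXml($rows);   // the same data as an XML string
```

Each inner array corresponds to one database row, so it could be passed to a prepared `INSERT` statement as is. If the real page is not ASCII, the encoding may need to be declared before `loadHTML` is called, since the parser does not assume UTF-8 by default.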