
I want to extract the content of the <dd> tag, specifically the contents of the p and ul tags inside it. I tried preg_match_all in PHP to grab every <dd> on the page, but got nothing back. Here is my HTML:

<dd style="display: block;">
                                    <p>Lightweight, comfy and cool - the dressy shirt he won\'t mind wearing!</p>
                                    <ul>
                                        <li>Made of 100% cotton</li>                        
                                        <li>Specially treated for a soft feel</li>                      
                                        <li>Classically styled with a pointed collar and button front</li>                      
                                        <li>Chest pocket; curved shirttail hem</li>                     
                                        <li>Canvas taping at inner neck</li>                        
                                        <li>Imported</li>                       
                                    </ul>


                                    <div id="BVSecondaryCustomerRatings" style="display:none;margin-left: 15px" class="BVBrowserWebkit"> <div class="BVRRRootElement">
<div class="BVRRRatingSummary BVRRSecondaryRatingSummary">
<div class="BVRRRatingSummary BVRRPrimaryRatingSummary"><div class="BVRRRatingSummaryStyle2"><div class="BVRRRatingSummaryNoReviews"> <div id="BVRRRatingSummaryNoReviewsWriteImageLinkID" class="BVRRRatingSummaryLink BVRRRatingSummaryNoReviewsWriteImageLink">
<a name="BV_TrackingTag_Rating_Summary_2_WriteReview_I2613L0022" target="BVFrame" href="http://reviews.childrensplace.com/4154/I2613L0022/writereview.htm?format=embedded&amp;campaignid=BV_RATING_SUMMARY_ZERO_REVIEWS&amp;sessionparams=__BVSESSIONPARAMS__&amp;return=http%3A%2F%2Fwww.childrensplace.com%2Fwebapp%2Fwcs%2Fstores%2Fservlet%2Fproduct_10001_10001_-1_1005476_827676_26601%257C72469%257C813599_boy%257Coutfits%257Cplaid%2520patrol_boy&amp;innerreturn=http%3A%2F%2Freviews.childrensplace.com%2F4154%2FI2613L0022%2Freviews.htm%3Fformat%3Dembedded&amp;user=__USERID__&amp;authsourcetype=__AUTHTYPE__&amp;submissionparams=__BVSUBMISSIONPARAMETERS__&amp;submissionurl=http%3A%2F%2Fwww.childrensplace.com%2Fwebapp%2Fwcs%2Fstores%2Fservlet%2FTCPCheckUserAuthenticationCmd%3FlangId%3D-1%26catalogId%3D10001%26storeId%3D10001"> <img src="http://reviews.childrensplace.com/static/4154/translucent.gif" alt="Write a review">
</a> </div>
<div id="BVRRRatingSummaryLinkWriteFirstID" class="BVRRRatingSummaryLink BVRRRatingSummaryLinkWriteFirst">
<span class="BVRRRatingSummaryLinkWriteFirstPrefix">Be the first to review this item.</span>
<a name="BV_TrackingTag_Rating_Summary_2_SocialBookmarkKaboodle_I2613L0022" target="_blank" class="BVRRSocialBookmarkingSharingLink BVRRSocialBookmarkingSharingLinkKaboodle" onclick="this.href=bvReplaceTokensInSocialURL(this.href);window.open(this.href,'','left=0,top=0,width=795,height=700,toolbar=1,location=0,resizable=1,scrollbars=1'); return false;" onfocus="this.href=bvReplaceTokensInSocialURL(this.href);" rel="nofollow" href="http://reviews.childrensplace.com/4154/share.htm?site=Kaboodle&amp;url=http%3A%2F%2Fwww.childrensplace.com%2Fwebapp%2Fwcs%2Fstores%2Fservlet%2Fproduct_10001_10001_-1_1005476&amp;title=__TITLE__&amp;robot=__ROBOT__&amp;image=http%3A%2F%2Fcontent.childrensplace.com%2Fwww%2Fb%2FTCP%2Fimages%2Fstyles%2F188410_m.jpg" onmouseover="this.href=bvReplaceTokensInSocialURL(this.href);"><img width="16" height="16" class="BVRRSocialBookmarkLinkImage" src="http://reviews.childrensplace.com/static/4154/link-kaboodle.gif" alt="Kaboodle" title="Add To Kaboodle"></a>
</div></div></div></div> </div>
</div>
                                    <p class="TCP-Phrase">Big Fashion, Little Prices</p>

                                    <div id="product_social_icons" style="height: 20px;">








                                            <div class="social_icon current_social">
                                                <div class="twitter"><iframe scrolling="no" frameborder="0" allowtransparency="true" src="http://platform.twitter.com/widgets/tweet_button.1336551279.html#_=1336767195241&amp;count=horizontal&amp;id=twitter-widget-0&amp;lang=en&amp;original_referer=http://www.childrensplace.com/webapp/wcs/stores/servlet/product_10001_10001_-1_1005476&amp;size=m&amp;text=The Childrens Place - plaid shirt&amp;url=http://www.childrensplace.com/webapp/wcs/stores/servlet/product_10001_10001_-1_1005476" class="twitter-share-button twitter-count-horizontal" style="height: 20px; width: 90px;" title="Twitter Tweet Button"></iframe></div>
                                                <div class="pinterest" id="pin_it">
                                                    <iframe scrolling="no" frameborder="0" src="http://pinit-cdn.pinterest.com/pinit.html?url=http://www.childrensplace.com/webapp/wcs/stores/servlet/product_10001_10001_-1_1005476&amp;media=//content.childrensplace.com/www/b/TCP/images/cloudzoom/p/188410_p.jpg&amp;description=plaid shirt&amp;layout=horizontal" style="border: medium none; width: 90px; height: 20px;"></iframe>
                                                </div>
                                                <div class="fb-like-btn" id="fb-root">
                                                    <script src="//connect.facebook.net/en_US/all.js#xfbml=1"></script>
                                                    <fb:like layout="button_count" show_faces="false" width="90" action="like" font="arial" colorscheme="light" fb-xfbml-state="rendered" class="fb_edge_widget_with_comment fb_iframe_widget"><span style="height: 20px; width: 76px;"><iframe id="f111d3371c" name="f5f7b234c" scrolling="no" style="border: none; overflow: hidden; height: 20px; width: 76px;" title="Like this content on Facebook." class="fb_ltr" src="http://www.facebook.com/plugins/like.php?api_key=&amp;locale=en_US&amp;sdk=joey&amp;channel_url=http%3A%2F%2Fstatic.ak.facebook.com%2Fconnect%2Fxd_arbiter.php%3Fversion%3D23%23cb%3Df11898a314%26origin%3Dhttp%253A%252F%252Fwww.childrensplace.com%252Ff210aed7%26domain%3Dwww.childrensplace.com%26relation%3Dparent.parent&amp;href=http%3A%2F%2Fwww.childrensplace.com%2Fwebapp%2Fwcs%2Fstores%2Fservlet%2Fproduct_10001_10001_-1_1005476_827676_26601%257C72469%257C813599_boy%257Coutfits%257Cplaid%2520patrol_boy&amp;node_type=link&amp;width=90&amp;font=arial&amp;layout=button_count&amp;colorscheme=light&amp;action=like&amp;show_faces=false&amp;extended_social_context=false"></iframe></span></fb:like></div>
                                            </div>



                                    </div>
                                </dd>

I have googled a lot trying to solve this. I tried DOM parsing, but the client requires regex parsing rather than that.


2 Answers


Here is an answer that won't tell you your approach is morally wrong:

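// $html is assumed to hold the page source.
// The s modifier lets "." match newlines, so the pattern can span lines.
$pattern = "/<dd.*?>.*?<p>(.*?)<\/p>.*?<ul>(.*?)<\/ul>/s";
if (preg_match($pattern, $html, $matches)) {
    // $matches[1] is the first <p> body, $matches[2] the raw <ul> body.
    echo "P-tag content: ".$matches[1];
    echo "<br>";
    echo "UL-tag content: ".$matches[2];
}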

I tested it against the HTML you posted and it works.
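To go one step further and get the individual list items, a second pass over the captured <ul> content should do. This is only a minimal sketch: it assumes a shortened copy of the posted markup, and the variable names ($paragraph, $items, $matchesWithoutS) are mine. It also shows that the same pattern without the s modifier misses the multi-line block, which may be one reason an attempt returns nothing.

```php
<?php
// Hypothetical sample input, shortened from the question's markup.
$html = <<<'EOT'
<dd style="display: block;">
    <p>Lightweight, comfy and cool</p>
    <ul>
        <li>Made of 100% cotton</li>
        <li>Imported</li>
    </ul>
    <p class="TCP-Phrase">Big Fashion, Little Prices</p>
</dd>
EOT;

// Same pattern as in the answer above.
$pattern = '/<dd.*?>.*?<p>(.*?)<\/p>.*?<ul>(.*?)<\/ul>/s';

$paragraph = null;
$items = [];
if (preg_match($pattern, $html, $matches)) {
    $paragraph = trim($matches[1]);
    // Second pass: pull each <li> body out of the captured <ul> content.
    preg_match_all('/<li\b[^>]*>(.*?)<\/li>/s', $matches[2], $liMatches);
    $items = array_map('trim', $liMatches[1]);
}

// Without the s modifier "." stops at newlines, so nothing matches here.
$matchesWithoutS = preg_match('/<dd.*?>.*?<p>(.*?)<\/p>.*?<ul>(.*?)<\/ul>/', $html);

echo $paragraph, PHP_EOL;
print_r($items);
```

Note that the literal <p> in the pattern only matches a paragraph tag with no attributes, so the class="TCP-Phrase" paragraph is skipped.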

Answered 2013-04-25T09:26:27.800

Don't use regular expressions to parse HTML; it is the wrong tool. Try simplexml instead, and if that is too much for you, try QueryPath: http://querypath.org/
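For comparison, here is a rough sketch of the parser route. It uses PHP's bundled DOMDocument and DOMXPath rather than SimpleXML, on the assumption that the real page is not well-formed XML. The sample markup, the wrapping <dl>, and the variable names are all hypothetical, and the DOM extension has to be enabled.

```php
<?php
// Hypothetical sample input, shortened from the question's markup.
$html = <<<'EOT'
<dl><dd style="display: block;">
    <p>Lightweight, comfy and cool</p>
    <ul>
        <li>Made of 100% cotton</li>
        <li>Imported</li>
    </ul>
    <p class="TCP-Phrase">Big Fashion, Little Prices</p>
</dd></dl>
EOT;

// Collect libxml parse warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// First <p> inside the <dd> that carries no class attribute.
$pNode = $xpath->query('//dd//p[not(@class)]')->item(0);
$paragraph = $pNode ? trim($pNode->textContent) : null;

// Text of every <li> in a <ul> inside the <dd>.
$items = [];
foreach ($xpath->query('//dd//ul/li') as $li) {
    $items[] = trim($li->textContent);
}

echo $paragraph, PHP_EOL;
print_r($items);
```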

Answered 2013-04-25T09:03:33.427