I currently have the following code:

    //loop here 
    foreach ($doc['a'] as $link) {
        $href = pq($link)->attr('href');                
        if (preg_match($url,$href))
        {
            //delete matched string and append custom url to href attr
        }       
        else
        {
            //prepend custom url to href attr
        }
    }
    //end loop

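Basically, I have fetched an external page via cURL. I need to add my own custom URL to every href link in the DOM. Using a regular expression, I need to check whether each href attribute already has a base URL, for example www.domain.com/MainPage.html/SubPage.html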

If it does, replace the www.domain.com part with my custom URL.

If it does not, simply prepend my custom URL to the relative URL.

My question is: what regular expression syntax and which PHP function should I use? Is preg_replace() suitable for this?

Cheers

1 Answer

You should use built-in functions rather than regex whenever possible, because their authors have usually considered the edge cases (or read the REALLY long RFC for URLs that details all of them). For your case, I would use parse_url() and then http_build_url(). Note that the latter function requires the PECL HTTP extension, which can be installed by following the docs page for the http package:

    $href = 'http://www.domain.com/MainPage.html/SubPage.html';
    $parts = parse_url($href);

    // 'host' is missing for relative URLs, so check that it exists first
    if (isset($parts['host']) && $parts['host'] === 'www.domain.com') {
        $parts['host'] = 'www.yoursite.com';

        $href = http_build_url($parts);
    }

    echo $href; // 'http://www.yoursite.com/MainPage.html/SubPage.html'
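
To see why the host check matters, it may help to look at what parse_url() reports for the kinds of href values a scraped page tends to contain. The following is only a small sketch: hrefHost() is a hypothetical helper and the sample hrefs are made up for illustration.

```php
<?php
// Hypothetical helper: returns the host parse_url() finds, or null if there is none.
function hrefHost($href)
{
    $parts = parse_url($href);
    return (is_array($parts) && isset($parts['host'])) ? $parts['host'] : null;
}

var_dump(hrefHost('http://www.domain.com/MainPage.html/SubPage.html')); // host is "www.domain.com"
var_dump(hrefHost('/MainPage.html/SubPage.html'));                      // null: relative URL
var_dump(hrefHost('www.domain.com/MainPage.html'));                     // null: no scheme, so it is parsed as a path
var_dump(hrefHost('//www.domain.com/MainPage.html'));                   // host is "www.domain.com"
```

The third case is the one to watch for: an href written without a scheme looks like a plain path to parse_url(), so the domain would end up inside the path rather than being replaced.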

Example using your code:

    foreach ($doc['a'] as $link) {
        $urlParts = parse_url((string) pq($link)->attr('href'));
        if ($urlParts === false) {
            continue; // parse_url() returns false for seriously malformed URLs
        }

        // This replaces the domain if there is one, otherwise it adds your domain
        $urlParts['host'] = 'www.yoursite.com';

        $newURL = http_build_url($urlParts);

        pq($link)->attr('href', $newURL);
    }
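
If installing PECL HTTP is not an option (as far as I know, http_build_url() only exists in the 1.x line of that extension), the URL can also be reassembled by hand from the parse_url() parts. This is a rough sketch rather than a full replacement: rewriteHref() is a hypothetical helper, it assumes http as the default scheme, it treats relative paths as relative to the site root, and it drops any port or user info.

```php
<?php
// Hypothetical helper: points an href at $newHost without needing PECL HTTP.
// Assumption: hrefs without a scheme should get $defaultScheme.
function rewriteHref($href, $newHost, $defaultScheme = 'http')
{
    $parts = parse_url($href);
    if ($parts === false) {
        return $href; // leave seriously malformed URLs untouched
    }

    // Leave mailto:, javascript: and other non-web links alone
    if (isset($parts['scheme']) && !in_array($parts['scheme'], ['http', 'https'], true)) {
        return $href;
    }

    $scheme = isset($parts['scheme']) ? $parts['scheme'] : $defaultScheme;
    $path   = isset($parts['path']) ? $parts['path'] : '/';
    if ($path === '' || $path[0] !== '/') {
        $path = '/' . $path; // treat relative paths as relative to the site root
    }

    $url = $scheme . '://' . $newHost . $path;
    if (isset($parts['query'])) {
        $url .= '?' . $parts['query'];
    }
    if (isset($parts['fragment'])) {
        $url .= '#' . $parts['fragment'];
    }
    return $url;
}

echo rewriteHref('http://www.domain.com/MainPage.html/SubPage.html', 'www.yoursite.com'), PHP_EOL;
// http://www.yoursite.com/MainPage.html/SubPage.html
echo rewriteHref('SubPage.html?id=3#top', 'www.yoursite.com'), PHP_EOL;
// http://www.yoursite.com/SubPage.html?id=3#top
```

Inside the loop, this would be used as pq($link)->attr('href', rewriteHref(pq($link)->attr('href'), 'www.yoursite.com'));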
answered 2013-05-05T03:05:01.777