
This is the array I have:

var_dump($arr);
// prints below array
[0] => Array
        (
            [title] => Lee Daniels' The Butler (2013)
        )

I want to remove the year and its parentheses, replace spaces (" ") with underscores ("_"), and then urlencode the result. So the expected output is:

Lee_Daniels%27_The_Butler

Here is my code:

$url = preg_replace('/\((\d){4}\)/', '', $arr[0]['title']);

$title = str_replace(" ","_", trim($url));
$title = urlencode($title); // tried with urlencode(addslashes($title));
echo $title; // prints Lee_Daniels'_The_Butler

I know that echo urlencode('\'') gives "%27", so I tried addslashes, but to no avail.
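For reference, a minimal check of what the two functions do to a literal apostrophe. The variable names are only illustrative. It suggests that addslashes cannot help here: it merely prepends a backslash, which then gets percent-encoded as well.

```php
<?php
// urlencode() percent-encodes a literal apostrophe.
$encoded = urlencode("'");              // "%27"

// addslashes() prepends a backslash; urlencode() then encodes both characters.
$slashed = urlencode(addslashes("'"));  // "%5C%27"

echo $encoded, "\n", $slashed, "\n";
```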

Update: it works with

preg_replace('/\((\d){4}\)/', '', "Lee Daniels' The Butler (2013)");

However, it does not work if the string is fetched directly, as follows:

include_once('simple_html_dom.php');

$url = 'http://www.imdb.com/chart/';
$main_content = file_get_html($url);

$table = $main_content->find('table', 0);
$tbody = $table->find('tbody', 0);

$trs = $tbody->find('tr');
foreach ($trs as $tr) {
    $tds = $tr->find('td');
    $movies = array(); // initialise as an array; a string cannot take the 'title' key on PHP 7.1+

    $movies['title'] = trim($tds[2]->plaintext);

    $arr[] = $movies;
}

$url = preg_replace('/\((\d){4}\)/', '', $arr[0]['title']);

$title = str_replace(" ","_", trim($url));
$title = urlencode($title);
echo $title;

To reproduce this, you need the Simple HTML DOM parser (simple_html_dom.php) available on your include path.

Can someone point out what I am missing?


1 Answer


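Here is the working code. The key change is calling html_entity_decode before cleaning and encoding the title, since the scraped text presumably carries the apostrophe as an HTML entity rather than as a literal character: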

include_once('simple_html_dom.php');

$url = 'http://www.imdb.com/chart/';
$main_content = file_get_html($url);

$table = $main_content->find('table', 0);
$tbody = $table->find('tbody', 0);

$trs = $tbody->find('tr');
foreach ($trs as $tr) {
    $tds = $tr->find('td');
    $movies = array(); // initialise as an array; a string cannot take the 'title' key on PHP 7.1+

    $movies['title'] = trim($tds[2]->plaintext);

    $arr[] = $movies;
}

$title = html_entity_decode($arr[0]['title'], ENT_QUOTES, 'UTF-8');
$title = trim(preg_replace('/\((\d){4}\)/', '', $title));
$title = str_replace(" ", "_", $title);
$title = urlencode($title);

echo $title;
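To see the effect without hitting the network, here is a minimal self-contained sketch. It assumes the scraped title contains the apostrophe as the entity &#39; (the real markup may use a different entity), and title_to_slug is a hypothetical helper name, not part of Simple HTML DOM.

```php
<?php
// Hypothetical helper: turn a scraped title into an
// underscore-separated, URL-encoded string.
function title_to_slug($raw)
{
    // Decode entities such as &#39; or &amp; into literal characters first.
    $title = html_entity_decode($raw, ENT_QUOTES, 'UTF-8');
    // Drop a parenthesised four-digit year, e.g. "(2013)".
    $title = trim(preg_replace('/\(\d{4}\)/', '', $title));
    // Replace spaces with underscores, then percent-encode the rest.
    return urlencode(str_replace(' ', '_', $title));
}

// Assumed sample input, as the title might appear in scraped markup.
$scraped = "Lee Daniels&#39; The Butler (2013)";

// Without decoding, the entity's own characters get encoded.
$undecoded = urlencode("Lee_Daniels&#39;_The_Butler");
echo $undecoded, "\n";           // Lee_Daniels%26%2339%3B_The_Butler

$slug = title_to_slug($scraped);
echo $slug, "\n";                // Lee_Daniels%27_The_Butler
```

Decoding has to happen before urlencode; otherwise the "&", "#" and ";" of the entity are each percent-encoded and the apostrophe never appears as %27.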

Note that screen scraping violates IMDb's conditions of use. This is for learning purposes only.

Answered 2013-09-03T05:21:47.657