php - How to parse a list of URLs and skip unwanted items (PHP)

Question

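I have a txt file containing many different URLs. I want to parse the list and skip some of the URLs to end up with a clean final list. Here is part of the list: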

http://www.example.com/example1/
http://www.example.com/example2/
http://www.example.com/example3/
http://www.example.com/example4/
http://www.example.com/example.js
http://www.example.com/example.css
http://www.example.com/example1.js?v=123
http://www.example.com/{path}
http://www.example.com/feed/
http://www.example.com/?p=66

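I want to skip every URL that contains .js, .css, {path}, /feed/, or something like ?p=66, and then write everything that remains back to a txt file. I want to do this in PHP. Any suggestions?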

score 1 · Accepted Answer

<?php 

  $list = "http://www.example.com/example1/
http://www.example.com/example2/
http://www.example.com/example3/
http://www.example.com/example4/
http://www.example.com/example.js
http://www.example.com/example.css
http://www.example.com/example1.js?v=123
http://www.example.com/{path}
http://www.example.com/feed/
http://www.example.com/?p=66";

  $arr = preg_split("/[\r\n]+/",$list);

  // check our input array
  print_r($arr);

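  // collect only the URLs that do NOT end in one of the unwanted patterns
  $map = array();
  foreach($arr as $v){
    // single quotes avoid string interpolation; the braces are escaped so {path} is matched literally
    if(!preg_match('/(\{path\}|\.(js|css)|\?p=\d+|\/feed\/)$/',$v)){
      $map[] = $v;
    }
  }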

  // check our output array
  print_r($map);

?>

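This assumes you want to keep the URLs that do not end with {path}, .css, .js, ?p=## (where # is a digit), or /feed/. That is why /example1.js?v=123 still appears in the result: it ends with ?v=123, not with .js. To make the pattern match anywhere in the string rather than only at the end, remove the $ from the end of the regular expression (just after the closing parenthesis that follows feed\/).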

My console output:

Array
(
    [0] => http://www.example.com/example1/
    [1] => http://www.example.com/example2/
    [2] => http://www.example.com/example3/
    [3] => http://www.example.com/example4/
    [4] => http://www.example.com/example.js
    [5] => http://www.example.com/example.css
    [6] => http://www.example.com/example1.js?v=123
    [7] => http://www.example.com/{path}
    [8] => http://www.example.com/feed/
    [9] => http://www.example.com/?p=66
)
Array
(
    [0] => http://www.example.com/example1/
    [1] => http://www.example.com/example2/
    [2] => http://www.example.com/example3/
    [3] => http://www.example.com/example4/
    [4] => http://www.example.com/example1.js?v=123
)
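The question also asks for the list to be read from a txt file and written back out, which the snippet above does not cover. What follows is only a minimal sketch of that step, not part of the original answer: it uses the unanchored variant of the pattern (the $ removed, as described above), and the function name filterUrls and the file names urls.txt and clean.txt are placeholders chosen for illustration.

```php
<?php

// Hypothetical helper: returns only the URLs that do not contain
// any of the unwanted patterns.
function filterUrls(array $urls): array
{
    // Same alternatives as the pattern above, but without the trailing $,
    // so a match anywhere in the URL causes it to be skipped.
    $skip = '/(\{path\}|\.(js|css)|\?p=\d+|\/feed\/)/';

    // PREG_GREP_INVERT keeps the entries that do NOT match the pattern;
    // array_values reindexes the result starting at 0.
    return array_values(preg_grep($skip, $urls, PREG_GREP_INVERT));
}

// Placeholder file names, assumed for this sketch.
$inFile  = 'urls.txt';
$outFile = 'clean.txt';

if (is_readable($inFile)) {
    // Read one URL per line, dropping line endings and empty lines.
    $urls = file($inFile, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);

    // Write the filtered list back out, one URL per line.
    file_put_contents($outFile, implode(PHP_EOL, filterUrls($urls)) . PHP_EOL);
}
```

Without the $ anchor, /example1.js?v=123 is skipped as well. One side effect to be aware of: \.(js|css) now also matches longer extensions that begin with those letters, such as .json, so the pattern may need tightening if such URLs should be kept.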
