$domains = file('../../domains.txt');
$keywords = file('../../keywords.txt');

$domains will be in the following format:

3kool4u.com,9/29/2013 12:00:00 AM,AUC
3liftdr.com,9/29/2013 12:00:00 AM,AUC
3lionmedia.com,9/29/2013 12:00:00 AM,AUC
3mdprod.com,9/29/2013 12:00:00 AM,AUC
3mdproductions.com,9/29/2013 12:00:00 AM,AUC

The keywords will be in the following format:

keyword1
keyword2
keyword3

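I think what I really want is to create an array from the keywords in the file and search each line of domains.txt for matches. I'm not sure where to start, because I'm confused about the difference between preg_match, preg_match_all, and strpos, and about when to use one over the others.

Thanks in advance for your help.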


1 Answer

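//Empty array to hold each line of the domains file that has a match
$matches = array();

//For each line of the domains file
foreach($domains as $domain){

    //For each keyword
    foreach($keywords as $keyword){

        //file() keeps the trailing newline on each line, so trim it off; skip blank lines
        $keyword = trim($keyword);
        if($keyword === '') continue;

        //If the domain line contains the keyword at any position, regardless of case
        //(preg_quote escapes any regex special characters in the keyword)
        if(preg_match('/' . preg_quote($keyword, '/') . '/i', $domain)) {
            //Add the domain line to the matches array once, then move on to the next line
            $matches[] = trim($domain);
            break;
        }
    }
}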

You now have the $matches array, which contains every line of the domains file that matches a keyword.
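As for which function to use, here is a minimal sketch of how the three differ, using one sample line from the question. The keyword `lion` and the single-letter pattern are made up for illustration. For a plain "does this line contain this word" check, `stripos` is usually enough; `preg_match` is for real patterns, and `preg_match_all` is for when you need every occurrence rather than just the first.

```php
<?php
$line = '3lionmedia.com,9/29/2013 12:00:00 AM,AUC';
$keyword = 'lion'; // hypothetical keyword, for illustration only

// stripos: case-insensitive substring search; returns the offset or false.
// Compare with !== false, because a match at offset 0 is falsy.
$pos = stripos($line, $keyword);

// preg_match: regular expression search; stops at the first match and returns 1 or 0.
// preg_quote escapes characters that are special in a regex (such as ".").
$found = preg_match('/' . preg_quote($keyword, '/') . '/i', $line);

// preg_match_all: finds every match, fills $all, and returns how many there were.
$count = preg_match_all('/m/i', $line, $all);

var_dump($pos, $found, $count);
```

strpos behaves like stripos but is case-sensitive.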

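Note that with the previous approach both files are loaded entirely into memory. Depending on their size, you may run out of memory, or the operating system may start using swap, which is much slower than RAM.

Here is another, more memory-efficient approach that reads the files one line at a time.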

<?php

// Allow automatic detection of line endings (helps with old Mac-style "\r" files;
// note that this setting is deprecated as of PHP 8.1)
ini_set('auto_detect_line_endings',true);

//Array that will hold the lines that match
$matches = array();

//Open the two files in read mode
$domains_handle = fopen('../../domains.txt', "r");
$keywords_handle = fopen('../../keywords.txt', "r");

//Iterate over the domains one line at a time
while (($domains_line = fgets($domains_handle)) !== false) {

    //For each line of the domains file, iterate over the keywords file one line at a time
    while (($keywords_line = fgets($keywords_handle)) !== false) {

        //Remove any whitespace or newline from the beginning and end of the string
        $trimmed_keyword = trim($keywords_line);

        //Skip blank lines: an empty pattern would match every domain
        if ($trimmed_keyword === '') {
            continue;
        }

        //Check if the domain line contains the keyword at any position,
        //using a case-insensitive comparison; preg_quote escapes regex special characters
        if (preg_match('/' . preg_quote($trimmed_keyword, '/') . '/i', trim($domains_line))) {
            //Add the domain line to the matches array once and stop checking this line
            $matches[] = trim($domains_line);
            break;
        }
    }
    //Set the pointer back to the beginning of the keywords file
    rewind($keywords_handle);
}

//Release the resources
fclose($domains_handle);
fclose($keywords_handle);

var_dump($matches);
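If the keyword list is small enough to keep in memory (an assumption; the domains file is usually the large one), a variation is to read the keywords once and stream only the domains file. That avoids rewinding and rereading keywords.txt for every domain. The function name `find_matching_lines` is made up for this sketch. It also differs from the code above in one respect: it checks only the domain name, meaning the text before the first comma, so a keyword such as `auc` would not match every line through the trailing `AUC` column.

```php
<?php
// Hypothetical helper: returns the lines of $domainsPath whose domain name
// contains any keyword listed in $keywordsPath (case-insensitive).
function find_matching_lines(string $domainsPath, string $keywordsPath): array
{
    // Keywords are assumed to fit in memory; drop newlines and blank lines.
    $keywords = file($keywordsPath, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    if ($keywords === false) {
        return [];
    }
    $keywords = array_filter(array_map('trim', $keywords), function ($k) {
        return $k !== '';
    });

    $handle = fopen($domainsPath, 'r');
    if ($handle === false) {
        return [];
    }

    $matches = [];
    // Stream the domains file one line at a time.
    while (($line = fgets($handle)) !== false) {
        $line = trim($line);
        // The domain name is the first comma-separated field.
        $domain = explode(',', $line, 2)[0];

        foreach ($keywords as $keyword) {
            // Plain substring check; no regex needed for literal keywords.
            if (stripos($domain, $keyword) !== false) {
                $matches[] = $line;
                break; // one hit is enough for this line
            }
        }
    }
    fclose($handle);

    return $matches;
}
```

Calling `find_matching_lines('../../domains.txt', '../../keywords.txt')` would then give an array equivalent to $matches above, restricted to hits in the domain name.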
answered 2013-10-01T04:42:40.930