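This code opens all the Excel files in a folder, then pulls every email address out of each opened file and puts them into an array. In the end, I need one big array holding the contents of all of those arrays: a single array of all the emails from all the files.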

The code below doesn't work. I'm sure it's something simple. Thanks.

<?

$Folder = "sjc/";
$files = scandir($Folder);


function cleanFolder($file)
{
$string = file_get_contents("sjc/$file");
$pattern = '/[a-z0-9_\-\+]+@[a-z0-9\-]+\.([a-z]{2,3})(?:\.[a-z]{2})?/i';
preg_match_all($pattern, $string, $matches);

$Emails[] = $matches[0];
return $Emails;
}



function beginClean($files)
{
    for($i=0; count($files)>$i;$i++)
        {
        $Emails = cleanFolder("$files[$i]");
        $TheEmails .= explode(",",$Emails);

        }

/// Supposed to be a big string of emails separated by comma
echo $TheEmails; // But it just echos .... ArrayArrayArrayArrayArray etc...

// WHAT I REALLY WANT IS.. one Array holding all emails, not an Array of Arrays. 
}

beginClean($files);

?>

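Update: I got it working. But now I'm running into a memory problem, because the total number of emails exceeds 229911.

Fatal error: Allowed memory size of 67108864 bytes exhausted (tried to allocate 71 bytes) in /home/public_html/StatuesPlus/CleanListFolder.php on line 33

Here is the code that works: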

<?

$Folder = "sjc/";
$files = scandir($Folder);


function cleanFolder($file)
{
//echo "FILE NAME " . $file . "<br>";
$string = file_get_contents("sjc/$file");
$pattern = '/[a-z0-9_\-\+]+@[a-z0-9\-]+\.([a-z]{2,3})(?:\.[a-z]{2})?/i';
preg_match_all($pattern, $string, $matches);

$TheEmails .= implode(',', $matches[0]);
return $TheEmails;

}



function beginClean($files)
{
    for($i=0; count($files)>$i;$i++)
        {
        $Emails .= cleanFolder("$files[$i]");
        }



$TheEmails = explode(",", $Emails);
//$UniqueEmails= array_unique($TheEmails);
echo count($TheEmails);
//file_put_contents("Emails.txt", $TheEmails);
}

beginClean($files);

?>
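
One possible way to ease the memory pressure described in the update is to skip the large comma-joined string altogether and deduplicate while reading, using the addresses as array keys. The following is only a sketch under assumptions: `collectUniqueEmails` is a hypothetical helper name, the two demo files are made up, and whether it fits within a given memory limit depends on the real data.

```php
<?php
// Hypothetical helper: collects unique, lowercased emails from a list of file paths.
function collectUniqueEmails(array $paths)
{
    $pattern = '/[a-z0-9_\-\+]+@[a-z0-9\-]+\.([a-z]{2,3})(?:\.[a-z]{2})?/i';
    $seen = array();

    foreach ($paths as $path) {
        if (!is_file($path)) {
            continue; // scandir() also returns "." and ".."
        }
        preg_match_all($pattern, file_get_contents($path), $matches);
        foreach ($matches[0] as $email) {
            // Using the address as the key stores each one only once.
            $seen[strtolower($email)] = true;
        }
    }

    return array_keys($seen);
}

// Demo with two throwaway files in the system temp directory (made-up data).
$dir = sys_get_temp_dir() . '/emails_demo_' . uniqid();
mkdir($dir);
file_put_contents("$dir/a.csv", "Name,Email\nAnn,Ann@Example.com\nBob,bob@example.com\n");
file_put_contents("$dir/b.csv", "ann@example.com;carol@example.org\n");

$paths = array();
foreach (scandir($dir) as $file) {
    $paths[] = "$dir/$file";
}

$emails = collectUniqueEmails($paths);
echo count($emails), "\n"; // 3

// Remove the demo files again.
unlink("$dir/a.csv");
unlink("$dir/b.csv");
rmdir($dir);
```

Because duplicates are dropped as each file is read, only one file's text and the set of distinct addresses are held at a time, rather than every match from every file.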

2 Answers

.= concatenates strings, not arrays. But you can keep them as strings for a while:

$TheEmails .= ",$Emails";

Then:

$TheEmails = explode(',', substr($TheEmails, 1));
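
As an alternative to going through a string, the per-file arrays can also be combined directly. This is a minimal sketch, not the approach described above: the nested `$perFile` array and its addresses are made-up stand-ins for the `$matches[0]` results of each file.

```php
<?php
// Hypothetical stand-ins for the per-file results of preg_match_all().
$perFile = array(
    array('a@example.com', 'b@example.com'),
    array(),                       // a file with no matches
    array('c@example.org'),
);

$allEmails = array();
foreach ($perFile as $matches) {
    // array_merge() appends the values and renumbers the keys,
    // so the result stays one flat list.
    $allEmails = array_merge($allEmails, $matches);
}

echo implode(',', $allEmails), "\n";
// a@example.com,b@example.com,c@example.org
```

Echoing an array directly prints the word "Array", which is where the ArrayArrayArray output in the question comes from; `implode()` is what turns the list into a readable string.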
answered 2013-05-21T01:50:23.733

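Below is the final code I used to collect emails from multiple Excel sheets in any given folder. The files can be CSV, XLS, XLSX, HTML, etc. This code extracts the emails from multiple pages in that folder and puts them into one giant array. :)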

<?php
    // See below for the array output, called $FinalEmails.

    // SET YOUR FOLDER HERE

    $Folder = "sjc/";
    $files = scandir($Folder);


    // Returns the lowercased email addresses found in one file.
    // $Folder is passed in because globals are not visible inside functions.
    function cleanFolder($Folder, $file)
    {
        $path = rtrim($Folder, "/") . "/" . $file;
        if (!is_file($path)) {
            return array(); // skips ".", ".." and subfolders from scandir()
        }

        $string = file_get_contents($path);
        $pattern = '/[a-z0-9_\-\+]+@[a-z0-9\-]+\.([a-z]{2,3})(?:\.[a-z]{2})?/i';
        preg_match_all($pattern, $string, $matches);

        return array_map('strtolower', $matches[0]);
    }


    function isValidEmail($email)
    {
        return filter_var(filter_var($email, FILTER_SANITIZE_EMAIL), FILTER_VALIDATE_EMAIL);
    }


    function beginClean($Folder, $files)
    {
        // Merge arrays instead of concatenating strings, so the last email
        // of one file is never glued to the first email of the next.
        $Emails = array();
        for ($i = 0; count($files) > $i; $i++) {
            $Emails = array_merge($Emails, cleanFolder($Folder, $files[$i]));
        }

        // array_unique() keeps the original keys, so iterate with foreach.
        $UniqueEmails = array_unique($Emails);

        $FinalEmails = array();
        foreach ($UniqueEmails as $email) {
            if (isValidEmail($email)) {
                echo $email . "<br>";
                $FinalEmails[] = $email;
            }
        }

        // An array of emails from multiple Excel sheets,
        // cleaned of duplicates and checked for validity.
        return $FinalEmails;
    }

    $FinalEmails = beginClean($Folder, $files);

    ?>
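
A note on looping over deduplicated results: `array_unique()` keeps the original keys, so a counting `for` loop over its result can hit gaps where duplicates were removed. The small sketch below uses made-up addresses to show the effect and one way around it with `array_values()`.

```php
<?php
$emails = array('a@example.com', 'b@example.com', 'a@example.com', 'c@example.org');

$unique = array_unique($emails);
// Keys 0, 1 and 3 survive; key 2 (the duplicate) is gone.
print_r(array_keys($unique));

// array_values() renumbers the keys to 0..n-1, which makes index-based loops safe.
$reindexed = array_values($unique);
echo $reindexed[2], "\n"; // c@example.org
```

Iterating with `foreach` avoids the issue as well, since it visits whatever keys exist.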
answered 2013-05-21T05:00:07.220