I want to read some text files in a folder line by line. For example, 1.txt contains:
Fast and Effective Text Mining Using Linear-time Document Clustering
Bjornar Larsen WORD2 Chinatsu Aone
SRA International AK, Inc.
4300 Fair Lakes Cow-l Fairfax, VA 22033
{bjornar-larsen, WORD1
I want to remove every line that contains none of the words word1, word2, word3 and does not end with a period.
So, for the example above, the result would be:
Bjornar Larsen WORD2 Chinatsu Aone
SRA International, Inc.
{bjornar-larsen, WORD1
I'm confused: how do I remove those lines? Is that possible? Or could we replace them with blanks instead?
Here is the code:
$url = glob($savePath . '*.txt');
foreach ($url as $files) {
    // file_get_contents() reads the whole file, so no separate fopen() handle is needed
    $ori_content = file_get_contents($files);
    if ($ori_content === false) {
        die('can not open file');
    }
    foreach (preg_split("/((\r?\n)|(\r\n?))/", $ori_content) as $buffer) {
        // stripos() returns an integer position or false, never true
        $pos1 = stripos($buffer, $word1);
        $pos2 = stripos($buffer, $word2);
        $pos3 = stripos($buffer, $word3);
        $last = substr($buffer, -1); // read the last character
        if ($pos1 === false && $pos2 === false && $pos3 === false && $last !== '.') {
            // how to remove this line?
        }
    }
}
Please help me, thank you very much :)
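One possible way to do the removal, as a minimal sketch rather than a definitive solution: instead of deleting lines in place, collect the lines that should be kept and write them back to the file. The helper names `shouldKeepLine` and `filterContent` are hypothetical, and ignoring trailing whitespace before checking for the period is an assumption that is not stated in the question.

```php
<?php
// Hypothetical helper: returns true when a line should be kept, i.e. it
// contains at least one of the keywords (case-insensitive) or ends with a period.
function shouldKeepLine(string $line, array $words): bool
{
    foreach ($words as $word) {
        if (stripos($line, $word) !== false) {
            return true;
        }
    }
    // Assumption: trailing whitespace is ignored when looking for the final period.
    return substr(rtrim($line), -1) === '.';
}

// Hypothetical helper: drops every line of $content that fails shouldKeepLine()
// and joins the remaining lines with "\n".
function filterContent(string $content, array $words): string
{
    $lines = preg_split('/\r\n|\r|\n/', $content);
    $kept = array_filter($lines, function (string $line) use ($words): bool {
        return shouldKeepLine($line, $words);
    });
    return implode("\n", $kept);
}

// Usage sketch: $savePath and the keyword list are assumed to be defined by the caller.
// foreach (glob($savePath . '*.txt') as $file) {
//     $content = file_get_contents($file);
//     if ($content !== false) {
//         file_put_contents($file, filterContent($content, ['word1', 'word2', 'word3']));
//     }
// }
```

The file loop is left commented out because it overwrites the original files; writing to a separate output folder first may be safer while testing. To replace the unwanted lines with blanks instead of removing them, `array_map` could return an empty string for each rejected line in place of the `array_filter` call.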