
I have some text files, for example file1.txt and file2.txt.

file1.txt contains: Walk word1 in the rain Walking in the rain is one of the most beautiful word2 experiences.

There are some conditions:

  1. If both word1 AND word2 are present, I want the text between the two words as $between, so I would get: in the rain Walking in the rain is one of the most beautiful. I also want the text after word2 as $content, so I would get: experiences
  2. If only word1 OR word2 is present (e.g. Walk in the rain Walking in the rain is one of the most beautiful word1 experiences.), then $between = '' and $content is the whole text -> Walk in the rain Walking in the rain is one of the most beautiful word1 experiences.
  3. If word2 comes before word1, for example: Walk in word2 the rain Walking in the rain is one of the most word1 beautiful word1 experiences. then $between = '' and $content is the whole text.

Here is my code:

//to get and open the text files
$txt = glob($savePath.'*.txt');
foreach ($txt as $file => $files) {
    $handle = fopen($files, "r") or die ('can not open file');
    $ori_content = file_get_contents($files);

//count the words of text, to reach until the last word
$words = preg_split('/\s+/',$ori_content ,-1,PREG_SPLIT_NO_EMPTY);
$count = count ($words);

$word1 ='word1';
$word2 ='word2';
    if (stripos($ori_content, $word1) && stripos($ori_content, $word2)){
        $between  = substr($ori_content, stripos($ori_content, $word1)+ strlen($word1), stripos($ori_content, $word2) - stripos($ori_content, $word1)- strlen($word1));
        $content  = substr($ori_content, stripos($ori_content, $word2)+strlen($word2), stripos($ori_content, $ori_content[$count+1])  - stripos($ori_content,$word2));
    }
    else 
    $content = $ori_content;

$q0 = mysql_query("INSERT INTO tb VALUES('','$files','$content','$between')") or die(mysql_error());

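But my code still can't handle the following:

  1. Condition 2 (above): I get $between = experiences, but it should be $between = ''
  2. Condition 3 (above): I get $between = the rain Walking in the rain is one of the most word1 beautiful word1 experiences, but it should be $between = ''
  3. If I get a $between for file1.txt but not for file2.txt, then the between column in the database row for file2.txt should be empty. It is not empty; it is filled with the $between value from another text file.
  4. I can't reach the last word.

Please help me. Thanks in advance! :)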


2 Answers


I think you are just missing one statement:

...
}
else {
    $between = '';
    $content = $ori_content;
}

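You are probably running this in a loop, so if you don't explicitly set $between to an empty string, you will get the value from the previous iteration :)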
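
To make the stale-value effect concrete, here is a minimal sketch. The sample strings and the helper name collect_between are made up for illustration; only the reset in the else branch comes from the answer above.

```php
<?php
// Hypothetical helper: returns the $between value recorded for each input text.
// $resetEachTime toggles whether $between is cleared when no match is found.
function collect_between(array $texts, bool $resetEachTime): array
{
    $results = [];
    $between = '';
    foreach ($texts as $text) {
        $pos1 = stripos($text, 'word1');
        $pos2 = stripos($text, 'word2');
        if ($pos1 !== false && $pos2 !== false && $pos1 < $pos2) {
            $start   = $pos1 + strlen('word1');
            $between = trim(substr($text, $start, $pos2 - $start));
        } elseif ($resetEachTime) {
            $between = ''; // the missing statement
        }
        $results[] = $between;
    }
    return $results;
}

$texts = ['Walk word1 in the rain word2 today.', 'No markers here.'];
var_dump(collect_between($texts, false)); // second entry repeats the first text's value
var_dump(collect_between($texts, true));  // second entry is an empty string
```

Without the reset, the second text inherits "in the rain" from the first one, which is the same symptom as the wrong rows in the database.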

Edit

You also forgot to compare the positions:

if (stripos($ori_content, $word1) && stripos($ori_content, $word2)){

should be:

$pos1 = stripos($ori_content, $word1);
$pos2 = stripos($ori_content, $word2);
if (false !== $pos1 && false !== $pos2 && $pos1 < $pos2) {
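
Putting the position check together with the reset, the whole split could be sketched roughly as follows. The function name split_by_markers is hypothetical, and the sketch assumes the two markers do not overlap and that only the first occurrence of each matters.

```php
<?php
// Hypothetical helper name; returns [$between, $content] for one text.
function split_by_markers(string $text, string $word1, string $word2): array
{
    $pos1 = stripos($text, $word1);
    $pos2 = stripos($text, $word2);

    // stripos() returns int 0 for a match at the very start, which is falsy,
    // so compare against false explicitly instead of relying on truthiness.
    if ($pos1 === false || $pos2 === false || $pos1 >= $pos2) {
        return ['', $text]; // conditions 2 and 3: empty $between, whole text as $content
    }

    $start   = $pos1 + strlen($word1);
    $between = trim(substr($text, $start, $pos2 - $start));
    // Omitting the length argument makes substr() read to the end of the string.
    $content = trim(substr($text, $pos2 + strlen($word2)));

    return [$between, $content];
}

[$between, $content] = split_by_markers(
    'Walk word1 in the rain Walking in the rain is one of the most beautiful word2 experiences.',
    'word1',
    'word2'
);
echo $between, "\n"; // in the rain Walking in the rain is one of the most beautiful
echo $content, "\n"; // experiences.
```

Leaving out the third argument of substr() is also one way to read up to the last word, so no word count is needed.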

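Edit 2

Another thing: your SQL is vulnerable to injection, and you can't properly use a NULL value this way. You could use the following construct, but it would be better to use PDO or mysqli.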

$sql_between = is_null($between) ? 'NULL' : "'" . mysql_real_escape_string($between) . "'";
// apply the same treatment for `$files`, etc.
...
mysql_query("INSERT INTO tb VALUES('', $sql_files, $sql_content, $sql_between)");

This way you can set $between to null and have it sent to MySQL correctly.
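
For the PDO route mentioned above, a prepared statement might look like the sketch below. The column names file, content and between_text and the function name insert_row are assumptions; the post only shows a four-column table named tb whose first column is left empty.

```php
<?php
// Hypothetical column names (file, content, between_text); adjust to the real schema.
function insert_row(PDO $pdo, string $file, string $content, ?string $between): void
{
    $stmt = $pdo->prepare(
        'INSERT INTO tb (file, content, between_text) VALUES (:file, :content, :between)'
    );
    $stmt->bindValue(':file', $file);
    $stmt->bindValue(':content', $content);
    // Bind a real SQL NULL when $between is null, otherwise bind it as a string.
    $stmt->bindValue(
        ':between',
        $between,
        $between === null ? PDO::PARAM_NULL : PDO::PARAM_STR
    );
    $stmt->execute();
}
```

Inside the loop this would be called as insert_row($pdo, $files, $content, $between), where $pdo is an already opened PDO connection. Because the values are bound as parameters, no manual escaping or quoting is needed.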

answered 2012-09-25T09:47:27.967


I have wrapped the parser logic into a function, parse_content:

$txt = glob($savePath.'*.txt');
foreach ($txt as $file => $files) {
    $handle = fopen($files, "r") or die ('can not open file');
    $ori_content = file_get_contents($files);
    $word1 ='word1';
    $word2 ='word2';

    $result = parse_content($word1, $word2, $ori_content);
    extract($result);

    $q0 = mysql_query("INSERT INTO tb VALUES('','$files','$content','$between')") or die(mysql_error());

}


function parse_content($word1, $word2, $input) {
    $between = '';
    $content = '';

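    // stripos() returns false when the word is absent and 0 when it is at the very start
    $w1 = stripos($input, $word1);
    $w2 = stripos($input, $word2);

    if($w1 !== false && $w2 !== false) {
        if($w2 < $w1) {
            // Case 3
            $content = $input;
        } else {
            // Case 1
            // preg_quote() escapes regex metacharacters; "i" matches stripos(), "s" lets "." span newlines
            $reg_between = '/' . preg_quote($word1, '/') . '(.*?)' . preg_quote($word2, '/') . '/is';
            $reg_content = '/' . preg_quote($word2, '/') . '(.*)$/is';

            preg_match($reg_between, $input, $match);
            $between = trim($match[1]);
            preg_match($reg_content, $input, $match);
            $content = trim($match[1]);
        }
    } else if($w1 !== false || $w2 !== false) {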
        // Case 2
        $content = $input;
    } else {
        // Case 4
        $content = $input;
    }

    return compact('between', 'content');
}
answered 2012-09-25T09:49:26.460