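I have a text made up of several sentences. I need to split it into sentences at the periods and count the words in each sentence. Sentences with more than 5 words should be inserted into the database. Here is my code: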

<?php

require_once 'conf/conf.php';// connect to database

function saveContent ($text) {
  // I need to get every sentence without losing the period
  $text1 = str_replace('.', ".dot", $text);
  $text2 = explode ('dot',$text1); 

  // Text that contains ' cannot be inserted into the database, so I need to remove it
  $text3 = str_replace("'", "", $text2); 

  // Select only the sentences that consist of more than 5 words
  for ($i=0;$i<count($text3);$i++){
    if(count(explode(" ", $text3[$i]))>5){
      $save = $text3[$i];

      $q0 = mysql_query("INSERT INTO tbdocument VALUES('','$files','".$save."','','','') ");
    }
  }
}

$text= "I have some text files in my folder. I get them from extraction process of pdf journals files into txt files. here's my code";
$a = saveContent($text);

?>

As a result, only 1 sentence (the first one) gets inserted into the database. I need your help, thank you very much :)


1 Answer


There are many ways to improve this (and make it work correctly).

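Rather than replacing "." with ".dot", you can simply explode on "." and remember to add it back later. But what if your sentence is something like "Mr. Smith went to Washington."? You cannot distinguish those periods in any very reliable way.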
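As a rough illustration of that limitation, here is a minimal sketch that splits after each period using preg_split() with a lookbehind, so every sentence keeps its own period. The function name splitSentences is made up for this example and is not part of the original code.

```php
<?php
// Hypothetical helper: split after a period followed by whitespace,
// so each sentence keeps its trailing period.
function splitSentences(string $text): array {
    $parts = preg_split('/(?<=\.)\s+/', trim($text), -1, PREG_SPLIT_NO_EMPTY);
    return $parts === false ? [] : $parts;
}

// Ordinary text splits as expected: two sentences, each still ending in "."
$plain = splitSentences("I have some text files. I get them from pdf journals.");

// The period after "Mr" is also treated as a sentence end, so this comes
// back as two pieces ("Mr." and "Smith went to Washington.") instead of one.
$tricky = splitSentences("Mr. Smith went to Washington.");

print_r($plain);
print_r($tricky);
```

Handling abbreviations properly would need a list of known abbreviations or a real sentence tokenizer; a single regular expression will not get there.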

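The variable $files in your INSERT is not defined in the scope of this function. We don't know where it comes from or what you expect it to contain, but here it will be NULL.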

function saveContent ($text) {
  // Just explode on the . and replace it later...
  $sentences = explode(".", $text);

  // Don't remove single quotes. They'll be properly escaped later...

  // Rather than an incremental loop, use a proper foreach loop:
  foreach ($sentences as $sentence) {
    // Trim first, then use preg_split() instead of explode() in case there are multiple spaces in sequence
    $sentence = trim($sentence);
    if (count(preg_split('/\s+/', $sentence, -1, PREG_SPLIT_NO_EMPTY)) > 5) {
      // Add the . back onto it, then escape the result for the query
      $save = mysql_real_escape_string($sentence . ".");

      // $files is not defined in scope of this function!
      $q = mysql_query("INSERT INTO tbdocument VALUES('', '$files', '$save', '', '', '')");
      // Don't forget to check for errors.
      if (!$q) {
        echo mysql_error();
      }
    }
  }
}
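
One possible way to deal with the $files scope problem is to pass the value in as a parameter. The sketch below only collects the rows that would be inserted and leaves the database out, so the filtering can be checked on its own. The function name sentencesToSave, the array keys, and the file name journal1.txt are all invented for illustration.

```php
<?php
// Hypothetical helper: returns the rows that would be inserted, without touching a database.
// $files is passed in explicitly instead of being read from an undefined outer variable.
function sentencesToSave(string $text, string $files, int $minWords = 5): array {
    $rows = [];
    foreach (explode('.', $text) as $sentence) {
        $sentence = trim($sentence);
        // Split on runs of whitespace and drop empty pieces before counting words
        $words = preg_split('/\s+/', $sentence, -1, PREG_SPLIT_NO_EMPTY);
        if (count($words) > $minWords) {
            // Put the period back that explode() removed
            $rows[] = ['file' => $files, 'sentence' => $sentence . '.'];
        }
    }
    return $rows;
}

$text = "I have some text files in my folder. I get them from extraction process of pdf journals files into txt files. here's my code";
$rows = sentencesToSave($text, 'journal1.txt'); // 'journal1.txt' is a placeholder value
print_r($rows);
```

With the sample text from the question, the first two sentences pass the word-count check and the short third one is skipped.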

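In the long run, consider moving away from the mysql_*() functions and start learning an API that supports prepared statements, such as PDO or MySQLi. The old mysql_*() functions will soon be deprecated, and they lack the security offered by prepared statements.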
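
To give an idea of what that looks like, here is a minimal PDO sketch. It makes several assumptions: an in-memory SQLite database stands in for the real MySQL server so the example is self-contained, the three-column layout of tbdocument is invented (the real table's columns are not shown in the question), and journal1.txt is a placeholder value.

```php
<?php
// Assumption: SQLite in memory replaces the real MySQL connection for this sketch.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Assumption: invented table layout, not the real tbdocument schema.
$pdo->exec('CREATE TABLE tbdocument (id INTEGER PRIMARY KEY, file TEXT, sentence TEXT)');

// Prepare once; values are bound at execute time, so single quotes
// in the text need no manual escaping or removal.
$stmt = $pdo->prepare('INSERT INTO tbdocument (file, sentence) VALUES (:file, :sentence)');

$sentences = [
    "here's a sentence with more than five words.",
    "I get them from the extraction process.",
];
foreach ($sentences as $sentence) {
    $stmt->execute([':file' => 'journal1.txt', ':sentence' => $sentence]);
}

$count  = (int) $pdo->query('SELECT COUNT(*) FROM tbdocument')->fetchColumn();
$stored = $pdo->query('SELECT sentence FROM tbdocument WHERE id = 1')->fetchColumn();
echo $count, "\n", $stored, "\n";
```

Against MySQL, the main change would be the DSN and credentials passed to new PDO; the prepare and execute calls stay the same.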

Answered 2012-06-27T03:05:10.887