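I'm trying to clean special characters out of some junk data (while allowing a few), but some still get through. I found a regex snippet earlier, but it doesn't remove certain characters, such as asterisks.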

  $clean_body = $raw_text;

  $clean_title = preg_replace("/[^!&\/A-Za-z0-9_ ]/","", $clean_body);
  $clean_title = substr($clean_title, 0, 64);

  $clean_body = nl2br($clean_body);  

  if ($nid) {
    $node = node_load($nid);
    unset($node->field_category);
  } else {
    $node = new stdClass();
    $node->type = 'article';
    node_object_prepare($node); 
  }

  $split_title = str_split($clean_title);

  foreach ($split_title as $key => $character) {
    if ($key > 15) {
      if ($character == ' ' && !preg_match("/[^!&\/,.-]/", $split_title[$key - 1])) {
        $node->title = html_entity_decode(substr(strip_tags($clean_title), 0, $key - 1)) . '...';
      }
    }
  }

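The first part tries to strip anything from the raw text that isn't normal punctuation or alphanumeric. Then I split the title into an array and look for a space. What I want is a title that is at least 15 characters long and is truncated at a space (keeping whole words intact) without stopping on punctuation. This is the part I'm having trouble with.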

For example, some titles still come out as ***************** or ** HOW TO MAKE $$$$$$ BLOGGING **, when the first shouldn't even have any *'s and the latter should be HOW TO MAKE...

2 Answers

Your problem (or, one of them anyhow) is this logic:

if ($key > 15) {
  if ($character == ' ' && !preg_match("/[^!&\/,.-]/", $split_title[$key - 1])) {
    $node->title = html_entity_decode(substr(strip_tags($clean_title), 0, $key - 1)) . '...';
  }
}

You're only setting $node->title if these conditions match when iterating the characters in the $split_title array.

What happens when they don't match? $node->title doesn't get set (or overwritten? You didn't give much context, so I can't tell).

Using this as a test:

$clean_body = '** HOW TO MAKE $$$$$$ BLOGGING **';

You can see that these conditions do not match, so $node->title does not get set (or overwritten).
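One way to avoid the "title never gets set" case is to compute the title in a single helper that always returns something. The following is only a sketch under assumptions: `make_title()` and its `$min_length` / `$max_length` parameters are hypothetical names, not part of your code, and collapsing repeated spaces is an extra step I added so that stripped characters don't leave gaps.

```php
<?php
// Hypothetical helper (not from the original code): builds a title of at
// least $min_length characters, cut at a space so words stay whole.
function make_title($raw_text, $min_length = 15, $max_length = 64) {
  // Keep only the characters the original pattern allows.
  $clean = preg_replace('/[^!&\/A-Za-z0-9_ ]/', '', $raw_text);
  // Assumption: collapse runs of whitespace left behind by removed characters.
  $clean = trim(preg_replace('/\s+/', ' ', $clean));
  $clean = substr($clean, 0, $max_length);

  // Look for the first space at or after $min_length characters.
  // The length check keeps the strpos() offset inside the string.
  $cut = (strlen($clean) > $min_length) ? strpos($clean, ' ', $min_length) : false;

  // Fallback: no suitable space was found, so return the whole cleaned string.
  if ($cut === false) {
    return $clean;
  }

  return substr($clean, 0, $cut) . '...';
}
```

With the test string above, `make_title('** HOW TO MAKE $$$$$$ BLOGGING **')` gives `HOW TO MAKE BLOGGING`: there is no space after position 15, so the fallback returns the full cleaned string. You could then assign the result to `$node->title` unconditionally.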

answered 2011-09-13T17:04:09.567

How about "/[^!&\/\w\s]/ui"? It works fine on my machine.
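A quick way to try that pattern, as a rough sketch: the input string is the sample from the question, and the NULL check and whitespace collapsing are my own additions. Note that with the `u` modifier the subject must be valid UTF-8, and `\w` / `\s` can match more than the ASCII-only class in the question (for example, non-ASCII letters, tabs and newlines).

```php
<?php
$raw_text = '** HOW TO MAKE $$$$$$ BLOGGING **';

// Remove everything except !, &, /, word characters and whitespace.
$clean_title = preg_replace('/[^!&\/\w\s]/ui', '', $raw_text);

// With the /u modifier, preg_replace() returns NULL on invalid UTF-8 input.
if ($clean_title === null) {
  $clean_title = '';
}

// Assumption: collapse the gaps left where characters were stripped.
$clean_title = trim(preg_replace('/\s+/', ' ', $clean_title));
```

For this input, `$clean_title` ends up as `HOW TO MAKE BLOGGING`.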

answered 2011-09-13T16:41:20.430