I want to parse some text. It contains an odd sentence, such as B R I E F I N G S I N B I O I N F O R M A T I C S, and I want to skip that sentence. Here is my code:

<?php
$text = 'B R I E F I N G S I N B I O I N F O R M A T I C S. Because many biomedical entities have multiple names and abbreviations, it would be advantageous to have an automated means to collect these synonyms and abbreviations to aid users doing literature searches.';

$reg = '/(?<=[.!?]|[.!?][\'"])\s+/';
foreach(preg_split($reg, $text, -1, PREG_SPLIT_NO_EMPTY) as $sentence){
    foreach(preg_split('/\s+/', $sentence) as $words){
       if (count(strlen($words)>1)){
        //I don't know what to do
    }
    }
}
?>

However, it is still wrong. How can I recognize a sentence that follows a pattern like B R I E F I N G S I N B I O I N F O R M A T I C S? Thank you.

3 Answers

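How about this? It works when every word in the sentence has a length of 1.

<?php
$text = 'B R I E F I N G S I N B I O I N F O R M A T I C S. Because many biomedical entities have multiple names and abbreviations, it would be advantageous to have an automated means to collect these synonyms and abbreviations to aid users doing literature searches.';

// Split into sentences after ., ! or ? (optionally followed by a quote).
$reg = '/(?<=[.!?]|[.!?][\'"])\s+/';
foreach (preg_split($reg, $text, -1, PREG_SPLIT_NO_EMPTY) as $sentence) {
    $isStrange = true; // assume the sentence is strange until a longer word turns up
    foreach (preg_split('/\s+/', $sentence, -1, PREG_SPLIT_NO_EMPTY) as $word) {
        // Drop trailing punctuation so that the final "S." counts as one letter.
        if (strlen(rtrim($word, '.!?"\'')) > 1) {
            $isStrange = false;
            break;
        }
    }
    // Report once per sentence, after all of its words have been checked.
    if ($isStrange) {
        echo $sentence . ' is very strange!' . PHP_EOL;
    }
}
?>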
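Since the goal in the question is to skip the odd sentence rather than report it, the same check could be wrapped in a filter. The following is only a sketch: the function names `isSpacedOutSentence` and `skipSpacedOutSentences` are made up for illustration, and it assumes that a "strange" sentence is one with at least two tokens, all of them single characters once trailing punctuation is stripped.

```php
<?php
// Hypothetical helper: true when every whitespace-separated token in
// $sentence is a single character after trailing punctuation is removed.
function isSpacedOutSentence(string $sentence): bool
{
    $tokens = preg_split('/\s+/', trim($sentence), -1, PREG_SPLIT_NO_EMPTY);
    if (count($tokens) < 2) {
        return false; // assumption: a lone token such as "A." is not spaced-out text
    }
    foreach ($tokens as $token) {
        if (strlen(rtrim($token, '.!?"\'')) > 1) {
            return false; // found a normal word
        }
    }
    return true;
}

// Hypothetical helper: split $text into sentences and drop the spaced-out ones.
function skipSpacedOutSentences(string $text): array
{
    $sentences = preg_split('/(?<=[.!?]|[.!?][\'"])\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);
    return array_values(array_filter($sentences, function ($s) {
        return !isSpacedOutSentence($s);
    }));
}

$text = 'B R I E F I N G S. Because many entities have multiple names, synonyms help.';
print_r(skipSpacedOutSentences($text));
```

With this input, only the second sentence should remain in the returned array. Note that the check would also flag a short real sentence made entirely of one-letter words, so it is a heuristic rather than a precise rule.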
answered 2012-11-01T09:51:10.787
This will work as long as the string is exactly the same every time:

<?php
$text = 'B R I E F I N G S I N B I O I N F O R M A T I C S. Because many biomedical entities have multiple names and abbreviations, it would be advantageous to have an automated means to collect these synonyms and abbreviations to aid users doing literature searches.';

$text = str_replace("B R I E F I N G S I N B I O I N F O R M A T I C S. ","",$text); // <--- added this

$reg = '/(?<=[.!?]|[.!?][\'"])\s+/';
foreach(preg_split($reg, $text, -1, PREG_SPLIT_NO_EMPTY) as $sentence){
    foreach (preg_split('/\s+/', $sentence) as $word) {
        // strlen() already returns an integer, so it must not be wrapped in count()
        if (strlen($word) > 1) {
            // handle the words of the remaining sentences here
        }
    }
}
?>
answered 2012-11-01T09:51:43.297
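Judging from the sentence you showed, I would remove a sentence at the start of your text that consists only of capital letters separated by whitespace: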

echo preg_replace('/^[A-Z](?:\s[A-Z])+\./', '', $text);
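
The one-liner above leaves the space that followed the period at the start of the result, and it expects exactly one whitespace character between letters. A slightly more tolerant variant might look like the sketch below. The function name `stripSpacedHeading` is hypothetical, and the pattern still assumes that the odd sentence sits at the very beginning of the text and ends with a period.

```php
<?php
// Hypothetical helper, not part of the answer above: strips a leading run of
// single capital letters separated by whitespace and ended by a period.
function stripSpacedHeading(string $text): string
{
    // ^              start of the text
    // [A-Z]          the first capital letter
    // (?:\s+[A-Z])+  one or more further capitals, each preceded by whitespace
    // \.\s*          the closing period and any whitespace after it
    $result = preg_replace('/^[A-Z](?:\s+[A-Z])+\.\s*/', '', $text);

    // preg_replace() returns null on a regex error; fall back to the input.
    return $result ?? $text;
}

echo stripSpacedHeading('B R I E F I N G S. Because many biomedical entities have multiple names.');
// prints "Because many biomedical entities have multiple names."
```

Text that merely begins with a one-letter word, such as "A Big change.", is left untouched, because the pattern requires a period directly after the last single capital.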
answered 2012-11-01T10:00:21.450