1

I have a bunch of strings in my database that look like this:

driving home from work. The dog leaped of the sofa to great his master at the door. He licked his face clean.

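The string starts in the middle of a sentence. I'd like a way to cut off the incomplete leading sentence, so that the string starts at "The dog leaped of the sofa to great his master at the door. He licked his face clean."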

How would I do that?

4 Answers

1

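The problem is how to define an incomplete sentence. We could assume that every sentence starting with an uppercase character is a complete sentence. If so, the code could look something like this: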

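str = 'driving home from work. The dog leaped of the sofa to great his master at the door. He licked his face clean.'
# Split on periods; split drops the empty string after the final period.
sentences = str.split('.')
# Drop the first piece if its first character is not uppercase.
sentences.shift if sentences[0][0].downcase == sentences[0][0]
# Rejoin the remaining sentences and restore the final period.
sentences.join('.').strip << '.'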

A bit hacky, but it works.
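The same assumption can be wrapped in a small method that also copes with an empty string or a string that contains no period. This is only a sketch: the name `drop_leading_fragment` is made up for illustration, and "starts with an uppercase letter" is still the only test for a complete sentence.

```ruby
# Hypothetical helper (not from the original answer): drops a leading
# sentence fragment, assuming complete sentences start with an uppercase letter.
def drop_leading_fragment(text)
  stripped = text.strip
  # Already starts with an uppercase letter: treat it as complete.
  return stripped if stripped =~ /\A[[:upper:]]/

  # Split once at the first period; the part before it is the fragment.
  # With no period at all, the whole string counts as a fragment.
  _fragment, rest = stripped.split('.', 2)
  rest.to_s.lstrip
end

strings = [
  'driving home from work. The dog leaped of the sofa to great his master at the door. He licked his face clean.',
  'The dog barked.'
]
cleaned = strings.map { |s| drop_leading_fragment(s) }
```

Here `cleaned[0]` starts at "The dog leaped ..." and `cleaned[1]` is left untouched. Splitting with a limit of 2 keeps the rest of the text intact, so the final period does not have to be re-appended.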

answered 2012-09-18T07:36:39.893
1

The simplest answer:

str = 'driving home from work. The dog leaped of the sofa to great his master at the door. He licked his face clean.'
# Non-bang methods are used because sub! and strip! return nil when nothing changes.
str = str.sub(/\A[^A-Z].*?\./, '').strip
answered 2012-09-18T07:43:39.750
0

https://github.com/ged/linkparser

This might help.

answered 2012-09-18T07:42:02.403
0

Maybe something like this?

str = "driving home from work. The dog leaped of the sofa to great his master at the door. He licked his face clean."
# str[0] is used here because String#first is only available with ActiveSupport (Rails).
str[0] == str[0].upcase ? str : str.split(".")[1..-1].join(".").lstrip << "."

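This assumes that an uppercase first letter marks the beginning of a sentence; without that assumption it isn't possible. There are other cases to consider: what if the string starts with a number? For example, "1 dog ran off." Is the sentence "dog ..." or "1 dog ..."?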
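One way to handle the number case is to make explicit which characters may open a sentence. The sketch below is an assumption rather than part of the answer above: the method name `trim_to_sentence_start` and the `starters` parameter are invented for illustration, and by default a leading digit counts as a valid sentence start.

```ruby
# Hypothetical helper: keeps the string as-is when its first character is an
# allowed sentence starter, otherwise cuts everything up to and including the
# first period. Uppercase letters and digits are starters by default (assumption).
# A string with no period is returned unchanged.
def trim_to_sentence_start(text, starters: /[[:upper:][:digit:]]/)
  stripped = text.strip
  return stripped if stripped =~ /\A#{starters}/

  stripped.sub(/\A[^.]*\.\s*/, '')
end

trim_to_sentence_start('1 dog ran off. He came back.')
# => "1 dog ran off. He came back."

trim_to_sentence_start('1 dog ran off. He came back.', starters: /[[:upper:]]/)
# => "He came back."
```

Passing the starter set as a parameter leaves the decision about digits (or quotes, or brackets) to the caller instead of hard-coding it.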

answered 2012-09-18T07:42:13.790