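This is a quick suggestion. There may be better ways, but I like this one.
First, be sure you "know" what a word is made of. Let's assume it is made of letters only. Everything else, punctuation or "blanks", can be treated as a separator.
Then your "system" has two states: 1) consuming a word, 2) skipping separators.
You start with a free run of the skip-separators code. Once you hit a character that can belong to a word, you enter the "consuming a word" state and stay in it until the next separator or the end of the whole string (in which case you exit). When that happens you have completed a word, so you increment your word counter by 1 and go back to the "skipping separators" state. And the loop continues.
C-like pseudocode: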
    char *str;
    /* someone will assign str correctly */
    int word_count = 0;
    enum { SKIPPING, CONSUMING } state = SKIPPING;
    char c;
    /* c is reloaded from *str on every iteration */
    for (; (c = *str) != '\0'; str++)
    {
        if (state == SKIPPING && can_be_part_of_a_word(c)) {
            state = CONSUMING;
            /* if you need to accumulate the letters,
               here you have to push c somewhere */
        }
        else if (state == SKIPPING) continue; // unneeded - just to show the logic
        else if (state == CONSUMING && can_be_part_of_a_word(c)) {
            /* continue accumulating, pushing c somewhere;
               if you don't need that, this else if is just a placeholder */
        }
        else if (state == CONSUMING) {
            /* separator found while consuming a word:
               the word ended. If you accumulated chars, you can ship
               them out as "the word" */
            word_count++;
            state = SKIPPING;
        }
    }
    // if the state on exit is CONSUMING you need to increment word_count:
    // you can rearrange things to avoid this when the loop ends,
    // if you don't like it
    if (state == CONSUMING) { word_count++; /* plus ship out last word */ }
The function can_be_part_of_a_word returns true if the character read is, for example, in [A-Za-z_], and false otherwise.
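To make the idea concrete, here is a minimal sketch of the predicate and of the two-state loop wrapped in a function. It is only one possible reading of the approach: the name count_words and the enum tag are made up for illustration, and the predicate assumes the default "C" locale, where isalpha accepts only A-Z and a-z.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Assumption: a word character is a letter or an underscore, i.e. [A-Za-z_]. */
static int can_be_part_of_a_word(char c)
{
    /* isalpha needs a value representable as unsigned char */
    return isalpha((unsigned char)c) || c == '_';
}

enum state { SKIPPING, CONSUMING };

/* count_words is a hypothetical wrapper name, not part of the original answer. */
size_t count_words(const char *str)
{
    size_t word_count = 0;
    enum state state = SKIPPING;

    assert(str != NULL);
    for (; *str != '\0'; str++) {
        int in_word = can_be_part_of_a_word(*str);
        if (state == SKIPPING && in_word) {
            state = CONSUMING;      /* a word starts here */
        } else if (state == CONSUMING && !in_word) {
            word_count++;           /* a separator ends the word */
            state = SKIPPING;
        }
    }
    if (state == CONSUMING)
        word_count++;               /* the string ended inside a word */
    return word_count;
}
```

With this predicate, digits count as separators, so a string such as "three4four" is seen as two words.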
(Unless tiredness has led me into some gross mistake, it should work.)
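If you don't like the extra check after the loop, one possible rearrangement is to count a word when you enter it rather than when you leave it. The sketch below assumes the same [A-Za-z_] rule (in the default "C" locale); the function name count_words_on_entry is hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Counts words on the SKIPPING -> CONSUMING transition, so a word that
   runs up to the end of the string needs no special handling. */
size_t count_words_on_entry(const char *str)
{
    size_t word_count = 0;
    int consuming = 0;              /* 0 = skipping separators, 1 = inside a word */

    assert(str != NULL);
    for (; *str != '\0'; str++) {
        /* Assumption: word characters are letters and underscore */
        int in_word = isalpha((unsigned char)*str) || *str == '_';
        if (in_word && !consuming)
            word_count++;           /* first character of a new word */
        consuming = in_word;
    }
    return word_count;
}
```

The trade-off is that the count goes up before the word is complete, so if you also accumulate the letters you still have to ship out the last word once the loop ends.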