
You need to match diacritic marks after base letters using \p{M}*:

'~\b(?<!\p{M})\p{L}\p{M}*\.~u'

The pattern matches

  • \b - a word boundary
  • (?<!\p{M}) - the char before the current position must not be a diacritic char (without it, a match can occur within a single word)
  • \p{L} - any base Unicode letter
  • \p{M}* - zero or more diacritic marks (the demo uses the possessive form \p{M}*+, which avoids backtracking into the marks)
  • \. - a dot.

See the PHP demo:

$s = "क. ಕ. के. ಕೆ. ";
echo preg_replace('~\b(?<!\p{M})\p{L}\p{M}*+\.~u', '<pre>$0</pre>', $s); 
// => <pre>क.</pre> <pre>ಕ.</pre> <pre>के.</pre> <pre>ಕೆ.</pre> 
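
To see what the lookbehind guards against, here is a small sketch that runs the pattern with and without it on a string containing a full word. The sample string and variable names are made up for illustration. Only the result for the pattern with the lookbehind is stated. Whether the variant without it also picks up the tail of the full word depends on how the PCRE build behind your PHP version treats combining marks for \b, so print it and compare.

```php
<?php
// Hypothetical sample: a full word ("केक") followed by two single-letter abbreviations.
$s = "केक. क. के.";

// Pattern from the answer, and the same pattern with the lookbehind removed.
$withLookbehind    = '~\b(?<!\p{M})\p{L}\p{M}*+\.~u';
$withoutLookbehind = '~\b\p{L}\p{M}*+\.~u';

preg_match_all($withLookbehind, $s, $m1);
preg_match_all($withoutLookbehind, $s, $m2);

// With the lookbehind, only the two standalone abbreviations are found.
print_r($m1[0]);

// Without it, the final "क." of "केक." may be reported as well
// (assumption: this depends on the PCRE version in use).
print_r($m2[0]);
```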
Answered 2019-01-14T09:39:00.730