You need to match the diacritic marks that follow each base letter, using \p{M}*:
'~\b(?<!\p{M})\p{L}\p{M}*\.~u'
The pattern matches:
\b - a word boundary
(?<!\p{M}) - the char before the current position must not be a diacritic mark (without it, a match can occur inside a single word)
\p{L} - any base Unicode letter
\p{M}* - 0+ diacritic marks
\. - a dot.
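To see why the lookbehind matters, here is a minimal sketch that runs the pattern with and without (?<!\p{M}) on one short word. The sample word and the variable names are my own illustration, not part of the original demo. It assumes PHP 7+ (for the \u{...} escapes) and that the u flag makes \b Unicode-aware, which is the usual case in PHP.

```php
<?php
// Hypothetical sample word: KA (U+0915, a letter), vowel sign AA (U+093E,
// a spacing mark), KA again, then a dot. It is one word, not an initial.
$word = "\u{0915}\u{093E}\u{0915}.";

// The pattern from the answer, and the same pattern without the lookbehind.
$guarded   = '~\b(?<!\p{M})\p{L}\p{M}*\.~u';
$unguarded = '~\b\p{L}\p{M}*\.~u';

// preg_match_all() returns the number of full-pattern matches.
$withGuard    = preg_match_all($guarded, $word, $m1);
$withoutGuard = preg_match_all($unguarded, $word, $m2);

echo $withGuard, "\n";    // 0: the second KA is preceded by a mark, so it is rejected
echo $withoutGuard, "\n"; // 1: \b fires between the mark and the second KA
```

The mark is not a word character, so \b sees a boundary in the middle of the word. The lookbehind is what stops the trailing letter plus dot from being treated as a standalone initial.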
See the PHP demo online:
$s = "क. ಕ. के. ಕೆ. ";
echo preg_replace('~\b(?<!\p{M})\p{L}\p{M}*+\.~u', '<pre>$0</pre>', $s);
// => <pre>क.</pre> <pre>ಕ.</pre> <pre>के.</pre> <pre>ಕೆ.</pre>
Answered 2019-01-14T09:39:00.730