3 Answers
To do it without a lookup table, you could take advantage of Unicode normalisation.
If you normalise a letter that has a diacritical mark (including Japanese voiced marks) to Normalization Form D (NFD), you'll get a decomposed base letter followed by a combining diacritical mark. Just take the first of those characters and you've got what you want.
name.Normalize(NormalizationForm.FormD).Substring(0, 1)
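To make the decomposition concrete, here is a minimal VB.NET sketch of the one-liner above. The module and function names (`KanaNormalization`, `BaseLetter`, `HasCombiningMark`) are made up for illustration, and the code assumes the input is a single non-empty, precomposed letter.

```vb
Imports System
Imports System.Text

Module KanaNormalization
    ' Returns the first UTF-16 code unit of the NFD form, i.e. the base
    ' letter without its combining mark. Assumes a non-empty input;
    ' Substring(0, 1) throws on an empty string.
    Function BaseLetter(letter As String) As String
        Return letter.Normalize(NormalizationForm.FormD).Substring(0, 1)
    End Function

    ' True when NFD yields more than one code unit, which for a single
    ' precomposed letter means it carried a combining mark.
    Function HasCombiningMark(letter As String) As Boolean
        Return letter.Normalize(NormalizationForm.FormD).Length > 1
    End Function

    Sub Main()
        Dim ga As String = ChrW(&H30AC)   ' katakana GA (voiced)
        Dim ka As String = ChrW(&H30AB)   ' katakana KA (plain)
        Dim pa As String = ChrW(&H3071)   ' hiragana PA (semi-voiced)
        Dim ha As String = ChrW(&H306F)   ' hiragana HA (plain)

        Console.WriteLine(BaseLetter(ga) = ka)     ' True
        Console.WriteLine(BaseLetter(pa) = ha)     ' True
        Console.WriteLine(HasCombiningMark(ka))    ' False
        Console.WriteLine(HasCombiningMark(ga))    ' True
    End Sub
End Module
```

Because normalisation decomposes hiragana as well as katakana, no prior katakana conversion is needed with this approach.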
Dim x As New List(Of Char) ' the chars to be removed
For Each y In x
    ' Strings are immutable: Replace returns a new string, so assign it back
    queryresult = queryresult.Replace(y, "")
Next
If you know which is which:
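Dim phonetics As New List(Of Char) ' chars to look for
Dim actuals As New List(Of Char)   ' replacements, in the same order as phonetics
For i = 0 To phonetics.Count - 1
    ' Replace returns a new string, so assign it back
    queryresult = queryresult.Replace(phonetics(i), actuals(i))
Next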
Another way is:
Dim actual As String = queryresult.Split(" "c)(0) ' everything before the first space
OK, I figured out a way. Katakana can be converted to half-width (single-byte) characters, which results in a single half-width kana, optionally followed by a second half-width character for the modifier ("dakuten" in Japanese).
So, modified kana can be eliminated from the list by doing a check like this (in VB):
If StrConv(StrConv(kana, VbStrConv.Katakana), VbStrConv.Narrow).Length > 1 Then
    ...
End If
Any result with a length greater than one is a modified sound.
It gets trickier in C# for the conversion, but the principle would be the same. This does not work with hiragana directly, because they have no half-width form, so it is important to do the katakana conversion first.
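For reference, the check could be wrapped in a small helper along these lines. This is only a sketch: the names `KanaFilter` and `IsModifiedKana` are hypothetical, and passing the Japanese locale ID (1041) explicitly is my own assumption, added so the conversion does not depend on the machine's current culture. `StrConv` relies on Windows locale support, so this may not run on other platforms.

```vb
Imports System
Imports Microsoft.VisualBasic

Module KanaFilter
    ' 1041 is the Windows locale ID for Japanese (ja-JP).
    ' Passing it explicitly is an assumption of this sketch.
    Private Const JapaneseLcid As Integer = 1041

    ' Expects a single kana. Converts it to katakana first (hiragana has
    ' no half-width form), then to half-width; a voiced or semi-voiced
    ' kana should come back as two characters: base kana plus mark.
    Function IsModifiedKana(kana As String) As Boolean
        Dim katakana As String = StrConv(kana, VbStrConv.Katakana, JapaneseLcid)
        Dim narrow As String = StrConv(katakana, VbStrConv.Narrow, JapaneseLcid)
        Return narrow.Length > 1
    End Function

    Sub Main()
        ' hiragana KA, hiragana GA, katakana HA, katakana PA
        Dim samples As String() = {
            ChrW(&H304B).ToString(), ChrW(&H304C).ToString(),
            ChrW(&H30CF).ToString(), ChrW(&H30D1).ToString()}

        For Each kana As String In samples
            Console.WriteLine(IsModifiedKana(kana).ToString())
        Next
    End Sub
End Module
```

Doing the two conversions as separate steps mirrors the nested `StrConv` calls in the check above and keeps the katakana step visibly first.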