1

我正在编写一个小函数来从完整的联系人姓名字段中删除常用标题。这是我到目前为止所拥有的:

string[] CommonTitles = new string[] { "MR ", "MRS ", "MS ", "MISS ", "DR ", "HERR ", "MONSIEUR ", "HR ", "FRAU ", "A V M ", "ADMIRAAL ", 
                "ADMIRAL ", "ALDERMAN ", "ALHAJI ", "AMBASSADOR ", "BARON ", "BARONES ", "BRIG ", "BRIGADIER ", "BROTHER ", "CANON ", "CAPT ", "CAPTAIN ", 
                "CARDINAL ", "CDR ", "CHIEF ", "CIK ", "CMDR ", "COL ", "COLONEL ", "COMMANDANT ", "COMMANDER ", "COMMISSIONER ", "COMMODORE ", "COMTE ", 
                "COMTESSA ", "CONGRESSMAN ", "CONSEILLER ", "CONSUL ", "CONTE ", "CONTESSA ", "CORPORAL ", "COUNCILLOR ", "COUNT ", "COUNTESS ", "AIR CDRE ", 
                "AIR COMMODORE ", "AIR MARSHAL ", "AIR VICE MARSHAL ", "BRIG GEN ", "BRIG GENERAL ", "BRIGADIER GENERAL ", "CROWN PRINCE ", "CROWN PRINCESS ", 
                "DAME ", "DATIN ", "DATO ", "DATUK ", "DATUK SERI ", "DEACON ", "DEACONESS ", "DEAN ", "DHR ", "DIPL ING ", "DOCTOR ", "DOTT ", "DOTT SA ", 
                "DR ", "DR ING ", "DRA ", "DRS ", "EMBAJADOR ", "EMBAJADORA ", "EN ", "ENCIK ", "ENG ", "EUR ING ", "EXMA SRA ", "EXMO SR ", "F O ", 
                "FATHER ", "FIRST LIEUTIENT ", "FIRST OFFICER ", "FLT LIEUT ", "FLYING OFFICER ", "FR ", "FRAU ", "FRAULEIN ", "FRU ", "GEN ", "GENERAAL ", 
                "GENERAL ", "GOVERNOR ", "GRAAF ", "GRAVIN ", "GROUP CAPTAIN ", "GRP CAPT ", "H E DR ", "H H ", "H M ", "H R H ", "HAJAH ", "HAJI ", 
                "HAJIM ", "HER HIGHNESS ", "HER MAJESTY ", "HERR ", "HIGH CHIEF ", "HIS HIGHNESS ", "HIS HOLINESS ", "HIS MAJESTY ", "HON ", "HR ", 
                "HRA ", "ING ", "IR ", "JONKHEER ", "JUDGE ", "JUSTICE ", "KHUN YING ", "KOLONEL ", "LADY ", "LCDA ", "LIC ", "LIEUT ", "LIEUT CDR ", 
                "LIEUT COL ", "LIEUT GEN ", "LORD ", "MADAME ", "MADEMOISELLE ", "MAJ GEN ", "MAJOR ", "MASTER ", "MEVROUW ", "MISS ", "MLLE ", "MME ", 
                "MONSIEUR ", "MONSIGNOR ", "MSTR ", "NTI ", "PASTOR ", "PRESIDENT ", "PRINCE ", "PRINCESS ", "PRINCESSE ", "PRINSES ", "PROF ", 
                "PROF DR ", "PROF SIR ", "PROFESSOR ", "PUAN ", "PUAN SRI ", "RABBI ", "REAR ADMIRAL ", "REV ", "REV CANON ", "REV DR ", "REV MOTHER ", 
                "REVEREND ", "RVA ", "SENATOR ", "SERGEANT ", "SHEIKH ", "SHEIKHA ", "SIG ", "SIG NA ", "SIG RA ", "SIR ", "SISTER ", "SQN LDR ", "SR ", 
                "SR D ", "SRA ", "SRTA ", "SULTAN ", "TAN SRI ", "TAN SRI DATO ", "TENGKU ", "TEUKU ", "THAN PUYING ", "THE HON DR ", "THE HON JUSTICE ", 
                "THE HON MISS ", "THE HON MR ", "THE HON MRS ", "THE HON MS ", "THE HON SIR ", "THE VERY REV ", "TOH PUAN ", "TUN ", "VICE ADMIRAL ", 
                "VISCOUNT ", "VISCOUNTESS ", "WG CDR " };



            string returnName = textBox1.Text.ToUpper();

            foreach (string title in CommonTitles)
            {
                returnName = returnName.Replace(title, "");
            }

            MessageBox.Show(returnName);

但是,我刚刚尝试使用以下输入进行测试:KHUN YING Abu Dina Mr MRS TOH MAJOR但我得到了回复:KHUN YABU DINA TOH MAJOR

有什么比使用 REPLACE 函数更好的方法吗?

提前感谢您的帮助。

4

1 回答 1

4

您可以使用正则表达式。首先,您必须从所有标题中删除尾随空格。然后您可以使用锚点\b在单词边界上进行匹配。为了避免多余的空格,您还需要匹配标题前面或后面的空格(我在使用之后进行\s*)。您可能仍然有一个尾随空格,因此您还需要Trim()字符串:

var regex = new Regex(@"\b(" + string.Join("|", CommonTitles) + @")\b\s*");
var result = regex.Replace("KHUN YING ABU DINA MR MRS TOH MAJOR", String.Empty).Trim();

这导致:

阿布迪纳托

您还可以让正则表达式处理大小写问题,以避免将所有内容都转换为大写。只需使用RegexOptions.IgnoreCase

var regex = new Regex(
  @"\b(" + string.Join("|", CommonTitles) + @")\b\s*",
  RegexOptions.IgnoreCase
);
var result = regex.Replace("Khun Ying Abu Dina Mr Mrs Toh Major", String.Empty).Trim();

现在结果是:

阿布迪纳托
于 2013-10-21T15:05:42.370 回答