
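I am reading a Word file using the Microsoft.Office.Interop.Word.Document library in Visual Studio. The problem is that the file contains special characters such as ρ and λ. When I read them in C#, they are converted to question marks.
For example, I read a line like this: A child drinks a liquid of density ρ through a vertical straw.
That line is converted to: A child drinks a liquid of density ? through a vertical straw. Please help me keep these characters in their original form.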


Here is the code:

    public void ReadMsWord()
    {
        // path of the selected file
        string filePath = null;
        // open a dialog box to select the file
        OpenFileDialog file = new OpenFileDialog();
        // dialog box title
        file.Title = "Word File";
        // initial directory shown in the dialog
        file.InitialDirectory = "c:\\";
        // restore the previous working directory when the dialog closes
        file.RestoreDirectory = true;

        // run this block when the user clicks OK
        if (file.ShowDialog() == DialogResult.OK)
        {
            // store the selected file path
            filePath = file.FileName;
        }
        Microsoft.Office.Interop.Word.Application word = new Microsoft.Office.Interop.Word.ApplicationClass();
        // create object of missing value
        object miss = System.Reflection.Missing.Value;
        // create object of selected file path
        object path = filePath;
        // open the document read/write (readOnly = false)
        object readOnly = false;
        // open document
        Microsoft.Office.Interop.Word.Document docs = word.Documents.Open(ref path, ref miss,
            ref readOnly, ref miss, ref miss, ref miss, ref miss, ref miss, ref miss,
            ref miss, ref miss, ref miss, ref miss, ref miss, ref miss, ref miss);

        try
        {
            // select the whole document in the active window
            docs.ActiveWindow.Selection.WholeStory();
            // copy the selection to the clipboard
            docs.ActiveWindow.Selection.Copy();
            // get an IDataObject that exposes the clipboard contents
            IDataObject data = Clipboard.GetDataObject();
            // read the clipboard contents as plain text, then split into lines
            string t = "";
            string[] y = {};
            t = data.GetData(DataFormats.Text).ToString();
            string[] options = { };
            y = t.Split('\n');
        }
        catch (Exception)
        {
            // rethrow without resetting the stack trace
            throw;
        }
    }

1 Answer


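Use

t = data.GetData(DataFormats.UnicodeText).ToString();

that is, UnicodeText instead of Text. Note that the special characters will still show up as ? in a console window, but they display correctly in, for example, MessageBox.Show or the debugger.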
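
To illustrate why the characters turn into question marks, here is a minimal sketch that does not touch Word or the clipboard at all. It pushes the sample sentence through an encoding that cannot represent ρ, which is roughly what happens when the clipboard text is requested in the non-Unicode Text format. The class EncodingDemo and the helper RoundTripAscii are hypothetical names made up for this example, and ASCII is used here only as a stand-in for the system's ANSI code page, so treat it as an analogy rather than the exact conversion the clipboard performs.

```csharp
using System;
using System.Text;

static class EncodingDemo
{
    // Hypothetical helper: encodes the string as ASCII and decodes it again.
    // Characters that ASCII cannot represent (such as ρ) come back as '?'.
    public static string RoundTripAscii(string s)
    {
        byte[] bytes = Encoding.ASCII.GetBytes(s);
        return Encoding.ASCII.GetString(bytes);
    }

    static void Main()
    {
        string line = "A child drinks a liquid of density \u03C1 through a vertical straw.";

        // Ask the console to emit UTF-8; whether ρ is actually drawn
        // still depends on the console font (an assumption about the environment).
        Console.OutputEncoding = Encoding.UTF8;

        // Lossy copy: ρ has been replaced by '?'.
        Console.WriteLine(RoundTripAscii(line));

        // Original string: still contains ρ in memory.
        Console.WriteLine(line);
    }
}
```

The string itself is intact in the second case; only the lossy conversion, or a console that cannot render the glyph, makes it look like a question mark. Setting Console.OutputEncoding may help with the console display, though that is not something the answer above claims.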

Answered 2013-06-05T07:53:48.157