I have the byte data of .doc, .txt, and .docx files and want to convert it to a string. I tried the following, but the result is not accurate:

Public ByteData As Byte() ' my data
Dim str As String = String.Empty

str = System.Text.Encoding.UTF8.GetString(objCandidateInfo.ByteData, 0, objCandidateInfo.ByteData.Length)
str = Convert.ToBase64String(objCandidateInfo.ByteData)

Edit

I am now using the Word application for the same conversion, and this code works. Here is my code:

 Private Shared ObjwordApp As Word.Application
    Private Shared nullobj As Object = System.Reflection.Missing.Value
    Private Shared doc As Word.Document
    Shared Sub New()
        ObjwordApp = New Word.Application()
    End Sub

    Public Shared Sub InitializeClass()
        ObjwordApp.Visible = False
    End Sub

    Private Shared Sub OpenWordFile(ByVal StrFilePath As Object)
        Try
            ObjwordApp.Visible = False
        Catch ex As Exception
            ObjwordApp = New Word.Application()
        End Try
        Try
            doc = ObjwordApp.Documents.Open(StrFilePath, nullobj, nullobj, nullobj, nullobj, nullobj, nullobj, nullobj, nullobj, nullobj, nullobj, nullobj)
        Catch ex As Exception
            CloseWordFile()
            ObjwordApp.Visible = False
        End Try
    End Sub

    Private Shared Sub CopyWordContent()
        Try
            doc.ActiveWindow.Selection.WholeStory()
            doc.ActiveWindow.Selection.Copy()
        Catch ex As Exception
            Clipboard.Clear()
        End Try
    End Sub

    Private Shared Sub CloseWordFile()
        Try
            doc.Close()
        Catch ex As Exception

        End Try
    End Sub

    Public Shared Function ReadWordFile(ByVal StrFilePath As String, ByVal StrDataFormat As String) As String
        Dim StrFileContent = String.Empty
        If (File.Exists(StrFilePath)) Then
            Try
                OpenWordFile(StrFilePath)
                CopyWordContent()
            Catch ex As Exception

            Finally
                CloseWordFile()
            End Try

            Try
                Dim dataObj As IDataObject = Clipboard.GetDataObject()
                If (dataObj.GetDataPresent(StrDataFormat)) Then
                    StrFileContent = dataObj.GetData(StrDataFormat)
                Else
                    StrFileContent = ""
                End If
                Clipboard.Clear()
            Catch ex As Exception

            End Try
        End If
        Return StrFileContent
    End Function

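When I save the byte array to the database, I call the function below to convert it to RTF, but the conversion does not happen. When I attach a debugger, `dataObj` is `Nothing`.

Code 1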

Dim str As String = String.Empty
                Try
                    'str = System.Text.Encoding.UTF8.GetString(objCandidateInfo.ByteData, 0, objCandidateInfo.ByteData.Length)
                    'str = Convert.ToBase64String(objCandidateInfo.ByteData)
                    'str = System.Text.Encoding.ASCII.GetString(objCandidateInfo.ByteData, 0, objCandidateInfo.ByteData.Length)
                    str = ClsDocumentManager.ReadContent(objCandidateInfo.ByteData, DataFormats.Rtf)
                Catch ex As Exception

                End Try

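I save the data to the database in both byte and text form. When I read it back from the database (the byte value I saved) and convert it to RTF, it works. The working code is:

Code 2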

rtbAttachment.Rtf = ClsDocumentManager.ReadContent(byteAttachment, DataFormats.Rtf)

These are the methods in the `ClsDocumentManager` class:

Public Shared Function GetRandomNo() As Integer
        Dim RandomNo As New Random()
        Return RandomNo.Next(Convert.ToInt32(DateTime.Now().Minute.ToString() & DateTime.Now().Second.ToString() & DateTime.Now().Hour.ToString()))
    End Function

    Public Shared Function ReadContent(ByVal byteArray As Byte(), ByVal StrReadFormat As String) As String
        Dim StrFileContent As String = String.Empty
        Try
            If (Not IsNothing(byteArray)) Then
                Dim StrFileName As String = GetRandomNo().ToString() & ".doc"
                StrFileName = ClsSingleton.aTempFolderName & StrFileName
                If (CreateWordFile(byteArray, StrFileName)) Then
                    StrFileContent = ClsWordManager.ReadWordFile(StrFileName, StrReadFormat)
                    If (File.Exists(StrFileName)) Then
                        File.Delete(StrFileName)
                    End If
                End If
            End If
        Catch ex As Exception

        End Try
        Return StrFileContent
    End Function

Public Shared Function CreateWordFile(ByVal byteArray As Byte(), ByVal StrFileName As String) As Boolean
        Dim boolResult As Boolean = False
        Try
            If (Not IsNothing(byteArray)) Then
                If (Not File.Exists(StrFileName)) Then
                    Dim objFileStream As New FileStream(StrFileName, FileMode.Create, FileAccess.Write)
                    objFileStream.Write(byteArray, 0, byteArray.Length)
                    objFileStream.Close()
                    boolResult = True
                End If
            End If
        Catch ex As Exception
            boolResult = False
        End Try
        Return boolResult
    End Function

The code that fails while debugging:

Dim dataObj As IDataObject = Clipboard.GetDataObject()
                If (dataObj.GetDataPresent(StrDataFormat)) Then
                    StrFileContent = dataObj.GetData(StrDataFormat)
                Else
                    StrFileContent = ""
                End If

`dataObj` is `Nothing` only when calling from **Code 1** 

Update

The code listed earlier in this question is split across two classes, and the listings are unchanged:

**`ClsDocumentManager`** (imports `System.IO`) contains `GetRandomNo`, `ReadContent`, and `CreateWordFile`.

**`ClsWordManager`** (imports `System.IO` and `System.Text`) contains the Word automation code: the shared constructor, `InitializeClass`, `OpenWordFile`, `CopyWordContent`, `CloseWordFile`, and `ReadWordFile`.

So the problem now is this: when I convert in the code below (look at the `ByteAttachmets` argument), the bytes are converted to a string:

Public Function UpdateCandidateAttachment(ByVal CandidateID As Integer, ByVal ByteAttachmets As Byte(), ByVal StrExtension As String) As Integer
        Dim Result As Integer = -1
        Try
            Dim objDataLayer As New ClsDataLayer()
            Dim str As String = Nothing
            Try
                'str = System.Text.Encoding.UTF8.GetString(objCandidateInfo.ByteData, 0, objCandidateInfo.ByteData.Length)
                'str = Convert.ToBase64String(objCandidateInfo.ByteData)
                'str = System.Text.Encoding.ASCII.GetString(objCandidateInfo.ByteData, 0, objCandidateInfo.ByteData.Length)
                str = ClsDocumentManager.ReadContent(ByteAttachmets, DataFormats.Rtf)
            Catch ex As Exception

            End Try
            objDataLayer.AddParameter("@CANDIDATE_ID", CandidateID)
            objDataLayer.AddParameter("@ATTACHMENT_DATA", ByteAttachmets)
            objDataLayer.AddParameter("@CREATED_BY", ClsCommons.IntUserId)
            objDataLayer.AddParameter("@EXTENSION", StrExtension)
            Result = objDataLayer.ExecuteNonQuery("TR_PROC_UpdateCandidateAttachment")
        Catch ex As Exception
            MsgBox(ex.Message)
        End Try
        Return Result
    End Function

But when I pass the data through a property, as in the following code (look at `objCandidateInfo.ByteData`), it does not work:

Public Function AddUpdateCandidate(ByVal objCandidateInfo As ClsCandidateInfo) As Integer
        Dim Result As Integer = -1
        Try
            If (ClsCommons.IsValidEmail(objCandidateInfo.StrEmail)) Then
                Dim str As String = Nothing
                Try
                    'str = System.Text.Encoding.UTF8.GetString(objCandidateInfo.ByteData, 0, objCandidateInfo.ByteData.Length)
                    'str = Convert.ToBase64String(objCandidateInfo.ByteData)
                    'str = System.Text.Encoding.ASCII.GetString(objCandidateInfo.ByteData, 0, objCandidateInfo.ByteData.Length)
                    Dim byteAttachment As Byte() = objCandidateInfo.ByteData
                    str = ClsDocumentManager.ReadContent(byteAttachment, DataFormats.Rtf)
                Catch ex As Exception

                End Try
                Dim objDataLayer As New ClsDataLayer()
                objDataLayer.AddParameter("@REQUIREMENT_ID", objCandidateInfo.RequirementId)
                objDataLayer.AddParameter("@Candidate_Name", objCandidateInfo.StrCandidateName)
                objDataLayer.AddParameter("@Current_Organization", objCandidateInfo.StrCurrentCompany)
                objDataLayer.AddParameter("@Current_Designation", objCandidateInfo.StrCurrentDesignation)
                If (objCandidateInfo.StrExp.Trim() = "") Then
                    objDataLayer.AddParameter("@Overall_Exp", DBNull.Value)
                Else
                    Dim DecExp As Decimal = -1
                    If (Decimal.TryParse(objCandidateInfo.StrExp, DecExp)) Then
                        objDataLayer.AddParameter("@Overall_Exp", DecExp)
                    Else
                        objDataLayer.AddParameter("@Overall_Exp", DBNull.Value)
                    End If
                End If
                objDataLayer.AddParameter("@Qualification", objCandidateInfo.StrQualification)
                objDataLayer.AddParameter("@Location", objCandidateInfo.StrCurrentLocation)
                objDataLayer.AddParameter("@Current_CTC", objCandidateInfo.StrCurrentCTC)
                objDataLayer.AddParameter("@Expected_CTC", objCandidateInfo.StrExpectedCTC)
                objDataLayer.AddParameter("@Phone_No", objCandidateInfo.StrPhoneNo)
                objDataLayer.AddParameter("@Mobile", objCandidateInfo.StrMobile)
                objDataLayer.AddParameter("@Notice_Period", objCandidateInfo.StrNoticePeriod)
                objDataLayer.AddParameter("@Remarks", objCandidateInfo.StrRemarks)
                If (objCandidateInfo.StrYearofExp.Trim() = "") Then
                    objDataLayer.AddParameter("@Years_of_Experience", DBNull.Value)
                Else
                    Dim DecExp As Decimal = -1
                    If (Decimal.TryParse(objCandidateInfo.StrYearofExp, DecExp)) Then
                        objDataLayer.AddParameter("@Years_of_Experience", DecExp)
                    Else
                        objDataLayer.AddParameter("@Years_of_Experience", DBNull.Value)
                    End If
                End If
                objDataLayer.AddParameter("@Address", objCandidateInfo.StrAddress)

                objDataLayer.AddParameter("@Email", objCandidateInfo.StrEmail)
                If (objCandidateInfo.intIndustry > 0) Then
                    objDataLayer.AddParameter("@Industry", objCandidateInfo.intIndustry)
                Else
                    objDataLayer.AddParameter("@Industry", DBNull.Value)
                End If
                If (objCandidateInfo.intFunctionalArea > 0) Then
                    objDataLayer.AddParameter("@Functional_Area", objCandidateInfo.intFunctionalArea)
                Else
                    objDataLayer.AddParameter("@Functional_Area", DBNull.Value)
                End If
                If (objCandidateInfo.StrDob.Trim() = "") Then
                    objDataLayer.AddParameter("@DOB", DBNull.Value)
                Else
                    Try
                        objDataLayer.AddParameter("@DOB", Convert.ToDateTime(objCandidateInfo.StrDob))
                    Catch ex As Exception
                        objDataLayer.AddParameter("@DOB", DBNull.Value)
                    End Try
                End If
                If (objCandidateInfo.intSourceBy > 0) Then
                    objDataLayer.AddParameter("@Source", objCandidateInfo.intSourceBy)
                Else
                    objDataLayer.AddParameter("@Source", DBNull.Value)
                End If
                objDataLayer.AddParameter("@SKILL_SET", objCandidateInfo.strSkillSet)
                objDataLayer.AddParameter("@ATTACHMENT_DATA", objCandidateInfo.ByteData)
                objDataLayer.AddParameter("@EXTENSION", objCandidateInfo.StrExtension)
                objDataLayer.AddParameter("@CREATED_BY", ClsCommons.IntUserId)

                Result = objDataLayer.ExecuteNonQuery("TR_PROC_AddUpdateFullCandidateData")
            Else
                MsgBox("Data is not extracted, Some Error Occured, please update your software.")
            End If
        Catch ex As Exception
            MsgBox(ex.Message)
        End Try
        Return Result
    End Function

I hope my question is clear.

1 Answer

(Edited after several changes to the question.)

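If you only want the text content of a file, you need to handle text files and binary files differently. If the input format is text-based (.txt, .htm, etc.), you can usually treat the bytes as a string, but you still need to know which encoding to use.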
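For the text-based case, a minimal sketch of decoding the bytes might look like the following. The names `TextDecodingSketch` and `DecodeText` are made up for illustration, and the fallback encoding is an assumption the caller has to supply; a byte order mark, if the data has one, takes precedence over it.

```vbnet
Imports System
Imports System.IO
Imports System.Text

Module TextDecodingSketch
    ' Hypothetical helper: decodes the bytes of a text-based file.
    ' If the data starts with a byte order mark, StreamReader uses it;
    ' otherwise it falls back to the encoding passed by the caller.
    Function DecodeText(ByVal data As Byte(), ByVal fallback As Encoding) As String
        Using stream As New MemoryStream(data)
            Using reader As New StreamReader(stream, fallback, True)
                Return reader.ReadToEnd()
            End Using
        End Using
    End Function

    Sub Main()
        ' Sample bytes standing in for the content of a .txt file.
        Dim data As Byte() = Encoding.UTF8.GetBytes("Plain text resume")
        Console.WriteLine(DecodeText(data, Encoding.UTF8))
    End Sub
End Module
```

If the files come from different sources, the fallback encoding is the part most likely to need adjusting, since nothing in the bytes themselves guarantees which encoding was used.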

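However, if the input format is binary (.doc, .docx, etc.), you cannot convert the byte array directly to a string, because the file content represents more than just text: the bytes also describe layout, formatting, and other information about the file. In that case you need Word or some other third-party library to process the file data for you.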

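To get the content of a Word document through automation, create an instance of `Word.Application`, open the document, select all the text in its active window, and then read the `Selection.Text` property to get the text as a string. Something like: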

oDocument.ActiveWindow.Selection.WholeStory()
sText = oDocument.ActiveWindow.Selection.Text

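The `Selection` object behaves much like a `Range` in Word. This gives you the plain, unformatted content of the document. You can convert it to a byte array or use it as a string. To convert it to a byte array you need an encoding, because the characters in memory must be converted to bytes.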
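As a small illustration of that last step, the sketch below converts a string to bytes and back. UTF-8 is an assumed choice here, not something the question prescribes, and the literal string only stands in for the value read from `Selection.Text`.

```vbnet
Imports System
Imports System.Text

Module StringToBytesSketch
    Sub Main()
        ' Stands in for the text obtained from Selection.Text.
        Dim sText As String = "Sample document text"

        ' Characters become bytes only through an encoding (UTF-8 assumed).
        Dim data As Byte() = Encoding.UTF8.GetBytes(sText)

        ' Decoding with the same encoding restores the original string.
        Dim roundTrip As String = Encoding.UTF8.GetString(data)
        Console.WriteLine(roundTrip = sText) ' prints True
    End Sub
End Module
```

Whatever encoding is used to write the bytes has to be used again when reading them back, so it is worth storing that choice alongside the data.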

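If you want to convert the content to RTF yourself, you need a third-party tool (or your own implementation of the RTF format). RTF is not plain text: it is a markup format with a fairly complex structure.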

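You can also use Word to save the document in RTF format: look at the `Document.SaveAs2()` method. This saves the document to disk as RTF. If you need that data in the database, read the .rtf file (`File.ReadAllBytes()`) and save the bytes to the database.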
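A rough sketch of that approach follows. It assumes Word is installed, the project references `Microsoft.Office.Interop.Word`, and the installed Word version exposes `SaveAs2` (Word 2010 or later, as far as I know). The names `RtfExportSketch` and `ConvertToRtfBytes` and the temporary file location are invented for illustration.

```vbnet
Imports System
Imports System.IO
Imports Word = Microsoft.Office.Interop.Word

Module RtfExportSketch
    ' Hypothetical helper: opens a document in Word, saves it as RTF to a
    ' temporary file, and returns the bytes of that .rtf file.
    Function ConvertToRtfBytes(ByVal sourcePath As String) As Byte()
        ' Assumed temp location; any writable folder would do.
        Dim rtfPath As String = Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString("N") & ".rtf")

        ' The interop methods take Object arguments, so box the values first.
        Dim sourceObj As Object = sourcePath
        Dim rtfObj As Object = rtfPath
        Dim rtfFormat As Object = Word.WdSaveFormat.wdFormatRTF
        Dim noSave As Object = False

        Dim app As New Word.Application()
        Dim doc As Word.Document = Nothing
        Try
            doc = app.Documents.Open(sourceObj)
            doc.SaveAs2(rtfObj, rtfFormat)
        Finally
            ' Always close the document and quit Word, even on failure.
            If doc IsNot Nothing Then doc.Close(noSave)
            app.Quit()
        End Try

        Try
            Return File.ReadAllBytes(rtfPath)
        Finally
            File.Delete(rtfPath)
        End Try
    End Function
End Module
```

Because this writes a file instead of going through the clipboard, it does not depend on the clipboard being available on the calling thread, and exceptions are allowed to propagate so that a failed conversion is visible to the caller.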

answered 2013-02-02T05:29:12.527