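I have an XML file that I am shredding into a table in SQL Server 2008. It goes several nodes deep, and this is the first time I have tried this, so I'm sure I'm missing the point somewhere.
I got it to work, but for an XML file with 97 records I get each record 97 times, i.e. 9409 rows in total in the result set! This takes about 37 seconds. If I use SELECT DISTINCT I get 97 rows, but (predictably) that also takes 37 seconds.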
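The 97 x 97 = 9409 count looks like a cross product, which can be checked in isolation. The following is only a minimal sketch, not the real import: the three-record document, the table variable `@t`, and the aliases `src`, `a`, `b` are made up for illustration, and it assumes the responses repeat under a single getTransactionDetails element. Two independent `.nodes()` calls against the same XML column pair every row of one with every row of the other.

```sql
-- Hypothetical three-record document; element names mirror the sample XML.
DECLARE @x xml = N'
<trmFileDataStream>
  <getTransactionDetails>
    <getTransactionDetailsResponse><transaction><transId>1</transId></transaction></getTransactionDetailsResponse>
    <getTransactionDetailsResponse><transaction><transId>2</transId></transaction></getTransactionDetailsResponse>
    <getTransactionDetailsResponse><transaction><transId>3</transId></transaction></getTransactionDetailsResponse>
  </getTransactionDetails>
</trmFileDataStream>';

-- Hypothetical stand-in for xmlImportTempTable.
DECLARE @t TABLE (xml_data xml);
INSERT INTO @t (xml_data) VALUES (@x);

-- Both .nodes() calls start again from the column itself, so neither
-- is restricted by the other: 3 responses x 3 transactions.
SELECT COUNT(*) AS row_count
FROM @t AS src
CROSS APPLY src.xml_data.nodes('/trmFileDataStream/getTransactionDetails/getTransactionDetailsResponse') AS a(resp)
CROSS APPLY src.xml_data.nodes('/trmFileDataStream/getTransactionDetails/getTransactionDetailsResponse/transaction') AS b(txn);
-- row_count = 9 for three records
```

If the second `.nodes()` call is changed to start from `a.resp` (for example `a.resp.nodes('transaction')`), the count drops to 3, because each transaction is then only paired with its own response.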
The first lines of the XML:
<?xml version="1.0" encoding="utf-8"?>
<trmFileDataStream>
  <getTransactionDetails>
    <getTransactionDetailsResponse>
      <messages>
        <resultCode>Ok</resultCode>
        <message>
          <code>I00001</code>
          <text>Successful.</text>
        </message>
      </messages>
      <transaction>
        <transId>4570599999</transId>
        <submitTimeUTC>2012-08-12T20:52:05.01Z</submitTimeUTC>
        <submitTimeLocal>2012-08-12T15:52:05.01</submitTimeLocal>
        <transactionType>authCaptureTransaction</transactionType>
        <transactionStatus>pendingSettlement</transactionStatus>
        <responseCode>1</responseCode>
        <responseReasonCode>1</responseReasonCode>
        <responseReasonDescription>Approval</responseReasonDescription>
        <AVSResponse>P</AVSResponse>
        <batch>
          <batchId>2007999999</batchId>
          <settlementTimeUTC>2012-08-12T21:12:40.193Z</settlementTimeUTC>
          <settlementTimeLocal>2012-08-12T16:12:40.193</settlementTimeLocal>
          <settlementState>pendingSettlement</settlementState>
        </batch>
        <authAmount>99.04</authAmount>
        <settleAmount>99.04</settleAmount>
        <taxExempt>false</taxExempt>
        <payment>
          <bankAccount>
            <routingNumber>XXXXCCCC</routingNumber>
            <accountNumber>XXXXNNNN</accountNumber>
            <nameOnAccount>Account Name</nameOnAccount>
            <echeckType>WEB</echeckType>
          </bankAccount>
        </payment>
        <customer>
          <id>UWYN201H7C</id>
        </customer>
        <billTo>
          <firstName>FirstName</firstName>
          <lastName>LastName</lastName>
          <address>123245 Some street.</address>
          <city>Some City</city>
          <state>WS</state>
          <zip>36123</zip>
          <country>USA</country>
          <phoneNumber>1234567891</phoneNumber>
        </billTo>
        <recurringBilling>false</recurringBilling>
      </transaction>
    </getTransactionDetailsResponse>
My SELECT statement:
SELECT distinct
transactions.value ('(transId/text())[1]','varchar(100)') AS transID,
Replace(Replace(transactions.value ('(submitTimeUTC/text())[1]','varchar(100)'),'T',' '),'Z',' ') AS submitTimeUTC,
Replace(transactions.value ('(submitTimeLocal/text())[1]','varchar(100)'),'T',' ') AS submitTimeLocal,
transactions.value ('(transactionType/text())[1]','varchar(100)') AS transactionType,
transactions.value ('(transactionStatus/text())[1]','varchar(100)') AS transactionStatus,
transactions.value ('(responseCode/text())[1]','varchar(100)') AS responseCode,
transactions.value ('(responseReasonCode/text())[1]','varchar(100)') AS responseReasonCode,
transactions.value ('(responseReasonDescription/text())[1]','varchar(100)') AS responseReasonDescription,
transactions.value ('(AVSResponse/text())[1]','varchar(100)') AS AVSResponse,
transactions.value ('(authAmount/text())[1]','decimal(10,2)') AS authAmount,
transactions.value ('(settleAmount/text())[1]','decimal(10,2)') AS settleAmount,
transactions.value ('(taxExempt/text())[1]','varchar(100)') AS taxExempt,
transactions.value ('(recurringBilling/text())[1]','varchar(100)') AS recurringBilling,
rootb.value ('(fileInformationLine/text())[1]','varchar(100)') AS FileInfo,
messagesb.value ('(resultCode/text())[1]','varchar(100)') AS resultCode,
messageb.value ('(code/text())[1]','varchar(100)') AS MsgCode,
messageb.value ('(text/text())[1]','varchar(100)') AS MsgText,
batch.value ('(batchId/text())[1]','varchar(100)') AS batchID,
batch.value ('(settlementTimeUTC/text())[1]','varchar(100)') AS settlementTimeUTC,
batch.value ('(settlementTimeLocal/text())[1]','varchar(100)') AS settlementTimeLocal,
batch.value ('(settlementState/text())[1]','varchar(100)') AS settlementState,
bankacc.value ('(routingNumber/text())[1]','varchar(100)') AS routingNumber,
bankacc.value ('(accountNumber/text())[1]','varchar(100)') AS accountNumber,
bankacc.value ('(nameOnAccount/text())[1]','varchar(100)') AS nameOnAccount,
bankacc.value ('(echeckType/text())[1]','varchar(100)') AS echeckType,
Customer.value ('(id/text())[1]','varchar(100)') AS customerID,
billTo.value ('(firstName/text())[1]','varchar(100)') AS firstName,
billTo.value ('(lastName/text())[1]','varchar(100)') AS lastName,
billTo.value ('(address/text())[1]','varchar(100)') AS address,
billTo.value ('(city/text())[1]','varchar(100)') AS city,
billTo.value ('(state/text())[1]','varchar(100)') AS state,
billTo.value ('(zip/text())[1]','varchar(100)') AS zip,
billTo.value ('(country/text())[1]','varchar(100)') AS country,
billTo.value ('(phoneNumber/text())[1]','varchar(100)') AS phoneNumber
FROM
xmlImportTempTable
/* Message branch */
CROSS APPLY
xml_data.nodes('//trmFileDataStream/getTransactionDetails/getTransactionDetailsResponse') AS tMsg(getTD)
OUTER APPLY getTD.nodes('messages') AS getTD(messagesb)
OUTER APPLY messagesb.nodes('message') AS messagesb(messageb)
/* Transaction branches */
CROSS APPLY xml_data.nodes('//trmFileDataStream') AS Txn(rootb)
OUTER APPLY rootb.nodes('getTransactionDetails') AS rootb(getTransDetl)
OUTER APPLY getTransDetl.nodes('getTransactionDetailsResponse') AS rootc(getTransDtlResp)
OUTER APPLY getTransDtlResp.nodes('transaction') AS rootd(transactions)
OUTER APPLY transactions.nodes('batch') AS btc(batch)
OUTER APPLY transactions.nodes('payment') AS pmt(payment)
OUTER APPLY payment.nodes('bankAccount') AS bacc(bankacc)
OUTER APPLY transactions.nodes('customer') AS cust(customer)
OUTER APPLY transactions.nodes('billTo') AS billing(billTo)
ORDER BY transID
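For comparison, a single-chain variant might look like the sketch below. It is untested against the real file and rests on assumptions not confirmed here: that each getTransactionDetailsResponse holds at most one messages/message, transaction, batch and billTo, and that all responses sit under one getTransactionDetails element. The aliases `t`, `r` and `resp` are made up for illustration, and only a few of the columns are shown.

```sql
-- Sketch: shred once per getTransactionDetailsResponse and reach the
-- child branches with relative paths instead of further .nodes() calls.
SELECT
    resp.value('(transaction/transId/text())[1]',         'varchar(100)') AS transID,
    resp.value('(messages/resultCode/text())[1]',         'varchar(100)') AS resultCode,
    resp.value('(messages/message/code/text())[1]',       'varchar(100)') AS MsgCode,
    resp.value('(transaction/batch/batchId/text())[1]',   'varchar(100)') AS batchID,
    resp.value('(transaction/authAmount/text())[1]',      'decimal(10,2)') AS authAmount,
    resp.value('(transaction/billTo/lastName/text())[1]', 'varchar(100)') AS lastName
FROM xmlImportTempTable AS t
-- One .nodes() call means one row per response; nothing is multiplied.
CROSS APPLY t.xml_data.nodes('/trmFileDataStream/getTransactionDetails/getTransactionDetailsResponse') AS r(resp)
ORDER BY transID;
```

If any of those child elements can actually repeat inside one response, this shape would silently keep only the first occurrence, so a nested `.nodes()` call starting from `resp` would be needed for that branch instead.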
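I can live with the performance (this will be an ongoing import, and the XML files may be 6-7 times bigger than this one) as long as I can extract the correct result set. But I would really like to figure out how to do this correctly! I just read here (http://stackoverflow.com/questions/61233/the-best-way-to-shred-xml-data-into-sql-server-database-columns) that adding a schema will help the performance tremendously, and I will try that tomorrow. The multiple rows are the baffling issue for me.
Thanks in advance :)
Arnor Baldvinsson, Icetips Alta LLC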