
I have about 13,000 log files formatted as XML, and I need to convert all of them into a spreadsheet/CSV file.

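As you can see, I'm not a programmer, but I have had a go.
I've written a PowerShell script that pulls out the first nodes and builds a comma-separated string, but I'm stuck on the last node, which can contain anything from no entries to several dozen.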

Sample XML file:

<?xml version="1.0" encoding="utf-8"?>
<MigrationUserStatus>
  <User>username@domain.com</User>
  <StoreList>
    <EmailMigrationStatus>
      <MigrationStatus value="Success" />
      <FolderList>
        <TotalCount value="6" />
        <SuccessCount value="3" />
        <FailCount value="3" />
        <FailedMessages>
          <ErrorMessage>GDSTATUS_BAD_REQUEST:Permanent failure: BadAttachment</ErrorMessage>
          <SentTime>1601-01-01T00:00:00.000Z</SentTime>
          <ReceiveTime>1601-01-01T00:00:00.000Z</ReceiveTime>
        </FailedMessages>
        <FailedMessages>
          <ErrorMessage>GDSTATUS_BAD_REQUEST:Permanent failure: BadAttachment</ErrorMessage>
          <SentTime>1601-01-01T00:00:00.000Z</SentTime>
          <ReceiveTime>1601-01-01T00:00:00.000Z</ReceiveTime>
        </FailedMessages>
        <FailedMessages>
          <MessageSubject>Hey</MessageSubject>
          <ErrorMessage>GDSTATUS_BAD_REQUEST:Permanent failure: BadAttachment</ErrorMessage>
          <SentTime>2013-01-07T02:51:17.000Z</SentTime>
          <ReceiveTime>2013-01-07T02:51:17.000Z</ReceiveTime>
          <MessageSize value="2881" />
        </FailedMessages>
        <StartTime>2013-01-07T01:52:46.000Z</StartTime>
        <EndTime>2013-01-07T04:41:59.000Z</EndTime>
      </FolderList>
      <StartTime>2013-01-07T01:52:43.000Z</StartTime>
      <EndTime>2013-01-07T04:41:59.000Z</EndTime>
    </EmailMigrationStatus>
    <StartTime>2013-01-07T01:52:43.000Z</StartTime>
    <EndTime>2013-01-07T04:41:59.000Z</EndTime>
  </StoreList>
</MigrationUserStatus>

With this code I can easily create the first part of the CSV row:

$folder = "C:\temp"
$outfile = [IO.File]::OpenWrite("alluserslogs.csv")
$csv = "User,Total Emails, Successful emails,Failed emails,Failures`r`n"

dir Status-*.log | foreach ( $_) {
[xml]$Status = Get-Content $_
$csvpt1 +=$Status.MigrationUserStatus.User + "," + $Status.MigrationUserStatus.StoreList.EmailMigrationStatus.FolderList.TotalCount.value + "," + $Status.MigrationUserStatus.StoreList.EmailMigrationStatus.FolderList.SuccessCount.value + "," + $Status.MigrationUserStatus.StoreList.EmailMigrationStatus.FolderList.FailCount.value

The next bit is where I'm coming unstuck. I want to read each FailedMessages node and build it into another comma-separated string:

foreach ($FMessage in $Status.MigrationUserStatus.StoreList.EmailMigrationStatus.FolderList.FailedMessages) {
$csvpt2 +=$FMessage + ","
}

Desired output:

GDSTATUS_BAD_REQUEST:Permanent failu... 1601-01-01T00:00:00.000Z                1601-01-01T00:00:00.000Z,GDSTATUS_BAD_REQUEST:Permanent failu... 1601-01-01T00:00:00.000Z                1601-01-01T00:00:00.000Z,.......

I get either blanks in $FMessage or a "method invocation failed" error because of the trailing + ",", so that is what I need to fix.

I then concatenate everything into one final string and write it to the file:

$csv +=$csvpt1 + "," + $csvpt2
$outfile.WriteLine($csv)
}
$outfile.Close()

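As a wish-list extra, it would also be great to create CSV column headers for n failure columns, where n is the largest number of FailedMessages nodes found in any file.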

Many thanks for your help.


1 Answer


PowerShell has native support for XML; maybe this will help you get started?

It also has a native CSV exporter in Export-Csv :)
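To show what "native support" means in practice, here is a minimal sketch of the [xml] cast and dot navigation. The inline sample is a cut-down, hypothetical copy of your log, and the variable names are only for illustration:

```powershell
# Hypothetical inline sample; the real logs would be read with Get-Content instead.
$sample = [xml]@'
<MigrationUserStatus>
  <User>username@domain.com</User>
  <StoreList>
    <EmailMigrationStatus>
      <FolderList>
        <TotalCount value="6" />
        <FailedMessages><ErrorMessage>BadAttachment</ErrorMessage></FailedMessages>
        <FailedMessages><ErrorMessage>BadAttachment</ErrorMessage></FailedMessages>
      </FolderList>
    </EmailMigrationStatus>
  </StoreList>
</MigrationUserStatus>
'@

$folderList = $sample.MigrationUserStatus.StoreList.EmailMigrationStatus.FolderList

$user   = $sample.MigrationUserStatus.User   # text of a simple element
$total  = $folderList.TotalCount.value       # value of an attribute
# Repeated elements come back as a collection, one item per <FailedMessages>.
$errors = @($folderList.FailedMessages | ForEach-Object { $_.ErrorMessage })
```

Elements that hold only text are returned as strings, while elements with attributes or children are returned as XML nodes, which is why a whole FailedMessages node cannot simply be concatenated with ",".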

# Load one log file; the [xml] cast parses the text into an XML document.
[xml]$XMLfile = gc C:\Temp\migration.xml

# One output row with a fixed set of columns.
$MasterArray = "" | Select User, Result, TotalEmails, SuccessfulEmails, FailedEmails, Failures

$MasterArray.User = $XMLfile.MigrationUserStatus.User
$MasterArray.Result = $XMLfile.MigrationUserStatus.StoreList.EmailMigrationStatus.MigrationStatus.value
$MasterArray.TotalEmails = $XMLfile.MigrationUserStatus.StoreList.EmailMigrationStatus.FolderList.TotalCount.value
$MasterArray.SuccessfulEmails = $XMLfile.MigrationUserStatus.StoreList.EmailMigrationStatus.FolderList.SuccessCount.value
$MasterArray.FailedEmails = $XMLfile.MigrationUserStatus.StoreList.EmailMigrationStatus.FolderList.FailCount.value

# Build one "error,sent,received" string per FailedMessages node.
$Failures = $XMLfile.MigrationUserStatus.StoreList.EmailMigrationStatus.FolderList.FailedMessages
$ConcatFailures = @()
foreach ($Failure in $Failures)
{
    $ConcatFailures += $Failure.ErrorMessage + "," + $Failure.SentTime + "," + $Failure.ReceiveTime
}

# Separate the failures with "|" so they all fit in a single CSV field.
$MasterArray.Failures = $ConcatFailures -Join "|"
$MasterArray
$MasterArray | Export-Csv -NoType "C:\Temp\export.csv"

For the other fields, you can check whether they exist and add them only if they do. This should work:

foreach ($Failure in $Failures)
{
    # Add each field only when the element is present on this failure.
    if ($Failure.ErrorMessage) { $ConcatFailures += $Failure.ErrorMessage }
    if ($Failure.SentTime) { $ConcatFailures += $Failure.SentTime }
    if ($Failure.ReceiveTime) { $ConcatFailures += $Failure.ReceiveTime }
    if ($Failure.MessageSubject) { $ConcatFailures += $Failure.MessageSubject }
    # MessageSize stores its data in the "value" attribute.
    if ($Failure.MessageSize) { $ConcatFailures += $Failure.MessageSize.value }
}
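If listing every field by hand gets tedious, one alternative is to walk the child nodes of each failure. This is only a sketch: Convert-FailureToText is a hypothetical helper name, and the "Name=Value" layout with ";" separators is my assumption rather than anything your logs require.

```powershell
# Hypothetical helper: flattens one FailedMessages element into "Name=Value" pairs.
function Convert-FailureToText {
    param([System.Xml.XmlElement]$Failure)

    $parts = foreach ($child in $Failure.ChildNodes) {
        # Skip anything that is not an element, such as comments.
        if ($child.NodeType -ne [System.Xml.XmlNodeType]::Element) { continue }
        # Elements like <MessageSize value="2881" /> keep their data in the attribute.
        $text = if ($child.HasAttribute('value')) { $child.GetAttribute('value') } else { $child.InnerText }
        '{0}={1}' -f $child.LocalName, $text
    }
    $parts -join ';'
}

# Usage with a small inline element instead of a real log file.
$failure = ([xml]'<FailedMessages><MessageSubject>Hey</MessageSubject><MessageSize value="2881" /></FailedMessages>').DocumentElement
$result = Convert-FailureToText $failure
$result   # MessageSubject=Hey;MessageSize=2881
```

Keeping the element name next to each value means optional fields such as MessageSubject do not shift the meaning of the values that follow them.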

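To process all of the XML files, add an outer loop that iterates over them and appends each file's data to the array you are building. This should do what you need, with some adjustments to the paths used: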

$XMLFiles = gci "C:\Temp\" -Filter "*.xml"
$MasterArray = @()

foreach ($XMLFile in $XMLFiles)
{
    # Parse the current file. PowerShell variable names are case-insensitive,
    # so the document gets its own name instead of reusing the loop variable.
    [xml]$Xml = gc $XMLFile.FullName

    # One output row per file.
    $TempArray = "" | Select User, Result, TotalEmails, SuccessfulEmails, FailedEmails, Failures

    $TempArray.User = $Xml.MigrationUserStatus.User
    $TempArray.Result = $Xml.MigrationUserStatus.StoreList.EmailMigrationStatus.MigrationStatus.value
    $TempArray.TotalEmails = $Xml.MigrationUserStatus.StoreList.EmailMigrationStatus.FolderList.TotalCount.value
    $TempArray.SuccessfulEmails = $Xml.MigrationUserStatus.StoreList.EmailMigrationStatus.FolderList.SuccessCount.value
    $TempArray.FailedEmails = $Xml.MigrationUserStatus.StoreList.EmailMigrationStatus.FolderList.FailCount.value

    $Failures = $Xml.MigrationUserStatus.StoreList.EmailMigrationStatus.FolderList.FailedMessages
    $ConcatFailures = @()

    foreach ($Failure in $Failures)
    {
        # Add each field only when the element is present on this failure.
        if ($Failure.ErrorMessage) { $ConcatFailures += $Failure.ErrorMessage }
        if ($Failure.SentTime) { $ConcatFailures += $Failure.SentTime }
        if ($Failure.ReceiveTime) { $ConcatFailures += $Failure.ReceiveTime }
        if ($Failure.MessageSubject) { $ConcatFailures += $Failure.MessageSubject }
        # MessageSize stores its data in the "value" attribute.
        if ($Failure.MessageSize) { $ConcatFailures += $Failure.MessageSize.value }
    }
    $TempArray.Failures = $ConcatFailures -Join "|"

    $MasterArray += $TempArray
}

# Show the rows on screen, then write them to the CSV file.
$MasterArray
$MasterArray | Export-Csv -NoType "C:\Temp\export.csv"
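For the wish-list item (one column per failure), a possible approach is to find the largest failure count first and then pad every row to that width, since the CSV header is taken from the first object. The sketch below assumes each row keeps its failures as an array rather than a joined string; the $rows sample data and the Failure1..FailureN column names are made up for illustration.

```powershell
# Hypothetical input: one object per user, with the failures kept as an array.
$rows = @(
    (New-Object PSObject -Property @{ User = 'a@domain.com'; Failures = @('err1', 'err2', 'err3') }),
    (New-Object PSObject -Property @{ User = 'b@domain.com'; Failures = @() })
)

# The widest row decides how many Failure columns are needed.
$max = ($rows | ForEach-Object { $_.Failures.Count } | Measure-Object -Maximum).Maximum

$wide = foreach ($row in $rows) {
    $out = New-Object PSObject
    $out | Add-Member -MemberType NoteProperty -Name User -Value $row.User
    for ($i = 0; $i -lt $max; $i++) {
        # Pad short rows with empty strings so every object has the same columns.
        $value = if ($i -lt $row.Failures.Count) { $row.Failures[$i] } else { '' }
        $out | Add-Member -MemberType NoteProperty -Name ("Failure{0}" -f ($i + 1)) -Value $value
    }
    $out
}

# ConvertTo-Csv returns the CSV lines; Export-Csv would write the same rows to a file.
$csv = $wide | ConvertTo-Csv -NoTypeInformation
$csv
```

With 13,000 files this means holding all rows in memory until the maximum is known, which should be acceptable for logs of this size but is worth keeping in mind.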
answered 2013-01-10T18:31:45.957