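We are using Microsoft HPC (High Performance Computing) Pack. While jobs are running, I want to collect various HPC metrics and publish them to AWS CloudWatch. The script below comes from the AWS site. Everything in it runs fine except the last line, which tries to write to CloudWatch and fails.
Has anyone else run into this error?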
Write-CWMetricData : A WebException with status NameResolutionFailure was thrown.
At D:\temp\HPCMetricstest.ps1:81 char:1
+ Write-CWMetricData -Namespace "HPC Cluster Metrics" -MetricData $m1, $m2, $m3, $ ...
    + CategoryInfo          : InvalidOperation: (Amazon.PowerShe...etricDataCmdlet:WriteCWMetricDataCmdlet) [Write-CWMetricData], InvalidOperationException
    + FullyQualifiedErrorId : Amazon.Runtime.AmazonServiceException,Amazon.PowerShell.Cmdlets.CW.WriteCWMetricDataCmdlet
#
# This PowerShell script computes metrics on the head node of an HPC
# Pack cluster and publishes them to Amazon CloudWatch
#
# It must be called with the current region and stack name
# Properties of an HPC node: NetBiosName, HealthState, State, Groups
[CmdletBinding()]
param(
[Parameter(Mandatory=$True,Position=1)]
[string]$Region,
[Parameter(Mandatory=$True,Position=2)]
[string]$Stack
)
Add-PSSnapIn Microsoft.HPC
Import-Module AWSPowerShell
$jobs = (Get-HpcJob -State Queued, Running -ErrorAction SilentlyContinue)
$tasks = ($jobs | Get-HpcTask -State Running, Queued -ErrorAction SilentlyContinue)
$nodes = (Get-HpcNode -GroupName ComputeNodes -State Online)
$jobCount = $jobs.Count
# Count the tasks collected above ($tasks, not the undefined $task)
$taskCount = $tasks.Count
$coreHours = ($tasks | % { $_.Runtime.TotalHours * $_.MinCores } |
Measure-Object -Sum | Select-Object -ExpandProperty Sum)
$nodeCount = $nodes.Count
$coresPerMachine = ($nodes | Measure-Object -Property SubscribedCores -Average | Select-Object -ExpandProperty Average)
Write-Host "Cores per machine basam " $coresPerMachine
$machineHours = [System.Math]::Ceiling($coreHours / $coresPerMachine)
$globalHours = [System.Math]::Ceiling($machineHours / $nodeCount)
Function CreateMetric
{
param([string]$Name, [string]$Unit="Count", [string]$Value="0",
      [string]$StackId, [System.DateTime]$When = (Get-Date).ToUniversalTime())
$dim = New-Object Amazon.CloudWatch.Model.Dimension
$dim.Name = "StackId"
$dim.Value = $StackId
$dat = New-Object Amazon.CloudWatch.Model.MetricDatum
$dat.Timestamp = $When
$dat.MetricName = $Name
$dat.Unit = $Unit
$dat.Value = $Value
#Write-Host $dat.MetricName $dat.Value $dat.Unit $dat.Timestamp
$dat.Dimensions = New-Object -TypeName System.Collections.Generic.List[Amazon.CloudWatch.Model.Dimension]
$dat.Dimensions.Add($dim)
$dat
}
$now = (Get-Date).ToUniversalTime()
$m1 = (CreateMetric -Name "Job Count" -Value "$jobCount" -StackId $Stack -When $now)
$m2 = (CreateMetric -Name "Task Count" -Value "$taskCount" -StackId $Stack -When $now)
$m3 = (CreateMetric -Name "Core Hours" -Value "$coreHours" -StackId $Stack -When $now)
$m4 = (CreateMetric -Name "Node Count" -Value "$nodeCount" -StackId $Stack -When $now)
$m5 = (CreateMetric -Name "Cores Per Machine" -Value "$coresPerMachine" -StackId $Stack -When $now)
$m6 = (CreateMetric -Name "Machine Hours" -Value "$machineHours" -StackId $Stack -When $now)
$m7 = (CreateMetric -Name "Global Hours" -Value "$globalHours" -StackId $Stack -When $now)
# The next line is the one that fails
Write-CWMetricData -Namespace "HPC Cluster Metrics" -MetricData $m1, $m2, $m3, $m4, $m5, $m6, $m7 -Region $Region
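
One thing that might help narrow this down: NameResolutionFailure is the .NET WebExceptionStatus value for a host name that could not be resolved through DNS, so the failure happens before CloudWatch ever sees the request. For standard commercial regions the CloudWatch endpoint follows the pattern monitoring.<region>.amazonaws.com, so a wrong value in $Region (an availability zone such as us-east-1a instead of us-east-1, for example) or a head node without working DNS would both be consistent with this error. The sketch below is only a diagnostic aid, not a fix; the two function names are hypothetical helpers and are not part of AWSPowerShell.

```powershell
# Hypothetical helper: builds the CloudWatch endpoint host name for a region.
# Assumption: a standard commercial region, where the endpoint follows the
# pattern monitoring.<region>.amazonaws.com. The region is used exactly as
# passed in, so a typo or stray whitespace shows up in the result.
function Get-CloudWatchHostName {
    param([Parameter(Mandatory=$true)][string]$Region)
    "monitoring.${Region}.amazonaws.com"
}

# Hypothetical helper: returns $true if the host name resolves through DNS,
# $false if the lookup throws (for example, an unknown host or no DNS access).
function Test-HostNameResolves {
    param([Parameter(Mandatory=$true)][string]$HostName)
    try {
        ([System.Net.Dns]::GetHostAddresses($HostName).Count -gt 0)
    } catch {
        $false
    }
}
```

Running Test-HostNameResolves (Get-CloudWatchHostName -Region $Region) on the head node, with the same value that was passed to the script, should show whether the endpoint name resolves at all. If it returns False for a region name that is known to be valid, the problem is more likely the head node's DNS or outbound network configuration than the script itself.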