你能帮我用正则表达式解析 EML 文本吗?
我想单独获得:
1)。Content-Transfer-Encoding: base64 和 --=_alternative 之间的文本,如果上面有 Content-Type: text/html
2)。Content-Transfer-Encoding: base64 和 --=_related 之间的文本,如果上面有两行 Content-Type: image/jpeg
请看一下powershell中的代码和平:
$text = @"
--=_alternative XXXXXXXXXXXXXX_=
Content-Type: text/html; charset="KOI8-R"
Content-Transfer-Encoding: base64
111111111111111111111111111111111111111111111111111111
--=_alternative XXXXXXXXXXXXXX_=
Content-Type: text/html; charset="KOI8-R"
Content-Transfer-Encoding: base64
222222222222222222222222222222222222222222222222222222
--=_alternative XXXXXXXXXXXXXX_=--
--=_related XXXXXXXXXXXXXX_=--_=
Content-Type: image/jpeg
Content-ID: <_2_XXXXXXXXXXXXXX>
Content-Transfer-Encoding: base64
333333333333333333333333333333333333333333333333333333
--=_related XXXXXXXXXXXXXX_=
Content-Type: image/jpeg
Content-ID: <_2_XXXXXXXXXXXXXX>
Content-Transfer-Encoding: base64
444444444444444444444444444444444444444444444444444444
--=_related XXXXXXXXXXXXXX_=
Content-Type: image/jpeg
Content-ID: <_2_XXXXXXXXXXXXXX>
Content-Transfer-Encoding: base64
555555555555555555555555555555555555555555555555555555
--=_related XXXXXXXXXXXXXX_=--
"@
$regex1 = "(?ms).+?Content-Transfer-Encoding: base64(.+?)--=_alternative"
$text1 = ([regex]::Matches($text,$regex1) | foreach {$_.groups[1].value})
Write-Host "text1 : " -fore red
Write-Host $text1
#I want to get as output elements (of array, maybe, or one after another)
#1). text between Content-Transfer-Encoding: base64 and --=_alternative, if there is above line Content-Type: text/html
#this
#1111111111111111111111111111111111111111111111111111111
#then this
#2222222222222222222222222222222222222222222222222222222
$regex2 = "(?ms).+?Content-Transfer-Encoding: base64(.+?)--=_related"
$text2 = ([regex]::Matches($text,$regex2) | foreach {$_.groups[1].value})
#I want to get as output elements (of array, maybe, or one after another)
#2). text between Content-Transfer-Encoding: base64 and --=_related, if there is two lines above line Content-Type: image/jpeg
#this
#3333333333333333333333333333333333333333333333333333333
#then this
#4444444444444444444444444444444444444444444444444444444
#then this
#5555555555555555555555555555555555555555555555555555555
Write-Host "text2 : " -fore red
Write-Host $text2
谢谢你的帮助。祝你今天过得愉快。
PS 基于 Jessie Westlake 的代码,这里是 RegEx 的一个小编辑版本,对我有用:
$files = Get-ChildItem -Path "\\<SERVER_NAME>\mailroot\Drop"
Foreach ($file in $files){
$text = Get-Content $file.FullName
$RegexText = '(?:Content-Type: text/html.+?Content-Transfer-Encoding: base64(.+?)(?:--=_))'
$RegexImage = '(?:Content-Type: image/jpeg.+?Content-Transfer-Encoding: base64(.+?)(?:--=_))'
$TextMatches = [Regex]::Matches($text, $RegexText, [System.Text.RegularExpressions.RegexOptions]::Singleline)
$ImageMatches = [Regex]::Matches($text, $RegexImage, [System.Text.RegularExpressions.RegexOptions]::Singleline)
If ($TextMatches[0].Success)
{
Write-Host "Found $($TextMatches.Count) Text Matches:"
Write-Output $TextMatches.ForEach({$_.Groups[1].Value})
}
If ($ImageMatches[0].Success)
{
Write-Host "Found $($ImageMatches.Count) Image Matches:"
Write-Output $ImageMatches.ForEach({$_.Groups[1].Value})
}
}