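Multi-currency REGEXP_SUBSTR Oracle11g
This is my first time posting a question here, so I hope I don't mess it up too badly.
I have created a query that pulls various customer order details, including the price charged, the product list price and, most importantly, the price that was sent to the customer in a reminder email.
Using REGEXP_SUBSTR, I can match all the prices in the various currencies within the email's HTML content, but I am having trouble with the output for certain price and currency-abbreviation combinations that lack a comma or period, e.g. 123 kr, 999 Pesos or 1 050 Kč.
How can I make the output for these cases match the output for the other price formats?
I took a lot of "inspiration" from Gary's answer: Regex currency validation.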
Data source HTML
The required values start at <!-- START Price Exp.. -->:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Thank you for using us</title>
<style type="text/css">
.ReadMsgBody {
width: 100%;
}
.ExternalClass {
width: 100%;
}
BODY {
font-family: OpenSans, Arial, Helvetica, sans-serif;
font-size:13px;
color:#555555;
}
TD {
font-family: OpenSans, Arial, Helvetica, sans-serif;
font-size:13px;
color:#555555;
vertical-align: top;
}
A {
color: #f48024;
}
IMG {
display:block;
border: none;
}
H1 {
font-size: 18pt;
}
H2 {
font-size: 15pt;
}
H1, H2, H3, P, UL, LI {
margin: 0;
padding: 0;
}
</style>
</head>
<body style="margin: 0; padding: 0; background-color: #eeeeee" bgcolor="#eeeeee">
<table width="100%" border="0" cellpadding="0" cellspacing="0" style="margin: 0; padding: 0; ">
<tbody>
<tr>
<td align="center" width="100%" >
<!-- TOP-->
<table bgcolor="#eeeeee" border="0" cellpadding="0" cellspacing="0" style="background-color: #eeeeee; width:100%; max-width:900px; ">
<tbody>
<tr>
<td height="34" style="font-size: 1px;"><!-- cell --></td>
</tr>
</tbody>
</table>
<!-- END TOP-->
<!-- LOGO -->
<table align="center" cellspacing="0" cellpadding="0" border="0" bgcolor="#fff" style="width:100%; background-color: #fff; max-width:900px;">
<tr>
<td>
<table align="center" cellspacing="0" cellpadding="0" border="0" bgcolor="#fff" style="background-color: #fff; text-align: center; max-width:650px;">
<tr>
<td align="center" height="40" bgcolor="#fff" style="background-color: #fff; vertical-align: middle; text-align: center; ">
<a href="https://www.company.com/" target="_blank"><img align="center" style="" src="https://cdn.sstatic.net/Sites/stackoverflow/img/apple-touch-icon@2.png?v=73d79a89bded" style="display:block" alt="" /></a>
</td>
</tr>
</table>
</td>
</tr>
</table>
<!-- END LOGO-->
<table align="center" cellspacing="0" cellpadding="0" border="0" bgcolor="#fff" style="width:100%; max-width:900px; background-color: #fff;">
<tr>
<td>
<table align="center" cellspacing="0" cellpadding="0" border="0" bgcolor="#fff" style="width:100%; background-color: #fff; max-width:800px; padding-left:10px; padding-right:10px;">
<tr>
<td height="30" style="font-size: 1px;"><!-- cell --></td>
</tr>
<tr>
<td>
<p style="text-align: center; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:22px; color:#41424e; line-height: 1.4"><b>Example Template</b></p>
</td>
</tr>
<tr>
<td height="34" style="font-size: 1px;"><!-- cell --></td>
</tr>
<tr>
<td>
<p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#767683; line-height: 1.4">Dear customername,</p>
</td>
</tr>
<tr>
<td height="20" style="font-size: 1px;"><!-- cell --></td>
</tr>
<tr>
<td>
<p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#767683; line-height: 1.4">Your productname - 1PC has been successfully renewed.</p>
</td>
</tr>
<tr>
<td height="20" style="font-size: 1px;"><!-- cell --></td>
</tr>
<tr>
<td>
<p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#767683; line-height: 1.4">Details of your sub below.</p>
</td>
</tr>
<tr>
<td height="40" style="font-size: 1px;"><!-- cell --></td>
</tr>
</table>
</td>
</tr>
</table>
<!-- sum-->
<table align="center" cellspacing="0" cellpadding="0" border="0" bgcolor="#F2F2F6" style="width:100%; background-color: #F2F2F6; max-width:900px;">
<tr>
<td>
<table align="center" cellspacing="0" cellpadding="0" border="0" bgcolor="#F2F2F6" style="width:100%; background-color: #F2F2F6; max-width:800px; padding-left:10px; padding-right:10px;">
<tr>
<td height="34" style="font-size: 1px;"><!-- cell --></td>
</tr>
<tr>
<td>
<p style="text-align: center; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:20px; color:#41424e; line-height: 1.4"><b>Your Auto-Renewal Sub</b></p>
</td>
</tr>
<tr>
<td height="20" style="font-size: 1px;"><!-- cell --></td>
</tr>
<tr>
<td>
<p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#42ba8f; line-height: 1.4"><b>Product</b></p>
</td>
</tr>
<tr>
<td height="1" style="font-size: 1px;"><!-- cell --></td>
</tr>
<tr>
<td>
<p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#41424e; line-height: 1.4">Productname - 1 PC</p>
</td>
</tr>
<tr>
<td height="20" style="font-size: 1px;"><!-- cell --></td>
</tr>
<tr>
<td>
<p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#42ba8f; line-height: 1.4"><b>Order ID</b></p>
</td>
</tr>
<tr>
<td height="1" style="font-size: 1px;"><!-- cell --></td>
</tr>
<tr>
<td>
<p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#41424e; line-height: 1.4">12131415161</p>
</td>
</tr>
<tr>
<td height="20" style="font-size: 1px;"><!-- cell --></td>
</tr>
<tr>
<td>
<p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#42ba8f; line-height: 1.4"><b>Exp Prices charged</b></p>
</td>
</tr>
<tr>
<td height="1" style="font-size: 1px;"><!-- cell --></td>
</tr>
<!--START Price Exp, templates could be in numerous different languages but info like i.e. customername, productname, Order ID, Tracking IDs will always use the same format. -->
<tr>
<td>
<p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#41424e; line-height: 1.4">$69.99 (a tax message)</p>
</td>
</tr>
<tr>
<td height="1" style="font-size: 1px;"><!-- cell --></td>
</tr>
<tr>
<td>
<p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#41424e; line-height: 1.4">123 kr (b tax message)</p>
</td>
</tr>
<tr>
<td height="1" style="font-size: 1px;"><!-- cell --></td>
</tr>
<tr>
<td>
<p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#41424e; line-height: 1.4">999 Pesos (c tax message)</p>
</td>
</tr>
<!--END Price Exps -->
<tr>
<td height="20" style="font-size: 1px;"><!-- cell --></td>
</tr>
<tr>
<td>
<p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#42ba8f; line-height: 1.4"><b>Automatically renewed</b></p>
</td>
</tr>
<tr>
<td height="1" style="font-size: 1px;"><!-- cell --></td>
</tr>
<tr>
<td>
<p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#41424e; line-height: 1.4">May 20, 2018</p>
</td>
</tr>
<tr>
<td height="42" style="font-size: 1px;"><!-- cell --></td>
</tr>
</table>
</td>
</tr>
</table>
<!--END sum -->
<!-- white-->
<table align="center" cellspacing="0" cellpadding="0" border="0" bgcolor="#fff" style="width:100%; max-width:900px; background-color: #fff;">
<tr>
<td height="15" style="font-size: 1px;"><!-- cell --></td>
</tr>
<tr>
<td>
<table align="center" cellspacing="0" cellpadding="0" border="0" bgcolor="#fff" style="width:100%; background-color: #fff; max-width:800px; padding-left:10px; padding-right:10px;">
<tr>
<td height="30" style="font-size: 1px;"><!-- cell --></td>
</tr>
<tr>
<td>
<p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#41424e; line-height: 1.4">If you’d like to check your order status, please sign in to <a href="https://www.company.com/en-us/order?pgm=6916670010" target="_blank">company.com/orders</a> with the login credentials below.</p>
</td>
</tr>
<tr>
<td height="20" style="font-size: 1px;"><!-- cell --></td>
</tr>
<tr>
<td>
<p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#41424e; line-height: 1.4"><b>Order ID:</b> 12131415161</p>
</td>
</tr>
<tr>
<td height="1" style="font-size: 1px;"><!-- cell --></td>
</tr>
<tr>
<td>
<p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#41424e; line-height: 1.4"><b>Password:</b> stAcKoverFlOwrocks</p>
</td>
</tr>
<tr>
<td height="20" style="font-size: 1px;"><!-- cell --></td>
</tr>
<tr>
<td>
<p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#42ba8f; line-height: 1.4"><b>Your Plan</b></p>
</td>
</tr>
<tr>
<td height="1" style="font-size: 1px;"><!-- cell --></td>
</tr>
<tr>
<td>
<p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#41424e; line-height: 1.4"><strong>Auto-Renewal Terms</strong><p>By completing your purchase, you have authorized us to do a bunch of legal stuff.</p>
</td>
</tr>
<tr>
<td height="30" style="font-size: 1px;"><!-- cell --></td>
</tr>
<tr>
<td>
<p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#41424e; line-height: 1.4"><b>Need help?</b></p>
</td>
</tr>
<tr>
<td height="1" style="font-size: 1px;"><!-- cell --></td>
</tr>
<tr>
<td style="vertical-align: middle;">
<p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#41424e; line-height: 1.4"><a href="https://company.com/en_US/support" target="_blank">company.com/help</a></p>
</td>
</tr>
<tr>
<td height="30" style="font-size: 1px;"><!-- cell --></td>
</tr>
<tr>
<td>
<p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#41424e; line-height: 1.4">Thanks for trusting us.</p>
</td>
</tr>
<tr>
<td height="34" style="font-size: 1px;"><!-- cell --></td>
</tr>
</table>
</td>
</tr>
</table>
<!-- END white -->
<!--FOOTER-->
<table align="center" cellspacing="0" cellpadding="0" border="0" bgcolor="#777684" style="width:100%; background-color: #E7E7EF; max-width:900px;">
<tr>
<td height="30" style="font-size: 1px;"><!-- cell --></td>
</tr>
<tr>
<td>
<table align="center" cellspacing="0" cellpadding="0" border="0" bgcolor="#777684" style="width:100%; background-color: #E7E7EF; max-width:900px;">
<tr>
<td>
<table><tr><td>
<p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:14px; color:#767683; line-height: 1.4">Trouble installing? <u><a href="https://www.company.com/en-us/faq.php" target="_blank">Visit FAQ</a></u></p>
<p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:14px; color:#767683; line-height: 1.4">Curious for more? <u><a href="https://www.company.com/en-us" target="_blank">Find more</a></u></p>
</td></tr></table>
</td>
<td>
<table align="right" >
</tr> </table>
</td>
</tr>
<tr>
<td height="30" style="font-size: 1px;"><!-- cell --></td>
</tr>
</table>
</td>
</tr>
</table>
<!--END FOOTER -->
</td>
</tr>
</tbody>
</table>
</body>
</html>
Regex
(NT\$|SAR)\s(\d{2,5})|\d{1,4}([.,]\d{3})*([\s.,]\d{2,3}|[^\W]\d+(\d{1,4})*\s(kr|zł|Pesos|Kč|Ft|บาท|SAR|₪))
Matches all of
- 1.4">$19.99 (some random text)
- 1.4">20.00 reais
- 1.4">20.00€</li>
- 1.4">€25,99
- 1.4">£15.99
- 1.4"> 123 kr
- 1.4"> 1234 Ft
- 1.4"> 999 Pesos
Output
- 19.99
- 20.00
- 20.00
- 25.99
- 15.99
- 123 kr
- 1234 Ft
- 999 Pesos
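The last three examples should not include the space and/or any letters after the number.
How can I remove them from the output but keep the number?
I realize this is probably because I have multiple capture groups, so I see three potential solutions:
- Improve the regex to eliminate the overuse of groups (I am not skilled enough to figure this out)...
- Somehow write the regex with non-capturing groups that produce the desired output. I was sad to learn that (?:) simply does not work.
- Use the SQL function's arguments to select a subexpression from REGEXP_SUBSTR. However, this does not seem to allow more than one subexpression in the output.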
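The third option can be sketched roughly as follows. This is only a minimal illustration, assuming Oracle 11g, where REGEXP_SUBSTR accepts a sixth argument that selects a single subexpression. The pattern is a deliberately simplified, hypothetical stand-in for the full regex above, and the inline `samples` rows are made up from the three price lines in the HTML. Wrapping the whole numeric part in one outer group works around the one-subexpression limit, because only that group has to be returned:

```sql
-- Hypothetical test rows copied from the price lines of the sample email.
WITH samples AS (
  SELECT '$69.99 (a tax message)'    AS txt FROM dual UNION ALL
  SELECT '123 kr (b tax message)'    AS txt FROM dual UNION ALL
  SELECT '999 Pesos (c tax message)' AS txt FROM dual
)
SELECT txt,
       -- Arguments: source, pattern, start position, occurrence,
       -- match parameter, subexpression number.
       -- Group 1 wraps the whole number; the currency word is matched
       -- by group 3 but is never returned.
       REGEXP_SUBSTR(txt,
                     '(\d+([.,]\d{2})?)\s?(kr|Pesos)?',
                     1, 1, NULL, 1) AS amount
FROM   samples;
```

For these three rows the amount column should come back as 69.99, 123 and 999. The simplified pattern ignores thousands separators and most of the currencies in the real regex, so it would need extending before use on real data.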
SQL
SELECT
  REPLACE(REPLACE(REGEXP_SUBSTR(nnc.MESSAGE, '(NT\$|SAR)\s(\d{2,5})|\d{1,4}([.,]\d{3})*([\s.,]\d{2,3}|[^\W]\d+(\d{1,4})*\s(kr|zł|Pesos|Kč|Ft|บาท|SAR|₪))'), ',', '.'), ' ', '') AS EMAIL_PRICE_SENT
FROM tablename nnc
WHERE clause;
This is the full statement, with a couple of nested REPLACE functions that convert the output to the format our system uses.
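To make the effect of the two nested REPLACE calls concrete, here is a small sketch run against `dual`. The input literals are hypothetical values of the kind REGEXP_SUBSTR might return, not data from the real table:

```sql
-- Inner REPLACE turns a decimal comma into a period;
-- outer REPLACE removes every space (e.g. a space used as a thousands separator).
SELECT raw_match,
       REPLACE(REPLACE(raw_match, ',', '.'), ' ', '') AS email_price_sent
FROM (
  SELECT '25,99'  AS raw_match FROM dual UNION ALL  -- becomes 25.99
  SELECT '1 050'  AS raw_match FROM dual UNION ALL  -- becomes 1050
  SELECT '123 kr' AS raw_match FROM dual            -- becomes 123kr
);
```

The REPLACE calls only touch commas and spaces, so letters such as kr survive this step. A value that uses a comma as a thousands separator, such as 1,234.56, would come out as 1.234.56.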
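- See the regex here: Regex 101 Link.
I know that is the wrong language, so it will not give a 100% accurate test, but I find it very helpful before running against the database. I am always open to suggestions for better tools!
I have spent more time on this than I am proud of, so any help would be greatly appreciated.
Thanks, Nick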