I'm trying to clean up a CMS database: all of the content has inline styles, and I need to strip them.

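I have a lot of nested tags, so I'm trying to replace the <span> tags with <h3> (I'm sure the headings are not nested), and then I'll clean up the remaining tags with HTMLPurifier.

I wrote this line to replace the <span> tags with <h3>: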

$string = preg_replace( '/<span style="line-height: 17pt; font-family: helvetica; color: rgb\(85, 85, 85\); font-size: 13pt; font-weight: bold;">(.*?)<\/span>/', '<h3>$1</h3>',$string);

It works in every case except this one:

<span style="line-height: 17pt; font-family: helvetica; color: rgb(85, 85, 85); font-size: 13pt; font-weight: bold;">"Rischio obsolescenza" per i lettori Blu-ray</span>

Maybe the " in the text is the problem.

How can I fix this?


1 Answer


No, the quotes aren't the problem, and the regex does match in my tests. Are you sure there isn't a newline somewhere between the tags? The dot does not match newlines unless you use the /s modifier. So, please try:

$string = preg_replace( '/<span style="line-height: 17pt; font-family: helvetica; color: rgb\(85, 85, 85\); font-size: 13pt; font-weight: bold;">(.*?)<\/span>/s', '<h3>$1</h3>',$string);
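To see the difference the /s modifier makes, here is a minimal sketch. It assumes the failing heading contains a line break; the `$multiLine` value is a made-up variant of the string from the question, since the actual database row isn't shown, and the variable names are only illustrative.

```php
<?php
// Pattern from the question, without any modifier.
$pattern = '/<span style="line-height: 17pt; font-family: helvetica; color: rgb\(85, 85, 85\); font-size: 13pt; font-weight: bold;">(.*?)<\/span>/';

$open = '<span style="line-height: 17pt; font-family: helvetica; color: rgb(85, 85, 85); font-size: 13pt; font-weight: bold;">';

// Hypothetical input: same heading as in the question, but with a newline in the text.
$multiLine = $open . "\"Rischio obsolescenza\"\nper i lettori Blu-ray</span>";

// Without /s the dot stops at the newline, so nothing matches and the
// string comes back unchanged.
$withoutS = preg_replace($pattern, '<h3>$1</h3>', $multiLine);

// With /s (PCRE_DOTALL) the dot also matches newlines, so the span is replaced.
$withS = preg_replace($pattern . 's', '<h3>$1</h3>', $multiLine);

var_dump($withoutS === $multiLine); // bool(true): no replacement happened
var_dump($withS);                   // the heading wrapped in <h3>...</h3>, newline preserved
```

If the output of the first `var_dump` is `true` for your real data as well, a line break inside the span is the likely cause.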
answered 2013-06-21T07:50:35.713