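I have text in paragraph format, and each article always has a date on the line above it. The problem is that each article is followed by an unknown number of line breaks, and they are different kinds of Unicode line breaks. I need to remove every run of line breaks between articles and replace it with exactly two newlines (\n\n).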
So from this:
05/12
The 1959 Mexico hurricane was a devastating tropical cyclone
that was one of the worst ever Pacific hurricanes. It
impacted the Pacific coast of Mexico in October 1959. The
hurricane killed at least 1,000 people.
11/01
The 1959 Mexico hurricane was a devastating tropical cyclone
that was one of the worst ever Pacific hurricanes. It
impacted the Pacific coast of Mexico in October 1959. The
hurricane killed at least 1,000 people.
to this:
05/12
The 1959 Mexico hurricane was a devastating tropical cyclone
that was one of the worst ever Pacific hurricanes. It
impacted the Pacific coast of Mexico in October 1959. The
hurricane killed at least 1,000 people.

11/01
The 1959 Mexico hurricane was a devastating tropical cyclone
that was one of the worst ever Pacific hurricanes. It
impacted the Pacific coast of Mexico in October 1959. The
hurricane killed at least 1,000 people.
I tried using preg_replace(), but it doesn't match every instance:
$text = preg_replace('/\r?\n+(?=\d{2}\/\d{2})/', "\n\n", $text);
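A likely reason the pattern above misses cases: `\r?\n+` only covers LF, optionally preceded by a single CR. A lone CR, a run of CRLF pairs, or Unicode separators such as U+0085, U+2028 and U+2029 are not consumed, so only the last break before the date gets replaced. Below is a minimal sketch of one way around this using PCRE's `\R` (any line break sequence) together with the `u` modifier. It assumes the text is valid UTF-8 and that each date sits alone on its own line in `dd/dd` form; the function name `normalizeParagraphBreaks` is made up for illustration.

```php
<?php
// Hypothetical helper name; not part of the original post.
function normalizeParagraphBreaks(string $text): string
{
    // (?:\h*\R)+  one or more line breaks of any kind, each optionally
    //             preceded by trailing horizontal whitespace
    // (?=...)     only when the following line consists solely of a dd/dd date
    // u           treat pattern and subject as UTF-8, so \R also covers
    //             U+0085, U+2028 and U+2029
    $result = preg_replace(
        '/(?:\h*\R)+(?=\d{2}\/\d{2}\h*(?:\R|\z))/u',
        "\n\n",
        $text
    );

    // preg_replace() returns null on error (for example, invalid UTF-8 input),
    // in which case this sketch falls back to the unmodified text.
    return $result ?? $text;
}

// Sample input mixing CRLF, a lone CR, U+2028 and a whitespace-only line.
$sample = "05/12\nFirst article.\r\n\r\n\u{2028}11/01\nSecond article.\r\r\n \n03/07\nThird article.";

echo normalizeParagraphBreaks($sample);
```

Line breaks inside an article and the one between a date and its text are left as they are, because the lookahead only fires directly before a date line. If those should be normalized too, a second pass such as `preg_replace('/\R/u', "\n", $text)` before this step would be one option.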