
I have some shell code and need to add HTML markup to plain text. In other words, I need to format the text as basic HTML.

This is the shell script I have so far:

#!/bin/sh
# t2h {$1} html-ize a text file and save as foo.htm
NL="
"
cat $1 \
| sed -e 's/ at / at /g' \
| sed -e 's/[[:cntrl:]]/ /g'\
| sed -e 's/^[[:space:]]*$//g' \
| sed -e '/^$/{'"$NL"'N'"$NL"'/^\n$/D'"$NL"'}' \
| sed -e 's/^$/<\/UL><P>/g' \
| sed -e '/<P>$/{'"$NL"'N'"$NL"'s/\n//'"$NL"'}'\
| sed -e 's/<P>[[:space:]]*"/<P><UL>"/' \
| sed -e 's/^[[:space:]]*-/<BR> -/g' \
| sed -e 's/http:\/\/[[:graph:]\.\/]*/<A HREF="&">[&]<\/A> /g'\
                                > foo.htm

Here is a sample of the text:

"Greatest properly off ham exercise all. Unsatiable invitation its possession nor off. All difficulty estimating unreserved increasing the solicitude. Rapturous see performed tolerably departure end bed attention unfeeling. On unpleasing principles alteration of. Be at performed preferred determine collected. Him nay acuteness discourse listening estimable our law. Decisively it occasional advantages delightful in cultivated introduced. Like law mean form are sang loud lady put.     "

"Greatest properly off ham exercise all. Unsatiable invitation its possession nor off. All difficulty estimating unreserved increasing the solicitude. Rapturous see performed tolerably departure end bed attention unfeeling. On unpleasing principles alteration of. Be at performed preferred determine collected. Him nay acuteness discourse listening estimable our law. Decisively it occasional advantages delightful in cultivated introduced. Like law mean form are sang loud lady put.     "

It needs to end up like this:

<html>
<p>Greatest properly off ham exercise all. Unsatiable invitation its possession nor off. All difficulty estimating unreserved increasing the solicitude. Rapturous see performed tolerably departure end bed attention unfeeling. On unpleasing principles alteration of. Be at performed preferred determine collected. Him nay acuteness discourse listening estimable our law. Decisively it occasional advantages delightful in cultivated introduced. Like law mean form are sang loud lady put.</p><br />
<br />
<p>Greatest properly off ham exercise all. Unsatiable invitation its possession nor off. All difficulty estimating unreserved increasing the solicitude. Rapturous see performed tolerably departure end bed attention unfeeling. On unpleasing principles alteration of. Be at performed preferred determine collected. Him nay acuteness discourse listening estimable our law. Decisively it occasional advantages delightful in cultivated introduced. Like law mean form are sang loud lady put.</p>
</html>

2 Answers


Hi, try this bash script. DataFile is your file containing the text to format.

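#!/bin/bash
# Wrap each non-empty line of DataFile in <p>...</p> and write the result to foo.htm
echo '<html>' >foo.htm
# Delete double quotes, drop empty lines, then emit one paragraph per line.
# read -r keeps backslashes literal; default IFS trims surrounding whitespace.
tr -d '"' <DataFile | grep -v "^$" |
while read -r Line
do
cat <<eof
  <p>${Line}</p><br><br>
eof
done >>foo.htm
echo '</html>' >>foo.htm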
answered 2013-10-31T02:47:13.500

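I see you want to use "sed" to replace newline characters, for whatever reason.
That won't work as written.
"sed" is blind to newlines because it is line-oriented: it strips the trailing newline from each line before it starts evaluating its commands, so a substitution never sees that newline unless a command such as N has first joined the next line into the pattern space.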
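
A small demonstration of that behaviour, as a sketch only. The function name try_newline_subst is made up for this illustration. The substitution looks for a newline, but each line reaches sed with its newline already removed, so nothing matches and the two lines come out unchanged.

```shell
#!/bin/sh
# Hypothetical demo: try to replace newlines with spaces using a plain s command.
# sed processes "one" and "two" separately, each without its trailing newline,
# so the pattern \n has nothing to match.
try_newline_subst() {
    printf 'one\ntwo\n' | sed 's/\n/ /'
}

try_newline_subst
```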

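The simplest workaround is to use "tr" to replace every newline with another character before running "sed". That other character should be one that cannot occur in the text file (otherwise you have to devise your own escape sequence one way or another).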
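
A minimal sketch of that workaround, under two loud assumptions: the character '~' never occurs in the input, and paragraphs are separated by one or more blank lines. The function name join_paragraphs is hypothetical. tr turns the whole file into one long line, sed can then treat paragraph breaks as ordinary characters, and a second tr restores the newlines.

```shell
#!/bin/sh
# Hypothetical helper: wrap blank-line-separated paragraphs in <p>...</p>.
# ASSUMPTION: '~' does not appear anywhere in the input text.
join_paragraphs() {
    tr '\n' '~' |                      # newlines become '~', so sed sees one line
    sed -e 's|~~*$||' \
        -e 's|~~~*|</p>~<p>|g' \
        -e 's|^|<p>|' \
        -e 's|$|</p>|' |               # strip trailing breaks, split paragraphs, wrap ends
    tr '~' '\n'                        # turn the placeholders back into newlines
}
```

For example, `printf 'A\n\nB\n' | join_paragraphs` yields one `<p>` element per paragraph. A single newline inside a paragraph is left alone, because only runs of two or more placeholders are treated as a paragraph break.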

If you have fixed these errors and it still doesn't work, please post your corrected code.
Also post what you expect each sed command to do.

PS: You might be better off using bash's native string handling.
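
One way that could look, as a rough sketch rather than a finished converter. The function name text_to_html is made up here, and the sketch assumes each paragraph sits on a single line wrapped in double quotes, as in the sample. It uses only parameter expansions that bash shares with POSIX sh, and it leaves out the <br /> separators shown in the desired output.

```shell
#!/bin/bash
# Hypothetical helper: wrap each non-empty input line in <p>...</p>
# using only the shell's own string handling (no sed, tr, or grep).
text_to_html() {
    printf '%s\n' '<html>'
    # The || test also picks up a final line that lacks a trailing newline.
    while read -r line || [ -n "$line" ]; do
        line=${line#\"}                          # drop one leading double quote
        line=${line%\"}                          # drop one trailing double quote
        line=${line%"${line##*[![:space:]]}"}    # trim trailing whitespace
        [ -n "$line" ] || continue               # skip blank lines
        printf '<p>%s</p>\n' "$line"
    done
    printf '%s\n' '</html>'
}
```

It reads standard input, so it could be called as `text_to_html <DataFile >foo.htm`.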

answered 2013-10-30T18:52:37.023