node.js - How can I create a modified version of an HTML document with Node.js?

Question

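I am trying to do the following:

Read the HTML document "myDocument.html" with Node.
Insert the contents of another HTML document named "foo.html" immediately after the opening body tag of myDocument.html.
Insert the contents of another HTML document named "bar.html" immediately before the closing body tag of myDocument.html.
Save the modified version of "myDocument.html".

To do this, I need to search the DOM with Node to find the opening and closing body tags. How can this be done?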

score 1 · Accepted Answer

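It is simple: you can use the native Filesystem module that ships with Node.js (var fs = require("fs")). It lets you read the HTML into a string, run string replacements on it, and save the result by writing the file out again.

The advantage is that this solution is entirely native and needs no external libraries. It also stays completely faithful to the original HTML file.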

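var fs = require("fs");
// Read the file as a UTF-8 string (without an encoding, readFile returns a Buffer).
fs.readFile('myDocument.html', 'utf8', function (err, myDocData) {
      if (err) throw err;
      fs.readFile('foo.html', 'utf8', function (err, fooData) { // reads foo file
          if (err) throw err;
          // replace() returns a new string, so the result has to be assigned back
          myDocData = myDocData.replace(/<body>/, "<body>" + fooData); // adds foo file to HTML
          fs.readFile('bar.html', 'utf8', function (err, barData) { // reads bar file
              if (err) throw err;
              myDocData = myDocData.replace(/<\/body>/, barData + "</body>"); // adds bar file to HTML
              fs.writeFile('myDocumentNew.html', myDocData, function (err) { // writes new file
                  if (err) throw err;
              });
          });
      });
});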
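
The same idea can be sketched with promises, which avoids the nested callbacks. This is only a sketch, not the answerer's code: the names insertIntoBody and buildModifiedDocument are made up for illustration, and it assumes the document has a plain <body> tag without attributes.

```javascript
const fsp = require('fs').promises;

// Hypothetical helper: pure string insertion, no file access.
// Assumes a literal "<body>" opening tag without attributes.
function insertIntoBody(html, foo, bar) {
  // Replacer functions keep "$" sequences in foo/bar from being
  // interpreted as replacement patterns.
  return html
    .replace(/<body>/, (tag) => tag + foo)
    .replace(/<\/body>/, (tag) => bar + tag);
}

// Hypothetical wrapper: reads the three files and writes the result.
async function buildModifiedDocument() {
  const [doc, foo, bar] = await Promise.all([
    fsp.readFile('myDocument.html', 'utf8'),
    fsp.readFile('foo.html', 'utf8'),
    fsp.readFile('bar.html', 'utf8'),
  ]);
  await fsp.writeFile('myDocumentNew.html', insertIntoBody(doc, foo, bar));
}
```

Calling buildModifiedDocument().catch(console.error) would run it against the three files named in the question.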

score 0 · Accepted Answer

Use the Cheerio library, which has a simplified jQuery-ish API.

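var cheerio = require('cheerio');
// cheerio.load() parses the HTML string and returns a jQuery-like query function
var dom = cheerio.load(myDocumentHTMLString);
dom('body').prepend(fooHTMLString); // insert right after the opening body tag
dom('body').append(barHTMLString);  // insert right before the closing body tag
var finalHTML = dom.html();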

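To be clear, since the pro-regex crowd has already turned up in droves: yes, you need a real parser. No, you cannot use regular expressions. Read the post by Stack Overflow lead developer Jeff Atwood about parsing HTML the Cthulhu way.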

score 0 · Accepted Answer

In a simple but imprecise way, you can do this:

str = str.replace(/(<body.*?>)/i, "$1"+read('foo.html'));

str = str.replace(/(<\/body>)/i, read('bar.html')+'$1');

This will not work if the content of myDocument contains more than one '<body ...' or '</body>' (for example inside JavaScript), and foo.html and bar.html must not contain '$1' or '$2'.
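
The '$1' restriction can be seen in a small sketch. The page and foo strings below are made-up inputs. Passing a function as the replacement is one way around the problem, because its return value is inserted literally.

```javascript
// Hypothetical inputs, for illustration only.
const page = '<html><body class="a"><p>hi</p></body></html>';
const foo = '<p>Price: $1</p>';

// String replacement: the "$1" inside foo is expanded to the captured tag.
const broken = page.replace(/(<body.*?>)/i, '$1' + foo);

// Function replacement: the returned string is inserted as-is.
const safe = page.replace(/(<body.*?>)/i, (tag) => tag + foo);

console.log(broken);
console.log(safe);
```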

If you are able to edit the content of myDocument, you can leave some "placeholders" in it (as HTML comments), for example

<!--foo.html-->

Then it is easy: just replace that "placeholder".
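
A minimal sketch of the placeholder approach follows. The helper name fillPlaceholders and the second placeholder <!--bar.html--> are assumptions added for illustration. Comments with no matching entry are left untouched.

```javascript
// Hypothetical helper: replaces every <!--name--> comment with the
// matching entry in `parts`; unknown placeholders are left as they are.
function fillPlaceholders(html, parts) {
  return html.replace(/<!--\s*([\w.-]+)\s*-->/g, (comment, name) =>
    Object.prototype.hasOwnProperty.call(parts, name) ? parts[name] : comment
  );
}

// Made-up template and fragments, standing in for the real files.
const template = '<body><!--foo.html--><p>content</p><!--bar.html--></body>';
const result = fillPlaceholders(template, {
  'foo.html': '<header>Foo</header>',
  'bar.html': '<footer>Bar</footer>',
});
console.log(result);
```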
