2

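I want to compare HTML documents to check whether they contain the same tags in the same arrangement, regardless of differences in inner text and attribute values. I only want to compare the general tag structure. For example,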

<html>
<head>
</head> 
<body>
<span class="my paragraph">comparison of general tag structure of html</span>
</body>
</html>

<html>
<head>
</head> 
<body>
<span class="Mega Offer">free membership offer</span>
</body>
</html>

are the same, while

<html>
<head><title>Different</title>
</head> 
<body>
<span class="my paragraph">comparison of general tag structure of html</span>
</body>
</html>

is not the same, because its tag structure contains an extra title tag, regardless of whether the inner text and attribute values match.

4

2 Answers

0

If you are willing to use PHP, there are several functions, such as preg_match, that look for patterns. You could use file() to read the HTML file into an array, with each line becoming one entry, and then do the same for the other HTML file. Next, find the first tag (that is, something that starts with <) and read the rest of the line up to >. Then search the other HTML file for the same tag, counting how many times it appears. Rinse and repeat.
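The counting idea above can be sketched as follows. This is written in Python rather than PHP, and the names OPEN_TAG, tag_counts and same_tag_counts are made up for illustration. It assumes attribute values never contain a literal > character. Note that equal counts only show that the same tags occur equally often; they say nothing about the order of the tags.

```python
import re
from collections import Counter

# Matches an opening tag such as <span class="x"> and captures its name.
# Closing tags, comments and doctypes are skipped because "<" must be
# followed directly by a letter.
# Assumption: attribute values do not contain a literal ">".
OPEN_TAG = re.compile(r"<([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>")


def tag_counts(html):
    """Count how often each opening tag name occurs (lower-cased)."""
    return Counter(name.lower() for name in OPEN_TAG.findall(html))


def same_tag_counts(html_a, html_b):
    """True if both documents use every tag the same number of times."""
    return tag_counts(html_a) == tag_counts(html_b)
```

Because the order of tags is ignored, a mismatch proves the structures differ, but a match is not enough to prove they are identical.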

answered 2013-07-26T16:08:20.247
0

I would do it in two phases:

Phase 1 (check for equality):
Remove everything between the tags, along with the attributes, then compare the results as (case-insensitive) strings.
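A minimal sketch of this phase, using Python's standard html.parser module. The names TagSkeleton, skeleton and same_structure are hypothetical. HTMLParser lower-cases tag names itself, which gives the case-insensitive comparison for free.

```python
from html.parser import HTMLParser


class TagSkeleton(HTMLParser):
    """Collects tag names only, dropping text and attributes."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # attrs is deliberately ignored; tag is already lower-cased.
        self.parts.append("<" + tag + ">")

    def handle_endtag(self, tag):
        self.parts.append("</" + tag + ">")


def skeleton(html):
    """Return the document reduced to its bare tags, as one string."""
    parser = TagSkeleton()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


def same_structure(html_a, html_b):
    """True if both documents have the same tags in the same order."""
    return skeleton(html_a) == skeleton(html_b)
```

For the first two documents in the question, skeleton() yields the same string, `<html><head></head><body><span></span></body></html>`, so they compare equal; the third contains an extra `<title></title>` pair and does not.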

If they differ, continue with:

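Phase 2 (find the differences):
This phase depends heavily on which differences you want to report, so I cannot give specific advice on how to implement it.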
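As one possible illustration only, assuming the desired report is simply a list of tags that were added or removed, the tag sequences can be fed to Python's standard difflib module. The names TAG, tag_sequence and structure_diff are made up for this sketch, and the regular expression assumes attribute values never contain a literal > character.

```python
import difflib
import re

# Captures the name of an opening or closing tag, e.g. "span" or "/span".
# Assumption: attribute values do not contain a literal ">".
TAG = re.compile(r"<(/?[a-zA-Z][a-zA-Z0-9]*)\b[^>]*>")


def tag_sequence(html):
    """List the bare tags of a document in order, lower-cased."""
    return ["<" + name.lower() + ">" for name in TAG.findall(html)]


def structure_diff(html_a, html_b):
    """Return only the removed ("- ") and added ("+ ") tags, in order."""
    diff = difflib.ndiff(tag_sequence(html_a), tag_sequence(html_b))
    return [line for line in diff if line.startswith(("+ ", "- "))]
```

Comparing the first and third documents from the question this way should report the opening and closing title tags as additions. Other kinds of report (positions, nesting depth, moved subtrees) would need a different approach.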

answered 2018-05-14T11:00:43.767