
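I'm trying to convert this Python script (diff.py)

http://www.aaronsw.com/2002/diff/

into exactly the same thing on my own website, i.e. a web interface. He provides a script you can download, and I have it running from the command line on my Windows machine, but I want it to work on my server as well. I'm so close. Here is what I have so far.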

Here is my HTML document:

<form action="/cgi-bin/diff.py" method="get"><p>
<strong>Old URL:</strong> <input name="old" type="text"><br>
<strong>New URL:</strong> <input name="new" type="text"><br>
<input value="Diff!" type="submit">
</p></form>

Here is my edited diff.py script, which almost works:

#!G:\Program Files\Python25\python.exe
"""HTML Diff: http://www.aaronsw.com/2002/diff
Rough code, badly documented. Send me comments and patches.

__author__ = 'Aaron Swartz <me@aaronsw.com>'
__copyright__ = '(C) 2003 Aaron Swartz. GNU GPL 2 or 3.'
__version__ = '0.22' """

import cgi
import cgitb; cgitb.enable()
form = cgi.FieldStorage()
reshtml = """Content-Type: text/html\n
<html>
<head><title>Test</title></head>
<body>
"""
print reshtml
a = form['old'].value
b = form['new'].value

import difflib, string

def isTag(x): return x[0] == "<" and x[-1] == ">"

def textDiff(a, b):
    """Takes in strings a and b and returns a human-readable HTML diff."""

    out = []
    a, b = html2list(a), html2list(b)
    s = difflib.SequenceMatcher(None, a, b)
    for e in s.get_opcodes():
        if e[0] == "replace":
            # @@ need to do something more complicated here
            # call textDiff but not for html, but for some html... ugh
            # gonna cop-out for now
            out.append('<del class="diff modified">'+''.join(a[e[1]:e[2]]) +   '</del><ins class="diff modified">'+''.join(b[e[3]:e[4]])+"</ins>")
        elif e[0] == "delete":
            out.append('<del class="diff">'+ ''.join(a[e[1]:e[2]]) + "</del>")
        elif e[0] == "insert":
            out.append('<ins class="diff">'+''.join(b[e[3]:e[4]]) + "</ins>")
        elif e[0] == "equal":
            out.append(''.join(b[e[3]:e[4]]))
        else: 
            raise ValueError("Um, something's broken. I didn't expect a '" + repr(e[0]) + "'.")
    return ''.join(out)

def html2list(x, b=0):
    mode = 'char'
    cur = ''
    out = []
    for c in x:
        if mode == 'tag':
            if c == '>': 
                if b: cur += ']'
                else: cur += c
                out.append(cur); cur = ''; mode = 'char'
            else: cur += c
        elif mode == 'char':
            if c == '<': 
                out.append(cur)
                if b: cur = '['
                else: cur = c
                mode = 'tag'
            elif c in string.whitespace: out.append(cur+c); cur = ''
            else: cur += c
    out.append(cur)
    return filter(lambda x: x != '', out)

if __name__ == '__main__':
    import sys
    try:
        a, b = sys.argv[1:3]
    except ValueError:
        print "htmldiff: highlight the differences between two html files"
        print "usage: " + sys.argv[0] + " a b"
        sys.exit(1)
    print textDiff(open(a).read(), open(b).read())

print '</body>'
print '</html>'

This is what I get in the browser:

htmldiff: highlight the differences between two html files usage: E:/xampp/cgi-bin/diff.py a b 

Can anyone see what's wrong?

OK, here is the error I get when I use print open(a).read():

A problem occurred in a Python script. Here is the sequence of function calls leading up to the error, in the order they occurred.
 E:\xampp\cgi-bin\diff2.py in ()
   19 b = form['new'].value
   20 
   21 print open(a).read()
   22 
   23 
builtin open = <built-in function open>, a = 'http://www.google.com', ).read undefined

<type 'exceptions.IOError'>: [Errno 2] No such file or directory: 'http://www.google.com'
    args = (2, 'No such file or directory')
    errno = 2
    filename = 'http://www.google.com'
    message = ''
    strerror = 'No such file or directory'
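
The traceback points at the cause: open() only understands local file paths, so a URL such as 'http://www.google.com' is looked up as a file name and fails. Below is a minimal sketch of one way around that. It is written for Python 3 rather than the Python 2.5 used above, and the helper name read_source and its fetch parameter are hypothetical, introduced only for illustration.

```python
from urllib.parse import urlparse
from urllib.request import urlopen


def read_source(location, fetch=urlopen):
    """Return the raw bytes behind an http(s) URL or a local file path.

    `fetch` is a hypothetical hook (defaulting to urlopen) so the
    function can be exercised without network access.
    """
    scheme = urlparse(location).scheme.lower()
    if scheme in ("http", "https"):
        # Remote document: open() cannot read this, so fetch it over HTTP.
        response = fetch(location)
        try:
            return response.read()
        finally:
            response.close()
    # Anything else is treated as a local path, which is what open() expects.
    with open(location, "rb") as handle:
        return handle.read()
```

In the CGI script, a call like read_source(a) would take the place of open(a).read(). The result is bytes, so it would need decoding before being diffed as text.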

OK, I think I actually figured it out myself. Here are the necessary changes. The listing stops where the original code begins:

#!G:\Program Files\Python25\python.exe
"""HTML Diff: http://www.aaronsw.com/2002/diff
Rough code, badly documented. Send me comments and patches.

__author__ = 'Aaron Swartz <me@aaronsw.com>'
__copyright__ = '(C) 2003 Aaron Swartz. GNU GPL 2 or 3.'
__version__ = '0.22' """


import cgi
import cgitb; cgitb.enable()
form = cgi.FieldStorage()
reshtml = """Content-Type: text/html\n
<html>
<head><title>Tonys Test</title></head>
<body>
"""
print reshtml
old2 = form['old'].value
new2 = form['new'].value

import urllib2

a = urllib2.urlopen(old2).read()
b = urllib2.urlopen(new2).read()

#print a
#print b

import difflib, string

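Well, I spoke too soon. It works, but it doesn't highlight the differences; all I get is the old version struck through. I tried adding back the part I had cut out, which is supposed to do the highlighting, but that didn't work: I got my original error message again. I'll keep working on it.

OK, it finally works. I had to add this code at the end: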

def htmlDiff(a, b):
    f1, f2 = a.find('</head>'), a.find('</body>')
    ca = a[f1+len('</head>'):f2]

    f1, f2 = b.find('</head>'), b.find('</body>')
    cb = b[f1+len('</head>'):f2]

    r = textDiff(ca, cb)
    hdr = '<style type="text/css"><!-- ins{background-color: #bbffbb} del{background-color: #ffcccc}--></style></head>'
    return b[:f1] + hdr + r + b[f2:]


print htmlDiff(a, b)
print '</body>'
print '</html>'

I found this code in the version 0.1 download.
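
One caveat about the slicing in htmlDiff: str.find returns -1 when the tag is missing, and it is case-sensitive, so a page without a lowercase </head> or </body> gets sliced in the wrong place without any error. Here is a rough Python 3 sketch of a more defensive split. The function name split_document is my own invention, and falling back to treating the whole page as content is just one possible choice.

```python
def split_document(html):
    """Split html into (before, content, after) around </head> ... </body>.

    If either tag is missing, or they appear in the wrong order, the
    whole string is returned as content (an assumed fallback).
    """
    marker = "</head>"
    head_end = html.find(marker)
    body_end = html.find("</body>")
    if head_end == -1 or body_end == -1 or body_end < head_end:
        # find() signals "not found" with -1; slicing with it would be wrong.
        return "", html, ""
    start = head_end + len(marker)
    return html[:head_end], html[start:body_end], html[body_end:]
```

The three parts correspond to b[:f1], the text handed to textDiff, and b[f2:] in the code above.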


1 Answer


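This block is the problem. Under CGI the script still runs as __main__ but receives no command-line arguments, so unpacking sys.argv[1:3] raises ValueError and the usage message is printed instead of the diff: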

if __name__ == '__main__':
    import sys
    try:
        a, b = sys.argv[1:3]
    except ValueError:
        print "htmldiff: highlight the differences between two html files"
        print "usage: " + sys.argv[0] + " a b"
        sys.exit(1)

Remove it.

And this line:

print textDiff(open(a).read(), open(b).read())

should become

print textDiff(a, b)
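
To illustrate what diffing strings rather than file names looks like, here is a rough Python 3 port of the two core functions from the question. It is a sketch, not the original script: the name text_diff and the simplified html2list (without the b flag) are my own adaptations, and it assumes a and b already hold the HTML text rather than file names or URLs.

```python
import difflib
import string


def html2list(text):
    """Split HTML into tokens: whole tags, and words with trailing whitespace."""
    tokens, cur, in_tag = [], "", False
    for c in text:
        if in_tag:
            cur += c
            if c == ">":  # end of tag: emit it as one token
                tokens.append(cur)
                cur, in_tag = "", False
        elif c == "<":  # start of tag: flush the pending word first
            tokens.append(cur)
            cur, in_tag = c, True
        elif c in string.whitespace:
            tokens.append(cur + c)
            cur = ""
        else:
            cur += c
    tokens.append(cur)
    return [t for t in tokens if t != ""]


def text_diff(old, new):
    """Return an HTML diff of two HTML strings (not file names)."""
    a, b = html2list(old), html2list(new)
    out = []
    matcher = difflib.SequenceMatcher(None, a, b)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        removed, added = "".join(a[i1:i2]), "".join(b[j1:j2])
        if tag == "equal":
            out.append(added)
        elif tag == "delete":
            out.append('<del class="diff">' + removed + "</del>")
        elif tag == "insert":
            out.append('<ins class="diff">' + added + "</ins>")
        else:  # "replace": show the old tokens deleted, the new ones inserted
            out.append('<del class="diff modified">' + removed + "</del>")
            out.append('<ins class="diff modified">' + added + "</ins>")
    return "".join(out)
```

For example, text_diff('<p>hello world</p>', '<p>hello there</p>') leaves the p tags and "hello " untouched and wraps "world" in del and "there" in ins.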
answered 2012-11-24T20:27:04.380