
I am writing an app that connects to a web server (I am the owner of the server) and sends information provided by the user; the server processes that information and sends the result back to the application. The processing time depends on the user's request (from a few seconds to a few minutes).

I use an infinite loop to check whether the result file exists (maybe there is a smarter approach; for example, I could estimate the maximum time a request can take and avoid the infinite loop).

The important part of the code looks like this:

import time
import mechanize

br = mechanize.Browser()
br.set_handle_refresh(False)
proxy_values = {'http': 'proxy:1234'}
br.set_proxies(proxy_values)

# Poll until the result file can be fetched.
while True:
    try:
        result = br.open('http://www.example.com/sample.txt').read()
        break
    except Exception:
        # File not available yet (e.g. HTTP 404): wait, then retry.
        time.sleep(10)

Behind a proxy the loop never ends, but if I change the code to something like this,

time.sleep(200)
result=br.open('http://www.example.com/sample.txt').read()

i.e. I wait long enough to ensure that the file has been created before trying to read it, I do get the file :-)

It seems that once mechanize asks for a file that does not exist, every later request from mechanize for that file also fails to get it...

I reproduced the same behavior using Firefox: if I ask for a non-existent file and then create that file (remember, I am the owner of the server...), I cannot get the file. And with both mechanize and Firefox I can still get files that have been deleted...

I think the problem is related to the proxy cache. I don't think I can clear that cache, but maybe there is some way to tell the proxy that it needs to recheck whether the file exists...

Any other suggestions to fix this problem?


1 Answer


The simplest solution is probably to add an (unused) GET parameter so that the request is not served from the cache.

i.e.:

i = 0
while True:
    try:
        # A different value of r on each attempt makes every URL unique to the cache.
        result = br.open('http://www.example.com/sample.txt?r=%d' % i).read()
        break
    except Exception:
        i += 1
    time.sleep(10)

The web application should ignore the extra parameter.
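A slightly more defensive variant of that loop, as a rough sketch only: it builds the extra parameter with the standard library (so an existing query string is preserved) and gives up after a deadline instead of looping forever, which is the bounded approach the question hints at. The names `cache_busting_url` and `poll_for_file`, the `r` parameter, and the `fetch` callable are all hypothetical; the code assumes Python 3.

```python
import time
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def cache_busting_url(url, token=None, param="r"):
    """Return url with an extra, unused query parameter appended."""
    if token is None:
        token = int(time.time() * 1000)  # millisecond timestamp
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((param, str(token)))
    return urlunsplit(parts._replace(query=urlencode(query)))


def poll_for_file(fetch, url, timeout=300, interval=10,
                  sleep=time.sleep, clock=time.monotonic):
    """Call fetch(unique_url) until it succeeds or timeout seconds pass.

    fetch is any callable that returns the body or raises on failure.
    The last error is re-raised once the deadline has passed.
    """
    deadline = clock() + timeout
    attempt = 0
    while True:
        # Timestamp plus attempt counter: unique within and across runs.
        token = "%d-%d" % (int(time.time()), attempt)
        try:
            return fetch(cache_busting_url(url, token=token))
        except Exception:
            attempt += 1
            if clock() >= deadline:
                raise
            sleep(interval)
```

With the browser object from the question, this could be called as `poll_for_file(lambda u: br.open(u).read(), 'http://www.example.com/sample.txt')`.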

HTTP HEAD is probably the right way to do this; see this question for an example.
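As a rough illustration of the HEAD idea (not the example from the linked question), the sketch below uses only the Python 3 standard library instead of mechanize. It also sends `Cache-Control: no-cache` and `Pragma: no-cache` request headers, which ask a cache to revalidate with the origin server; whether this particular proxy honors them is an assumption. The helper names `build_head_request` and `file_exists` are hypothetical.

```python
import urllib.error
import urllib.request


def build_head_request(url):
    """Build a HEAD request that asks caches to revalidate."""
    return urllib.request.Request(
        url,
        method="HEAD",
        headers={"Cache-Control": "no-cache", "Pragma": "no-cache"},
    )


def file_exists(url, opener=None, timeout=30):
    """Return True on HTTP 200, False on HTTP 404; re-raise other errors."""
    if opener is None:
        opener = urllib.request.build_opener()
    try:
        with opener.open(build_head_request(url), timeout=timeout) as resp:
            return resp.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise
```

To go through the proxy from the question, an opener could be built with `urllib.request.build_opener(urllib.request.ProxyHandler({'http': 'proxy:1234'}))` and passed as `opener`. mechanize also exposes an `addheaders` list on the browser object, where the same two headers could presumably be appended.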

Answered on 2011-09-12T18:33:37.337