4

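I can find some documentation explaining how to use the tqdm package, but from it I can't figure out how to produce a progress meter when downloading data online.

Below is the example code I copied from ResidentMario for downloading data: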

import requests

def download_file(url, filename):
    """
    Helper method handling downloading large files from `url` to `filename`. Returns a pointer to `filename`.
    """
    r = requests.get(url, stream=True)
    with open(filename, 'wb') as f:
        for chunk in r.iter_content(chunk_size=1024):
            if chunk:  # filter out keep-alive new chunks
                f.write(chunk)
    return filename


dat = download_file("https://data.cityofnewyork.us/api/views/h9gi-nx95/rows.csv?accessType=DOWNLOAD",
                    "NYPD Motor Vehicle Collisions.csv")

Can anyone tell me how to use the tqdm package here to show the download progress?

Thanks


3 Answers

7

As of now, I do something like this:

import requests
from tqdm import tqdm

def download_file(url, filename):
    """
    Helper method handling downloading large files from `url` to `filename`. Returns a pointer to `filename`.
    """
    chunk_size = 1024
    r = requests.get(url, stream=True)
    # Total size in bytes, as reported by the server
    total = int(r.headers['Content-Length'])
    # The context manager closes the bar once the download finishes
    with open(filename, 'wb') as f, tqdm(unit="B", total=total) as pbar:
        for chunk in r.iter_content(chunk_size=chunk_size):
            if chunk:  # filter out keep-alive new chunks
                pbar.update(len(chunk))
                f.write(chunk)
    return filename
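
One caveat with indexing `r.headers['Content-Length']`: servers are not required to send that header (for example when they use chunked transfer encoding), and a missing key raises `KeyError`. Below is a minimal sketch of a fallback. The helper name `total_from_headers` is hypothetical and not part of requests or tqdm; it returns `None` when the size is unknown.

```python
from typing import Mapping, Optional


def total_from_headers(headers: Mapping[str, str]) -> Optional[int]:
    """Return the body size in bytes, or None if no usable size was reported."""
    value = headers.get("Content-Length")
    if value is None:
        return None  # header absent, e.g. chunked transfer encoding
    try:
        size = int(value)
    except ValueError:
        return None  # header present but not a number
    return size if size >= 0 else None
```

With requests this could be used as `tqdm(unit="B", total=total_from_headers(r.headers))`. When `total` is `None`, tqdm still counts bytes but shows no percentage or remaining-time estimate.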
Answered 2017-02-06T15:28:26.107
1

pbar.clear() and pbar.close()

Updating the progress bar manually is useful for streams, such as when reading files. https://github.com/tqdm/tqdm#returns

import requests
from tqdm import tqdm

def download_file(url, filename):
    """
    Helper method handling downloading large files from `url` to `filename`. Returns a pointer to `filename`.
    """
    r = requests.get(url, stream=True)

    with open(filename, 'wb') as f:
        # unit_scale scales the byte count automatically, using a divisor of 1024
        pbar = tqdm(unit="B", unit_scale=True, unit_divisor=1024, total=int(r.headers['Content-Length']))
        pbar.clear()  # clear 0% info
        for chunk in r.iter_content(chunk_size=1024):
            if chunk:  # filter out keep-alive new chunks
                pbar.update(len(chunk))
                f.write(chunk)
        pbar.close()
    return filename
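
The manual-update pattern can also be separated from the network call, so that it works for any stream of byte chunks and the bar is closed even if the transfer fails midway. The following is only a sketch under assumptions: `copy_with_progress` is a hypothetical helper name, and `pbar` is assumed to be a `tqdm` instance or any other object exposing `update(n)` and `close()`.

```python
from typing import BinaryIO, Iterable


def copy_with_progress(chunks: Iterable[bytes], out: BinaryIO, pbar) -> int:
    """Write every non-empty chunk to `out`, advancing `pbar` by its size in bytes.

    Returns the number of bytes written.
    """
    written = 0
    try:
        for chunk in chunks:
            if chunk:  # skip empty keep-alive chunks
                out.write(chunk)
                pbar.update(len(chunk))
                written += len(chunk)
    finally:
        pbar.close()  # always release the bar, even on error
    return written


def download_file(url: str, filename: str) -> str:
    # Third-party imports are kept local so the helper above has no
    # dependency on them (an illustrative choice, not a requirement).
    import requests
    from tqdm import tqdm

    r = requests.get(url, stream=True)
    pbar = tqdm(unit="B", unit_scale=True, unit_divisor=1024,
                total=int(r.headers["Content-Length"]))
    with open(filename, "wb") as f:
        copy_with_progress(r.iter_content(chunk_size=1024), f, pbar)
    return filename
```

Because the helper only needs an iterable of bytes, it can be exercised with an in-memory list and an `io.BytesIO` buffer, without touching the network.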
Answered 2021-12-18T04:05:29.803
-2

Thanks to silmaril, but the version below works and makes more sense to me.

import math

import requests
from tqdm import tqdm

def download_file(url, filename):
    r = requests.get(url, stream=True)
    filelength = int(r.headers['Content-Length'])

    with open(filename, 'wb') as f:
        # Count chunks rather than bytes; round up so a final partial chunk is included
        pbar = tqdm(total=math.ceil(filelength / 1024))
        for chunk in r.iter_content(chunk_size=1024):
            if chunk:  # filter out keep-alive new chunks
                pbar.update()
                f.write(chunk)
        pbar.close()
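
When the bar counts chunks instead of bytes, the total has to be rounded up, because a final partial chunk still triggers one `update()`. Here is a small sketch of that arithmetic. It uses a hypothetical helper `chunk_count` and assumes every chunk except the last is exactly `chunk_size` bytes, which may not hold for every response (for example compressed ones).

```python
def chunk_count(total_bytes: int, chunk_size: int = 1024) -> int:
    """Number of chunks needed for `total_bytes`, counting a final partial chunk."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    # Integer ceiling division avoids floating-point rounding on very large sizes.
    return (total_bytes + chunk_size - 1) // chunk_size


# A 2500-byte body splits into 1024 + 1024 + 452 bytes: three updates, not two.
print(chunk_count(2500))   # 3
print(int(2500 / 1024))    # 2 (truncation drops the partial chunk)
```

Counting bytes with `pbar.update(len(chunk))` avoids this assumption altogether, since the bar then advances by whatever size each chunk actually has.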
Answered 2017-11-03T20:29:58.147