python - Download link with Google App Engine

Question

I have a static website on Google App Engine that offers some .rar files for download. Right now they are served by a static file handler defined in app.yaml:

handlers:
- url: /(.*\.(bz2|gz|rar|tar|tgz|zip))
  static_files: static/\1
  upload: static/(.*\.(bz2|gz|rar|tar|tgz|zip))

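What I would like to do now is provide a download link like /download?MyFile.rar, so that I can count downloads and see who is hotlinking.

I don't want to block hotlinking as long as the other site uses this URL (the real path would be hidden/unavailable). That way I can count downloads even when they come from outside (which Google Analytics or Clicky obviously can't handle, and logs are only kept for about 90 days, which is inconvenient for this purpose).

The question is: how do I write a Python handler that starts a file download for the user? Like we see on many PHP/ASP sites.

After a lot of searching and reading these two threads (How do I let Google App Engine have a download link that downloads something from a database?, google app engine download a file containing files), it seems I can have something like: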

self.response.headers['Content-Type'] = 'application/octet-stream'
self.response.out.write(filecontent) # how do I get that content?
#or
self.response.headers["Content-Type"] = "application/zip"
self.response.headers['Content-Disposition'] = "attachment; filename=MyFile.rar" # does that work? how do I get the actual path?

I did read that a handler can only run for a limited time, so it might not work for large files?

Any guidance would be much appreciated!

Thanks.

Romz

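Edit: Got it working. I now have one handler that handles all .rar files. It gives me URLs that look like direct links (example.com/File.rar) but are actually handled in Python (so I can check the referer, count downloads, and so on).

The files are actually located in a different subfolder and are protected from real direct downloads by the way the path is generated. I don't know whether I should filter out other characters (besides '/' and '\'), but this way nobody should be able to reach any other file in a parent folder or anywhere else.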
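On the question of which characters to filter: rather than stripping individual characters, one option is to keep only the last path component and then confirm that the resolved path still sits inside the download folder. The following is a minimal sketch of that idea, not part of the handler above; `resolve_download` and its parameters are hypothetical names, and it uses only the standard library.

```python
import os


def resolve_download(base_dir, requested_path):
    """Return the absolute path of the requested file inside base_dir, or None.

    Hypothetical helper: base_dir is the folder holding the downloads,
    requested_path is the path part of the incoming URL.
    """
    # Treat backslashes like slashes, then keep only the final component,
    # so 'sub/../MyFile.rar' is reduced to 'MyFile.rar'.
    name = os.path.basename(requested_path.replace('\\', '/'))
    if not name or name in ('.', '..'):
        return None

    base = os.path.realpath(base_dir)
    candidate = os.path.realpath(os.path.join(base, name))

    # Reject anything that resolves outside base_dir (for example via a symlink).
    if not candidate.startswith(base + os.sep):
        return None

    return candidate if os.path.isfile(candidate) else None
```

A handler could call `resolve_download(files_dir, url_path)` and return "file not found" whenever the result is `None`.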

I don't really know what all this means for my quota and file size limits, though.

app.yaml

handlers:
- url: /(.*\.rar)
  script: main.app

main.py

from google.appengine.ext import webapp
from google.appengine.api import memcache
from google.appengine.ext import db
import os, urlparse

class GeneralCounterShard(db.Model):
    name = db.StringProperty(required=True)
    count = db.IntegerProperty(required=True, default=0)

def CounterIncrement(name):
    def txn():
        counter = GeneralCounterShard.get_by_key_name(name)
        if counter is None:
            counter = GeneralCounterShard(key_name=name, name=name)
        counter.count += 1
        counter.put()
    db.run_in_transaction(txn)
    memcache.incr(name) # does nothing if the key does not exist

class MainPage(webapp.RequestHandler):
    def get(self):
        # Requests arriving with a Referer from another site go to the home page.
        referer = self.request.headers.get("Referer")
        if referer and not referer.startswith("http://www.example.com/"):
            self.redirect('http://www.example.com')
            return

        # Strip slashes and backslashes so the name cannot point outside 'files/'.
        path = urlparse.urlparse(self.request.url).path.replace('/', '').replace('\\', '')
        fullpath = os.path.join(os.path.dirname(__file__), 'files', path)
        if os.path.exists(fullpath):
            CounterIncrement(path)
            self.response.headers['Content-Type'] = 'application/zip'
            self.response.headers["Content-Disposition"] = 'attachment; filename=' + path
            # Read the whole file and close it again once it has been written out.
            with open(fullpath, 'rb') as f:
                self.response.out.write(f.read())
        else:
            self.response.out.write('<br>The file does not exist<br>')


app = webapp.WSGIApplication([('/.*', MainPage)], debug=False)

score 0 · Accepted Answer

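You could store the file contents in the blobstore and serve them from there, but if the file is large and the client is slow, you will hit the time limit (about 30 seconds).

Another option is to have a simple handler that counts the download and then issues a temporary redirect (HTTP 302) to the real download link. This lets you serve large files, but the real file can still be hotlinked instead of the handler URL.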
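The count-then-redirect idea can be sketched as a plain WSGI callable. This is only an illustration under stated assumptions: it uses the standard library instead of webapp so that it is self-contained, the in-memory `download_counts` stands in for a datastore counter (on App Engine an in-memory count would be lost between instances), and `STATIC_PREFIX` is a hypothetical location for the real static files. It is written for Python 3; on Python 2, `quote` and `unquote` live in `urllib`.

```python
from collections import Counter
import posixpath
from urllib.parse import quote, unquote

# Stand-in for a persistent counter; real code would write to the datastore.
download_counts = Counter()

# Hypothetical URL prefix under which the static file handler serves the files.
STATIC_PREFIX = '/static-downloads/'


def download_app(environ, start_response):
    """Handle /download?MyFile.rar: count the request, then redirect with 302."""
    # The whole query string is the file name; keep only its last path component.
    name = posixpath.basename(unquote(environ.get('QUERY_STRING', '')))
    if not name:
        start_response('400 Bad Request', [('Content-Type', 'text/plain')])
        return [b'Missing file name']

    download_counts[name] += 1
    # Temporary redirect to the real file, which the static handler serves.
    start_response('302 Found', [('Location', STATIC_PREFIX + quote(name))])
    return [b'']
```

Because the file itself is served by the static handler, the Python request finishes quickly regardless of file size; the trade-off, as noted above, is that the real URL is visible after the redirect.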

score 0 · Accepted Answer

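You could try using self.request.referer.

Here is how you could do it. Have a "click here" download link that points to your file download page. Then have a FileDownloadHandler that receives the name/id or whatever as a parameter. In this handler, check whether the referer is the "download page", so you know whether the request is a valid download. If it is, serve the file; if not, redirect or return an error.

Just an idea.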
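A referer check along these lines might look like the sketch below. It compares the host name of the Referer header against an allowed set rather than matching a URL prefix, so both http and https links from the same site pass. `ALLOWED_HOSTS`, `is_allowed_referer`, and the `allow_missing` flag are hypothetical names chosen for illustration. Keep in mind that the Referer header is supplied by the client, so it can be absent or forged. The import is for Python 3; on Python 2 the function lives in the `urlparse` module.

```python
from urllib.parse import urlparse

# Hypothetical set of hosts whose pages may link to the download handler.
ALLOWED_HOSTS = frozenset(['www.example.com'])


def is_allowed_referer(referer, allowed_hosts=ALLOWED_HOSTS, allow_missing=True):
    """Return True if the request should be treated as a valid download.

    referer is the raw Referer header value, or None when it was not sent.
    allow_missing decides what happens when no Referer is present at all,
    which is common for direct visits and privacy-conscious browsers.
    """
    if not referer:
        return allow_missing
    # urlparse lowercases the host name, so the comparison is case-insensitive.
    return urlparse(referer).hostname in allowed_hosts
```

In a handler, the result would decide between serving the file and redirecting to the download page.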
