python - What are the connection limits for Google Cloud SQL from App Engine, and how to best reuse DB connections?

Question

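I have a Google App Engine app that uses a Google Cloud SQL instance for storing data. I need my instance to serve hundreds of clients at a time via RESTful calls, each of which results in one or a handful of DB queries. I've wrapped the methods that need DB access and store the handle to the DB connection in os.environ. See this SO question/answer for basically how I'm doing it.

However, as soon as a couple hundred clients connect to my app and trigger database calls, I start getting these errors in the Google App Engine error logs (and my app returns 500, of course):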

could not connect: ApplicationError: 1033 Instance has too many concurrent requests: 100 Traceback (most recent call last): File "/base/python27_run

Any tips from experienced users of Google App Engine and Google Cloud SQL? Thanks in advance.

Here's the code for the decorator I use around methods that require a DB connection:

import os


def with_db_cursor(do_commit=False):
    """ Decorator for managing DB connection by wrapping around web calls.
    Stores connections and open connection count in the os.environ dictionary
    between calls.  Sets a cursor variable in the wrapped function. Optionally
    does a commit.  Closes the cursor when wrapped method returns, and closes
    the DB connection if there are no outstanding cursors.

    If the wrapped method has a keyword argument 'existing_cursor', whose value
    is non-False, this wrapper is bypassed, as it is assumed another cursor is
    already in force because of an alternate call stack.

    Based mostly on post by : Shay Erlichmen
    At: https://stackoverflow.com/a/10162674/379037
    """

    def method_wrap(method):
        def wrap(*args, **kwargs):
            if kwargs.get('existing_cursor', False):
                #Bypass everything if method called with existing open cursor
                vdbg('Shortcircuiting db wrapper due to existing_cursor')
                return method(None, *args, **kwargs)

            conn = os.environ.get("__data_conn")

            # Recycling connection for the current request
            # For some reason threading.local() didn't work
            # and yes os.environ is supposed to be thread safe 
            if not conn:                    
                conn = _db_connect()
                os.environ["__data_conn"] = conn
                os.environ["__data_conn_ref"] = 1
                dbg('Opening first DB connection via wrapper.')
            else:
                os.environ["__data_conn_ref"] = (os.environ["__data_conn_ref"] + 1)
                vdbg('Reusing existing DB connection. Count using is now: {0}',
                    os.environ["__data_conn_ref"])        
            try:
                cursor = conn.cursor()
                try:
                    result = method(cursor, *args, **kwargs)
                    if do_commit or os.environ.get("__data_conn_commit"):
                        os.environ["__data_conn_commit"] = False
                        dbg('Wrapper executing DB commit.')
                        conn.commit()
                    return result                        
                finally:
                    cursor.close()                    
            finally:
                os.environ["__data_conn_ref"] = os.environ["__data_conn_ref"] - 1
                vdbg('One less user of DB connection. Count using is now: {0}',
                    os.environ["__data_conn_ref"])
                if os.environ["__data_conn_ref"] == 0:
                    dbg("No more users of this DB connection. Closing.")
                    os.environ["__data_conn"] = None
                    db_close(conn)
        return wrap
    return method_wrap

def db_close(db_conn):
    if db_conn:
        try:
            db_conn.close()
        except Exception:
            # Log the failure, then re-raise so the caller still sees it.
            err('Unable to close the DB connection.')
            raise
    else:
        err('Tried to close a non-connected DB handle.')

score 15 · Accepted Answer

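Short answer: Your queries are probably too slow, and the MySQL server doesn't have enough threads to process all of the requests you are trying to send it.

Long answer:

As background, Cloud SQL has two limits that are relevant here: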

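Connections: These correspond to the 'conn' object in your code. There is a corresponding data structure on the server. Once you have too many of these objects (currently configured to 1000), the least recently used one is automatically closed. When a connection gets closed underneath you, you'll get an unknown connection error (ApplicationError: 1007) the next time you try to use that connection.
Concurrent requests: These are queries that are executing on the server. Each executing query ties up a thread in the server, so there is a limit of 100. When there are too many concurrent requests, subsequent requests are rejected with the error you are getting (ApplicationError: 1033).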

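It doesn't sound like the connection limit is affecting you, but I wanted to mention it just in case.

When it comes to concurrent requests, increasing the limit might help, but it usually makes the problem worse. There are two cases we've seen in the past: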

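Deadlock: A long-running query is locking a critical row of the database. All subsequent queries block on that lock. The app times out on those queries, but they keep running on the server, tying up those threads until the deadlock timeout triggers.
Slow queries: Each query is really, really slow. This usually happens when the query requires a temporary file sort. The application times out and retries the query while the first attempt is still running and counting against the concurrent request limit. If you can find your average query time, you can estimate how many QPS your MySQL instance can support (e.g. 5 ms per query means 200 QPS per thread; since there are 100 threads, you could do 20,000 QPS. 50 ms per query means 2,000 QPS.)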
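The QPS estimate above is simple arithmetic, and a small sketch makes it easy to plug in your own numbers. The function name and the default of 100 threads are illustrative assumptions taken from the limit quoted in this answer; the average query time is something you would have to measure yourself.

```python
def estimated_max_qps(avg_query_ms, threads=100):
    """Rough upper bound on queries per second.

    threads=100 mirrors the concurrent request limit quoted in the answer;
    avg_query_ms is a measured average, not something this code can know.
    """
    if avg_query_ms <= 0:
        raise ValueError("avg_query_ms must be positive")
    per_thread_qps = 1000.0 / avg_query_ms  # one thread runs queries back to back
    return per_thread_qps * threads


print(estimated_max_qps(5))   # 5 ms per query
print(estimated_max_qps(50))  # 50 ms per query
```

This is a ceiling, not a prediction: it assumes every thread is always busy and ignores lock waits and retries, which are exactly the cases described above.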

You should use EXPLAIN and SHOW ENGINE INNODB STATUS to see which of the two problems is going on.
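As a rough illustration of how those two statements could be run from Python, here is a minimal sketch that works with any DB-API cursor. The helper names are hypothetical, and the code assumes the result sets expose an "Extra" column (for EXPLAIN) and a "Status" column (for SHOW ENGINE INNODB STATUS), which is what MySQL normally returns.

```python
def explain_warnings(cursor, query, params=None):
    """Run EXPLAIN on a query and return the Extra notes hinting at a slow plan."""
    sql = "EXPLAIN " + query
    if params is None:
        cursor.execute(sql)
    else:
        cursor.execute(sql, params)
    # Locate the Extra column by name rather than by position.
    columns = [col[0].lower() for col in cursor.description]
    extra_index = columns.index("extra")
    flagged = []
    for row in cursor.fetchall():
        extra = row[extra_index] or ""
        if "Using filesort" in extra or "Using temporary" in extra:
            flagged.append(extra)
    return flagged


def innodb_status_text(cursor):
    """Return the raw InnoDB status report, which includes lock and deadlock details."""
    cursor.execute("SHOW ENGINE INNODB STATUS")
    columns = [col[0].lower() for col in cursor.description]
    row = cursor.fetchone()
    return row[columns.index("status")]
```

With a cursor from the decorator in the question, something like explain_warnings(cursor, "SELECT ... ORDER BY ...") would flag plans that sort through temporary files, while the text from innodb_status_text(cursor) can be searched for lock waits and the most recent deadlock.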

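Of course, it is also possible that you are simply driving a ton of traffic at your instance and there just aren't enough threads. In that case, you'll probably be maxing out the instance's CPU anyway, so adding more threads won't help.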

score 5 · Accepted Answer

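I read through the documentation and noticed there is a limit of 12 connections per instance:

Look for "Each App Engine instance cannot have more than 12 concurrent connections to a Google Cloud SQL instance." at https://developers.google.com/appengine/docs/python/cloud-sql/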
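One way to stay under a per-instance cap like this while still reusing connections is a small bounded pool. The following is only a sketch under stated assumptions: the class name, the connect callable, and the use of threads within one instance are all hypothetical and not part of the App Engine or Cloud SQL APIs, and the default of 12 simply mirrors the figure quoted above.

```python
import contextlib
import queue


class BoundedConnectionPool(object):
    """Hands out at most max_connections connections, reusing idle ones."""

    def __init__(self, connect, max_connections=12):
        self._connect = connect  # any callable returning a DB-API connection
        self._slots = queue.LifoQueue(maxsize=max_connections)
        for _ in range(max_connections):
            self._slots.put(None)  # None marks a slot with no connection yet

    @contextlib.contextmanager
    def connection(self, timeout=None):
        # Blocks while every slot is in use; raises queue.Empty on timeout.
        conn = self._slots.get(timeout=timeout)
        try:
            if conn is None:
                conn = self._connect()  # open lazily, on first use of the slot
            yield conn
        finally:
            # Return the slot (still None if connecting failed) for reuse.
            self._slots.put(conn)
```

A handler would then use "with pool.connection() as conn:" around its queries. A real pool would also need to replace connections the server has already closed (the ApplicationError: 1007 case from the other answer), which this sketch does not attempt.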
