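Is there a way to make C3P0 (or DBCP) automatically retry queries that fail because of network problems? The code works fine, but occasionally the following exception is thrown. It is not tied to any particular query: it happens across all kinds of queries, even a simple row count that runs every few seconds, so the query clearly works and then suddenly fails: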
com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
at sun.reflect.GeneratedConstructorAccessor89.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:525)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:1117)
at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3567)
at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3456)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3997)
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2468)
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2629)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2719)
at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2155)
at com.mysql.jdbc.PreparedStatement.execute(PreparedStatement.java:1379)
at com.mchange.v2.c3p0.impl.NewProxyPreparedStatement.execute(NewProxyPreparedStatement.java:989)
at com.outboundengine.database.ClickstreamIdentityDbo.insertIgnore(ClickstreamIdentityDbo.java:65)
... 22 more
Caused by: java.net.SocketException: Connection timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:150)
at java.net.SocketInputStream.read(SocketInputStream.java:121)
at com.mysql.jdbc.util.ReadAheadInputStream.fill(ReadAheadInputStream.java:114)
at com.mysql.jdbc.util.ReadAheadInputStream.readFromUnderlyingStreamIfNecessary(ReadAheadInputStream.java:161)
at com.mysql.jdbc.util.ReadAheadInputStream.read(ReadAheadInputStream.java:189)
at com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:3014)
at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3467)
... 31 more
I started running traceroute between the app server and the database server once per second, and got a bunch of traces like this:
Wed Apr 17 15:52:41 UTC 2013
traceroute to XXX.XXXXXXXXXX.com (XXX.XXX.XXX.XXX), 30 hops max, 60 byte packets
1 ip-XXX-XXX-XXX-XXX.ec2.internal (XXX.XXX.XXX.XXX) 0.352 ms 0.443 ms 0.418 ms
2 ip-XXX-XXX-XXX-XXX.ec2.internal (XXX.XXX.XXX.XXX) 0.588 ms ip-XXX-XXX-XXX-XXX.ec2.internal (XXX.XXX.XXX.XXX) 0.625 ms ip-XXX-XXX-XXX-XXX.ec2.internal (XXX.XXX.XXX.XXX) 0.543 ms
3 ip-XXX-XXX-XXX-XXX.ec2.internal (XXX.XXX.XXX.XXX) 0.867 ms ip-XXX-XXX-XXX-XXX.ec2.internal (XXX.XXX.XXX.XXX) 33.166 ms 33.505 ms
4 ip-XXX-XXX-XXX-XXX.ec2.internal (XXX.XXX.XXX.XXX) 1.554 ms ip-XXX-XXX-XXX-XXX.ec2.internal (XXX.XXX.XXX.XXX) 0.435 ms ip-XXX-XXX-XXX-XXX.ec2.internal (XXX.XXX.XXX.XXX) 0.756 ms
5 ip-XXX-XXX-XXX-XXX.ec2.internal (XXX.XXX.XXX.XXX) 0.830 ms ip-XXX-XXX-XXX-XXX.ec2.internal (XXX.XXX.XXX.XXX) 0.803 ms ip-XXX-XXX-XXX-202.ec2.internal (XXX.XXX.XXX.XXX) 1.107 ms
6 * ip-XXX-XXX-XXX-XXX.ec2.internal (XXX.XXX.XXX.XXX) 1.449 ms *
7 ip-XXX-XXX-XXX-XXX.ec2.internal (XXX.XXX.XXX.XXX) 0.715 ms ip-XXX-XXX-XXX-202.ec2.internal (XXX.XXX.XXX.XXX) 0.973 ms 1.240 ms
8 * * *
9 * * *
10 * * *
11 * * *
12 * * ip-XXX-XXX-XXX-XXX.ec2.internal (XXX.XXX.XXX.XXX) 0.445 ms
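... (there are more of these during the "outage" windows, which last 10-60 seconds, yet the connection takes 10 minutes to time out).
The C3P0 configuration is as follows: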
<Resource name="jdbc/somename"
auth="Container"
factory="org.apache.naming.factory.BeanFactory"
type="com.mchange.v2.c3p0.ComboPooledDataSource"
driverClass="com.mysql.jdbc.Driver"
maxPoolSize="100"
minPoolSize="10"
acquireIncrement="10"
automaticTestTable="testtable"
testConnectionOnCheckin="true"
idleConnectionTestPeriod="600"
user="XXX"
password="XXX"
jdbcUrl="jdbc:mysql://XXXXXXXXXX:3306/dbname?autoReconnect=true"
/>
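During the periods when the exceptions occur, each traceroute shows its * entries at different hops.
This is running on Amazon EC2 (in the Virginia location, which is notorious for the high packet loss that causes these timeouts). The code is plain JDBC queries backed by a C3P0 DB connection pool, and it runs perfectly when there are no network problems. C3P0 does re-establish its connections after the timeout, but the timeout takes about 10 minutes. Once the network stabilizes everything returns to normal, but we hit these outages at least once a day, and the queries issued during them are lost and have to be retried.
Is there a setting in the database connection pool that would save me from changing every query in the code, or do I have to write a simple loop that retries a failed query a few times before giving up? Has anyone else run into these problems on EC2 and found a solution?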
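For reference, the "simple loop" I have in mind would look roughly like the sketch below. All names in it (JdbcRetry, SqlWork, withRetry, isRetryable) are hypothetical and not part of C3P0, DBCP or Connector/J. It assumes Java 8+ for the lambda-friendly interface, and it assumes that a communications link failure surfaces as a SQLException whose SQLState starts with "08" (the standard connection-exception class), which I believe is the case for the exception above but have not verified for every driver version.

```java
import java.sql.SQLException;
import java.sql.SQLRecoverableException;
import java.sql.SQLTransientException;

/** Hypothetical helper: retries a unit of JDBC work on connection-level failures. */
final class JdbcRetry {

    /** One unit of work; it should obtain a fresh connection from the pool on every call. */
    @FunctionalInterface
    public interface SqlWork<T> {
        T run() throws SQLException;
    }

    private JdbcRetry() {
    }

    /**
     * Runs the work up to maxAttempts times, sleeping backoffMillis * attempt
     * between tries. Failures that do not look connection-related are rethrown
     * immediately; the last retryable failure is rethrown once attempts run out.
     */
    public static <T> T withRetry(int maxAttempts, long backoffMillis, SqlWork<T> work)
            throws SQLException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        SQLException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return work.run();
            } catch (SQLException e) {
                if (!isRetryable(e)) {
                    throw e; // e.g. syntax errors or constraint violations
                }
                last = e;
                if (attempt < maxAttempts) {
                    try {
                        Thread.sleep(backoffMillis * attempt); // linear backoff
                    } catch (InterruptedException ie) {
                        Thread.currentThread().interrupt();
                        throw new SQLException("Interrupted while waiting to retry", ie);
                    }
                }
            }
        }
        throw last;
    }

    /** Assumption: SQLState class "08" or a transient/recoverable subclass means "worth retrying". */
    public static boolean isRetryable(SQLException e) {
        String state = e.getSQLState();
        return e instanceof SQLRecoverableException
                || e instanceof SQLTransientException
                || (state != null && state.startsWith("08"));
    }
}
```

Each query would then be wrapped as something like JdbcRetry.withRetry(3, 500L, () -> dbo.insertIgnore(row)), where dbo and row stand in for my own objects. The obvious downsides are that every call site still has to change, that the work must check out a new pooled connection on each attempt, and that blindly retrying a write is only safe when the statement is idempotent, which is why I would prefer a pool-level setting if one exists.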