4

我们的生产服务器(apache-tomcat-7.0.6)遇到了这个问题,它使用 MySQL 作为 DB 运行 Spring-JPA-Hibernate 应用程序。在此问题期间,服务器变慢,并且在一分钟内它变得无响应,活动的 tomcat 线程数达到 200 个(我们使用 javamelody 来观察这些事情)。日志看起来像:

2012-04-07 07:53:48,058 DEBUG AuthenticationHandler:("http-bio-7006"-exec-304):35 - Authentication handler : ~
2012-04-07 07:53:48,059 DEBUG AuthenticationHandler:("http-bio-7006"-exec-304):45 - Service Id : 1333765428059-/xxx-framework-service/platform/items/search/
2012-04-07 07:53:48,060 DEBUG DefaultListableBeanFactory:("http-bio-7006"-exec-304):242 - Returning cached instance of singleton bean 'transactionManager'
2012-04-07 07:53:48,060 DEBUG JpaTransactionManager:("http-bio-7006"-exec-304):365 - Creating new transaction with name [com.xxx.framework.manager.ItemManager.search]: PROPAGATION_REQUIRED,ISOLATION_DEFAULT,readOnly; ''
2012-04-07 07:53:48,060 DEBUG JpaTransactionManager:("http-bio-7006"-exec-304):323 - Opened new EntityManager [org.hibernate.ejb.EntityManagerImpl@6aee43a9] for JPA transaction
2012-04-07 07:53:53,583 DEBUG AuthenticationHandler:("http-bio-7006"-exec-217):35 - Authentication handler : System~System
2012-04-07 07:53:53,584 DEBUG AuthenticationHandler:("http-bio-7006"-exec-217):45 - Service Id : 1333765433584-/xxx-framework-service/platform/skus/get_sku_details
2012-04-07 07:53:53,585 DEBUG DefaultListableBeanFactory:("http-bio-7006"-exec-217):242 - Returning cached instance of singleton bean 'transactionManager'
2012-04-07 07:53:53,586 DEBUG JpaTransactionManager:("http-bio-7006"-exec-217):365 - Creating new transaction with name [com.xxx.framework.manager.SkuServiceManager.getSkuDetails]: PROPAGATION_REQUIRED,ISOLATION_DEFAULT; ''
2012-04-07 07:53:53,586 DEBUG JpaTransactionManager:("http-bio-7006"-exec-217):323 - Opened new EntityManager [org.hibernate.ejb.EntityManagerImpl@7cb3b78] for JPA transaction
2012-04-07 07:54:04,297 DEBUG AuthenticationHandler:("http-bio-7006"-exec-311):35 - Authentication handler : System~System
2012-04-07 07:54:04,298 DEBUG AuthenticationHandler:("http-bio-7006"-exec-311):45 - Service Id : 1333765444298-/xxx-framework-service/platform/skus/get_sku_details
2012-04-07 07:54:04,299 DEBUG DefaultListableBeanFactory:("http-bio-7006"-exec-311):242 - Returning cached instance of singleton bean 'transactionManager'
2012-04-07 07:54:04,299 DEBUG JpaTransactionManager:("http-bio-7006"-exec-311):365 - Creating new transaction with name [com.xxx.framework.manager.SkuServiceManager.getSkuDetails]: PROPAGATION_REQUIRED,ISOLATION_DEFAULT; ''
2012-04-07 07:54:04,299 DEBUG JpaTransactionManager:("http-bio-7006"-exec-311):323 - Opened new EntityManager [org.hibernate.ejb.EntityManagerImpl@d8c7ec2] for JPA transaction
2012-04-07 07:54:05,713 DEBUG AuthenticationHandler:("http-bio-7006"-exec-252):35 - Authentication handler : System~System
2012-04-07 07:54:05,714 DEBUG AuthenticationHandler:("http-bio-7006"-exec-252):45 - Service Id : 1333765445714-/xxx-framework-service/platform/skus/get_sku_details
2012-04-07 07:54:05,715 DEBUG DefaultListableBeanFactory:("http-bio-7006"-exec-252):242 - Returning cached instance of singleton bean 'transactionManager'
2012-04-07 07:54:05,715 DEBUG JpaTransactionManager:("http-bio-7006"-exec-252):365 - Creating new transaction with name [com.xxx.framework.manager.SkuServiceManager.getSkuDetails]: PROPAGATION_REQUIRED,ISOLATION_DEFAULT; ''
2012-04-07 07:54:05,715 DEBUG JpaTransactionManager:("http-bio-7006"-exec-252):323 - Opened new EntityManager [org.hibernate.ejb.EntityManagerImpl@685eb738] for JPA transaction
2012-04-07 07:54:16,293 DEBUG AuthenticationHandler:("http-bio-7006"-exec-282):35 - Authentication handler : System~System
2012-04-07 07:54:16,294 DEBUG AuthenticationHandler:("http-bio-7006"-exec-282):45 - Service Id : 1333765456294-/xxx-framework-service/platform/skus/get_sku_details
2012-04-07 07:54:16,295 DEBUG DefaultListableBeanFactory:("http-bio-7006"-exec-282):242 - Returning cached instance of singleton bean 'transactionManager'
2012-04-07 07:54:16,295 DEBUG JpaTransactionManager:("http-bio-7006"-exec-282):365 - Creating new transaction with name [com.xxx.framework.manager.SkuServiceManager.getSkuDetails]: PROPAGATION_REQUIRED,ISOLATION_DEFAULT; ''
2012-04-07 07:54:16,296 DEBUG JpaTransactionManager:("http-bio-7006"-exec-282):323 - Opened new EntityManager [org.hibernate.ejb.EntityManagerImpl@6d37b108] for JPA transaction
2012-04-07 07:54:16,541 DEBUG AuthenticationHandler:("http-bio-7006"-exec-302):35 - Authentication handler : Heartbeat User~dummy_heartbeat
2012-04-07 07:54:16,542 DEBUG AuthenticationHandler:("http-bio-7006"-exec-302):45 - Service Id : 1333765456542-/xxx-framework-service/platform/skus/5500
2012-04-07 07:54:16,543 DEBUG DefaultListableBeanFactory:("http-bio-7006"-exec-302):242 - Returning cached instance of singleton bean 'transactionManager'
2012-04-07 07:54:16,543 DEBUG JpaTransactionManager:("http-bio-7006"-exec-302):365 - Creating new transaction with name [com.xxx.framework.manager.SkuServiceManager.findById]: PROPAGATION_REQUIRED,ISOLATION_DEFAULT,readOnly; ''
2012-04-07 07:54:16,543 DEBUG JpaTransactionManager:("http-bio-7006"-exec-302):323 - Opened new EntityManager [org.hibernate.ejb.EntityManagerImpl@7a973a0f] for JPA transaction
2012-04-07 07:54:19,708 DEBUG AuthenticationHandler:("http-bio-7006"-exec-236):35 - Authentication handler : System~System
2012-04-07 07:54:19,709 DEBUG AuthenticationHandler:("http-bio-7006"-exec-236):45 - Service Id : 1333765459709-/xxx-framework-service/platform/skus/get_sku_details
2012-04-07 07:54:19,710 DEBUG DefaultListableBeanFactory:("http-bio-7006"-exec-236):242 - Returning cached instance of singleton bean 'transactionManager'
2012-04-07 07:54:19,710 DEBUG JpaTransactionManager:("http-bio-7006"-exec-236):365 - Creating new transaction with name [com.xxx.framework.manager.SkuServiceManager.getSkuDetails]: PROPAGATION_REQUIRED,ISOLATION_DEFAULT; ''
2012-04-07 07:54:19,711 DEBUG JpaTransactionManager:("http-bio-7006"-exec-236):323 - Opened new EntityManager [org.hibernate.ejb.EntityManagerImpl@342f22a9] for JPA transaction
2012-04-07 07:54:22,648 DEBUG AuthenticationHandler:("http-bio-7006"-exec-292):35 - Authentication handler : System~System
2012-04-07 07:54:22,649 DEBUG AuthenticationHandler:("http-bio-7006"-exec-292):45 - Service Id : 1333765462649-/xxx-framework-service/platform/skus/get_avaliable_count
2012-04-07 07:54:22,650 DEBUG DefaultListableBeanFactory:("http-bio-7006"-exec-292):242 - Returning cached instance of singleton bean 'transactionManager'
2012-04-07 07:54:22,650 DEBUG JpaTransactionManager:("http-bio-7006"-exec-292):365 - Creating new transaction with name [com.xxx.framework.manager.SkuServiceManager.getAvaliableCountOpt]: PROPAGATION_REQUIRED,ISOLATION_DEFAULT; ''
2012-04-07 07:54:22,650 DEBUG JpaTransactionManager:("http-bio-7006"-exec-292):323 - Opened new EntityManager [org.hibernate.ejb.EntityManagerImpl@5e5d2643] for JPA transaction
2012-04-07 07:54:25,751 DEBUG AuthenticationHandler:("http-bio-7006"-exec-291):35 - Authentication handler : System~System
2012-04-07 07:54:25,752 DEBUG AuthenticationHandler:("http-bio-7006"-exec-291):45 - Service Id : 1333765465752-/xxx-framework-service/platform/skus/get_sku_details
2012-04-07 07:54:25,753 DEBUG DefaultListableBeanFactory:("http-bio-7006"-exec-291):242 - Returning cached instance of singleton bean 'transactionManager'
2012-04-07 07:54:25,753 DEBUG JpaTransactionManager:("http-bio-7006"-exec-291):365 - Creating new transaction with name [com.xxx.framework.manager.SkuServiceManager.getSkuDetails]: PROPAGATION_REQUIRED,ISOLATION_DEFAULT; ''
2012-04-07 07:54:25,754 DEBUG JpaTransactionManager:("http-bio-7006"-exec-291):323 - Opened new EntityManager [org.hibernate.ejb.EntityManagerImpl@28c7c70b] for JPA transaction
2012-04-07 07:54:30,811 DEBUG AuthenticationHandler:("http-bio-7006"-exec-234):35 - Authentication handler : System~System
2012-04-07 07:54:30,812 DEBUG AuthenticationHandler:("http-bio-7006"-exec-234):45 - Service Id : 1333765470812-/xxx-framework-service/platform/skus/get_avaliable_count
2012-04-07 07:54:30,813 DEBUG DefaultListableBeanFactory:("http-bio-7006"-exec-234):242 - Returning cached instance of singleton bean 'transactionManager'
2012-04-07 07:54:30,814 DEBUG JpaTransactionManager:("http-bio-7006"-exec-234):365 - Creating new transaction with name [com.xxx.framework.manager.SkuServiceManager.getAvaliableCountOpt]: PROPAGATION_REQUIRED,ISOLATION_DEFAULT; ''
2012-04-07 07:54:30,814 DEBUG JpaTransactionManager:("http-bio-7006"-exec-234):323 - Opened new EntityManager [org.hibernate.ejb.EntityManagerImpl@140768d6] for JPA transaction
2012-04-07 07:54:33,790 DEBUG AuthenticationHandler:("http-bio-7006"-exec-301):35 - Authentication handler : System~System
2012-04-07 07:54:33,790 DEBUG AuthenticationHandler:("http-bio-7006"-exec-301):45 - Service Id : 1333765473790-/xxx-framework-service/platform/skus/get_sku_details
2012-04-07 07:54:33,792 DEBUG DefaultListableBeanFactory:("http-bio-7006"-exec-301):242 - Returning cached instance of singleton bean 'transactionManager'

并且这种事务的创建一直在发生,直到服务器达到 200 个线程(需要 1-2 分钟)并且变得无响应。我在这段时间检查了 MySQL 连接,并且计数远低于最大连接限制。

我们也进行了线程转储,但我不确定这到底是什么可能指向这样的堆积或线程。

任何进一步调试此问题的想法都非常受欢迎。

4

2 回答 2

3

Ok, so after analyzing thread dump we realized that somehow many threads were waiting on awaitAvailable() method of c3p0 library which manages DB connections in our application. We found a discussion where somebody suggested to upgrade c3p0 library(from 0.9.1.2 to 0.9.2) because old library had this issue when lot of concurrent connections are checked out of pool, the connection counter value remains bloated after check-ins and hence doesn't allow to check out new connections and results in processes to wait for connections perpetually (we had checkout timeout = 0).

But even after upgrading the library we faced this issue so we used debugUnreturnedConnectionStackTraces property to see what's happening to connections and why are they unavailable. After a day of logs we realized that our code which was leaking DB connections managed by c3p0. There were few flows in our code where we were not calling rollback() specifically in catch block before closing entity Manager in finally block, so in case of exceptions connection was not coming back to pool and if frequency of exception is high(more than size of pool within timeout interval) then all other process threads would pile up to get connection.

Since that code is fixed, we haven't seen any such thread pileup problems.

于 2012-07-04T18:09:47.307 回答
2

线程转储分析是前进的方向。这个问题的答案应该可以帮助你找到根本原因: 线程转储分析工具/方法

于 2012-04-07T09:43:58.733 回答