我需要将 1 亿多行从 MySQL 数据库加载到内存中。我的 java 程序失败,java.lang.OutOfMemoryError: Java heap space
我的机器中有 8GB RAM,并且在我的 JVM 选项中给出了 -Xmx6144m。
这是我的代码
public List<Record> loadTrainingDataSet() {
ArrayList<Record> records = new ArrayList<Record>();
try {
Statement s = conn.createStatement(java.sql.ResultSet.TYPE_FORWARD_ONLY, java.sql.ResultSet.CONCUR_READ_ONLY);
s.executeQuery("SELECT movie_id,customer_id,rating FROM ratings");
ResultSet rs = s.getResultSet();
int count = 0;
while (rs.next()) {
知道如何克服这个问题吗?
更新
我遇到了这篇文章,并根据下面的评论更新了我的代码。似乎我能够以相同的 -Xmx6144m 数量将数据加载到内存中,但这需要很长时间。
这是我的代码。
...
import org.apache.mahout.math.SparseMatrix;
...
@Override
public SparseMatrix loadTrainingDataSet() {
long t1 = System.currentTimeMillis();
SparseMatrix ratings = new SparseMatrix(NUM_ROWS,NUM_COLS);
int REC_START = 0;
int REC_END = 0;
try {
for (int i = 1; i <= 101; i++) {
long t11 = System.currentTimeMillis();
REC_END = 1000000 * i;
Statement s = conn.createStatement(java.sql.ResultSet.TYPE_FORWARD_ONLY,
java.sql.ResultSet.CONCUR_READ_ONLY);
s.setFetchSize(Integer.MIN_VALUE);
ResultSet rs = s.executeQuery("SELECT movie_id,customer_id,rating FROM ratings LIMIT " + REC_START + "," + REC_END);//100480507
while (rs.next()) {
int movieId = rs.getInt("movie_id");
int customerId = rs.getInt("customer_id");
byte rating = (byte) rs.getInt("rating");
ratings.set(customerId,movieId,rating);
}
long t22 = System.currentTimeMillis();
System.out.println("Round " + i + " completed " + (t22 - t11) / 1000 + " seconds");
rs.close();
s.close();
}
} catch (Exception e) {
System.err.println("Cannot connect to database server " + e);
} finally {
if (conn != null) {
try {
conn.close();
System.out.println("Database connection terminated");
} catch (Exception e) { /* ignore close errors */ }
}
}
long t2 = System.currentTimeMillis();
System.out.println(" Took " + (t2 - t1) / 1000 + " seconds");
return ratings;
}
加载前 100,000 行需要 2 秒。加载第 29 个 100,000 行需要 46 秒。我在中间停止了这个过程,因为它花费了太多时间。这些时间是可以接受的吗?有没有办法提高这段代码的性能?我在 8GB RAM 64 位 Windows 机器上运行它。