java - Google App Engine (Java) - JDO PersistenceManager makePersistentAll slows down

Question

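I have a mini CRM application. I am trying to add functionality to allow bulk user imports. The upload handler reads the data from a CSV file and then calls my CustomerService class to store the Customer objects in the datastore: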

public int createCustomers(final List<Customer> customers) {
    // Split the full list into smaller batches.
    List<List<Customer>> buckets = bucketList(customers);
    PersistenceManager persistenceManager = PMF.get().getPersistenceManager();
    for (List<Customer> bucket : buckets) {
        // Persist one batch per call.
        persistenceManager.makePersistentAll(bucket);
    }
    return customers.size();
}

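The bucketList method simply breaks a large list into smaller lists. I did this to try to tune the application and see whether there is an optimal size for the makePersistentAll call. I currently have it set to 1000 and am testing with a CSV file containing 100,000 records. The application seems to get slower and slower as more records are added, particularly around the 60K-record mark. I tried marking all of the fields in Customer as unindexed, but that made no noticeable difference: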

@PersistenceCapable
public class Customer implements Serializable {

    @PrimaryKey
    @Persistent(valueStrategy = IdGeneratorStrategy.IDENTITY)
    private Key key;

    @Extension(vendorName="datanucleus", key="gae.unindexed", value="true")
    @Persistent
    private String accountNumber;

    @Extension(vendorName="datanucleus", key="gae.unindexed", value="true")
    @Persistent
    private String email;

    @Extension(vendorName="datanucleus", key="gae.unindexed", value="true")
    @Persistent
    private String firstName;

    @Extension(vendorName="datanucleus", key="gae.unindexed", value="true")
    @Persistent
    private String lastName;
    ...
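
The bucketList helper is described in the question but not shown. A minimal sketch of such a helper might look like the following; the class name BucketSketch and the explicit bucketSize parameter are assumptions made for illustration, since the original signature takes only the list.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the question's bucketList helper (not the asker's code).
final class BucketSketch {

    // Splits `items` into consecutive sublists of at most `bucketSize` elements.
    static <T> List<List<T>> bucketList(List<T> items, int bucketSize) {
        if (bucketSize <= 0) {
            throw new IllegalArgumentException("bucketSize must be positive");
        }
        List<List<T>> buckets = new ArrayList<>();
        for (int start = 0; start < items.size(); start += bucketSize) {
            int end = Math.min(start + bucketSize, items.size());
            // Copy the subList view so each bucket is independent of the source list.
            buckets.add(new ArrayList<>(items.subList(start, end)));
        }
        return buckets;
    }
}
```

With a bucket size of 1000 and the 100,000-record test file, this would yield 100 buckets, so the loop in createCustomers would call makePersistentAll 100 times.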

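I have tested this both in development (locally) and on production App Engine, to no avail. I would think this is a fairly common use case: importing a large amount of data into a system and saving it to the datastore quickly. I have tried a number of things to get this working:
- Using AsyncDatastoreService
- Saving the Customer objects one at a time (makePersistent)
- Using a Key object in Customer as the primary key
- Using the accountNumber string as the primary key

But nothing seems to make much of a difference.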

score 1 · Accepted Answer

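I suggest you look at http://www.datanucleus.org/products/accessplatform_3_2/jdo/performance_tuning.html, in particular the "Persistence Process" section about large numbers of objects. You could reduce the number of objects pumped into "makePersistentAll()" so that you have multiple calls. Obviously, there may be some GAE/Datastore quirks that could cause this.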
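
One way to try this suggestion is to make the batch size a parameter and time each call, so it becomes visible whether later batches really take longer and whether a smaller size changes that. The sketch below is only an illustration under stated assumptions: the class BatchTimingSketch and the persistBatch callback are hypothetical, and the callback stands in for the JDO call so the example compiles without JDO on the classpath.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical harness; not part of JDO, DataNucleus, or the asker's code.
final class BatchTimingSketch {

    // Hands `items` to `persistBatch` in chunks of at most `batchSize`
    // and returns the elapsed nanoseconds of each call, in call order.
    static <T> List<Long> persistInBatches(List<T> items, int batchSize,
                                           Consumer<List<T>> persistBatch) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<Long> elapsedNanos = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            List<T> batch = items.subList(start, end);
            long begin = System.nanoTime();
            // Assumption: in the real application this callback would wrap
            // persistenceManager.makePersistentAll(batch).
            persistBatch.accept(batch);
            elapsedNanos.add(System.nanoTime() - begin);
        }
        return elapsedNanos;
    }
}
```

In the question's code the callback could be written as bucket -> persistenceManager.makePersistentAll(bucket). Comparing the returned timings for, say, 1000 versus 100 objects per call would show whether the slowdown depends on batch size or on the total number of objects persisted so far.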

