android - Insertion of thousands of contact entries using applyBatch is slow

Question

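I'm developing an application that needs to insert a large number of contact entries. At the moment that is about 600 contacts with a total of 6000 phone numbers. The biggest contact has 1800 phone numbers.

As of today I have created a custom account to hold the contacts, so the user can choose whether to show them in the Contacts view.

But inserting the contacts is painfully slow. I insert them using ContentResolver.applyBatch. I've tried different sizes for the ContentProviderOperation list (100, 200, 400), but the total running time is approximately the same. Inserting all the contacts and numbers takes about 30 minutes!

Most of the questions I've found about slow inserts in SQLite bring up transactions. But since I use ContentResolver.applyBatch I don't control that, and I would assume the ContentResolver takes care of transaction management for me.

So, my question is: am I doing something wrong, or is there anything I can do to speed this up?

Anders

Edit: @jcwenger: Oh, I see. Good explanation!

So I will have to insert into the raw_contacts table first, and then into the data table with the names and numbers. What I'll lose is the back reference to the raw_id that I use in applyBatch.

So I'll have to fetch the ids of all the newly inserted raw_contacts rows to use as foreign keys in the data table?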

score 51 · Accepted Answer

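Use ContentResolver.bulkInsert(Uri url, ContentValues[] values) instead of applyBatch().

applyBatch (1) uses transactions and (2) locks the ContentProvider once for the whole batch instead of locking/unlocking once per operation. Because of this, it is somewhat faster than doing the operations one at a time (non-batched).

However, since each operation in the batch can have a different URI and so on, there is a huge amount of overhead. "Oh, a new operation! I wonder what table it goes in... Here, I'll insert a single row... Oh, a new operation! I wonder what table it goes in..." ad infinitum. Since most of the work of turning URIs into tables involves lots of string comparisons, it is obviously very slow.

By contrast, bulkInsert applies a whole pile of values to the same table. It goes, "Bulk insert... find the table, okay, insert! insert! insert! insert! insert!" Much faster.

It does, of course, require the ContentProvider to implement bulkInsert efficiently. Most do, unless you wrote the provider yourself, in which case it will take a bit of coding.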
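To make that last point concrete: the base ContentProvider.bulkInsert simply calls insert() once per element, so a provider that does not override it gains little. The following is only a toy model in plain Java, not Android code; `ToyProvider`, `FastToyProvider`, and the commit counter are hypothetical names used to show the shape of the override (one transaction wrapped around the whole loop).

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-in for a provider backed by a transactional store (hypothetical, not an Android class).
class ToyProvider {
    protected final List<String> rows = new ArrayList<>();
    protected int commits = 0; // counts how many transactions were committed

    // Single insert: modelled as opening and committing its own transaction.
    public void insert(String value) {
        rows.add(value);
        commits++;
    }

    // Mirrors the default behaviour: one insert() call, and so one commit, per value.
    public int bulkInsert(String[] values) {
        for (String v : values) {
            insert(v);
        }
        return values.length;
    }

    public int getCommits() { return commits; }
    public int getRowCount() { return rows.size(); }
}

// Override that treats the whole array as one transaction.
class FastToyProvider extends ToyProvider {
    @Override
    public int bulkInsert(String[] values) {
        // A real provider would begin a database transaction here.
        for (String v : values) {
            rows.add(v);
        }
        commits++; // single commit for the whole array
        return values.length;
    }
}
```

Calling bulkInsert with four values on each class leaves four rows in both, but the base class records four commits and the override records one. In a real provider the commit is where the disk sync happens, which is why the override matters.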

score 10 · Accepted Answer

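bulkInsert: for those interested, here is the code I was able to experiment with. Note how we avoid some allocations for int/long/float values :) which can save even more time.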

private int doBulkInsertOptimised(Uri uri, ContentValues values[]) {
    long startTime = System.currentTimeMillis();
    long endTime = 0;
    //TimingInfo timingInfo = new TimingInfo(startTime);

    SQLiteDatabase db = mOpenHelper.getWritableDatabase();

    DatabaseUtils.InsertHelper inserter =
        new DatabaseUtils.InsertHelper(db, Tables.GUYS); 

    // Get the numeric indexes for each of the columns that we're updating
    final int guiStrColumn = inserter.getColumnIndex(Guys.STRINGCOLUMNTYPE);
    final int guyDoubleColumn = inserter.getColumnIndex(Guys.DOUBLECOLUMNTYPE);
//...
    final int guyIntColumn = inserter.getColumnIndex(Guys.INTEGERCOLUMUNTYPE);

    db.beginTransaction();
    int numInserted = 0;
    try {
        int len = values.length;
        for (int i = 0; i < len; i++) {
            inserter.prepareForInsert();

            String guyID = (String)(values[i].get(Guys.GUY_ID)); 
            inserter.bind(guiStrColumn, guyID);


            // convert to double ourselves to save an allocation.
            double d = ((Number)(values[i].get(Guys.DOUBLECOLUMNTYPE))).doubleValue();
            inserter.bind(guyDoubleColumn, d);


            // getting the raw Object and converting it to int ourselves saves
            // an allocation (the alternative is ContentValues.getAsInteger, which
            // returns an Integer object)

            int status = ((Number) values[i].get(Guys.INTEGERCOLUMUNTYPE)).intValue();
            inserter.bind(guyIntColumn, status);

            inserter.execute();
        }
        numInserted = len;
        db.setTransactionSuccessful();
    } finally {
        db.endTransaction();
        inserter.close();

        endTime = System.currentTimeMillis();

        if (LOGV) {
            long timeTaken = (endTime - startTime);
            Log.v(TAG, "Time taken to insert " + values.length + " records was " + timeTaken + 
                    " milliseconds " + " or " + (timeTaken/1000) + "seconds");
        }
    }
    getContext().getContentResolver().notifyChange(uri, null);
    return numInserted;
}

score 2 · Accepted Answer

An example of how to override bulkInsert() to speed up multiple inserts can be found here.

Answered 2011-12-22T10:51:52.343

score 1 · Accepted Answer

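@jcwenger At first, after reading your post, I thought that was the reason bulkInsert is quicker than applyBatch, but after reading the Contacts Provider code I no longer think so. 1. You said applyBatch uses transactions. Yes, but bulkInsert also uses transactions. Here is its code: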

public int bulkInsert(Uri uri, ContentValues[] values) {
    int numValues = values.length;
    mDb = mOpenHelper.getWritableDatabase();
    mDb.beginTransactionWithListener(this);
    try {
        for (int i = 0; i < numValues; i++) {
            Uri result = insertInTransaction(uri, values[i]);
            if (result != null) {
                mNotifyChange = true;
            }
            mDb.yieldIfContendedSafely();
        }
        mDb.setTransactionSuccessful();
    } finally {
        mDb.endTransaction();
    }
    onEndTransaction();
    return numValues;
}

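That is to say, bulkInsert also uses transactions, so I don't think that is the reason. 2. You said bulkInsert applies a whole pile of values to the same table. I'm sorry, but I can't find the related code in the Froyo source. How did you find that? Could you tell me?

The reason, I think, is this:

bulkInsert uses mDb.yieldIfContendedSafely() while applyBatch uses mDb.yieldIfContendedSafely(SLEEP_AFTER_YIELD_DELAY) /* SLEEP_AFTER_YIELD_DELAY = 4000 */

After reading the code of SQLiteDatabase.java, I found that if a delay is passed to yieldIfContendedSafely it will sleep, but if no delay is passed it will not sleep. See the following piece of SQLiteDatabase.java: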

private boolean yieldIfContendedHelper(boolean checkFullyYielded, long sleepAfterYieldDelay) {
    if (mLock.getQueueLength() == 0) {
        // Reset the lock acquire time since we know that the thread was willing to yield
        // the lock at this time.
        mLockAcquiredWallTime = SystemClock.elapsedRealtime();
        mLockAcquiredThreadTime = Debug.threadCpuTimeNanos();
        return false;
    }
    setTransactionSuccessful();
    SQLiteTransactionListener transactionListener = mTransactionListener;
    endTransaction();
    if (checkFullyYielded) {
        if (this.isDbLockedByCurrentThread()) {
            throw new IllegalStateException(
                    "Db locked more than once. yielfIfContended cannot yield");
        }
    }
    if (sleepAfterYieldDelay > 0) {
        // Sleep for up to sleepAfterYieldDelay milliseconds, waking up periodically to
        // check if anyone is using the database.  If the database is not contended,
        // retake the lock and return.
        long remainingDelay = sleepAfterYieldDelay;
        while (remainingDelay > 0) {
            try {
                Thread.sleep(remainingDelay < SLEEP_AFTER_YIELD_QUANTUM ?
                        remainingDelay : SLEEP_AFTER_YIELD_QUANTUM);
            } catch (InterruptedException e) {
                Thread.interrupted();
            }
            remainingDelay -= SLEEP_AFTER_YIELD_QUANTUM;
            if (mLock.getQueueLength() == 0) {
                break;
            }
        }
    }
    beginTransactionWithListener(transactionListener);
    return true;
}

I think that is the reason bulkInsert is quicker than applyBatch.
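The effect of the delay argument can be sketched with a small plain-Java model of the sleep loop quoted above. This is a hypothetical helper, not Android code: `YieldSleepModel`, `totalSleep`, and the `slicesUntilIdle` parameter are invented for illustration, and the slice length is passed in because the quoted snippet does not show the value of SLEEP_AFTER_YIELD_QUANTUM. It only covers the contended case, since the quoted method returns early without sleeping when no other thread is waiting.

```java
// Toy model of the sleep loop in the quoted yieldIfContendedHelper (hypothetical, not Android code).
class YieldSleepModel {
    /**
     * Returns the total number of milliseconds the loop would sleep.
     *
     * @param sleepAfterYieldDelay maximum delay, as passed to yieldIfContendedSafely(long); 0 means no sleep
     * @param quantum              length of one sleep slice in milliseconds
     * @param slicesUntilIdle      number of slices after which no other thread is waiting on the lock
     */
    static long totalSleep(long sleepAfterYieldDelay, long quantum, int slicesUntilIdle) {
        long slept = 0;
        long remaining = sleepAfterYieldDelay;
        int slices = 0;
        while (remaining > 0) {
            slept += Math.min(remaining, quantum); // sleep one slice, or whatever is left
            remaining -= quantum;
            slices++;
            if (slices >= slicesUntilIdle) {
                break; // lock queue is empty: retake the lock and continue
            }
        }
        return slept;
    }
}
```

Under these assumptions, a delay of 0 (the no-argument variant) never sleeps. A 4000 ms delay with 1000 ms slices sleeps 1000 ms if the other thread is done within one slice, and the full 4000 ms if the database stays contended.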

If you have any questions, please contact me.

score 1 · Accepted Answer

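Here is the basic solution for you: use "yield points" in the batch operation.

The flip side of using batched operations is that a large batch may lock up the database for a long time, preventing other applications from accessing data and potentially causing ANRs ("Application Not Responding" dialogs).

To avoid such lockups of the database, make sure to insert "yield points" in the batch. A yield point tells the content provider that, before executing the next operation, it can commit the changes made so far, yield to other requests, open another transaction, and continue processing operations.

A yield point does not automatically commit the transaction; it does so only if another request is waiting on the database. Normally a sync adapter should insert a yield point at the beginning of each raw contact operation sequence in the batch. See withYieldAllowed(boolean).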
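The rule "one yield point at the start of each raw contact's operation sequence" can be sketched as a small plain-Java planner. `YieldPointPlanner` and `yieldFlags` are hypothetical names, and the sketch assumes each contact's operations are added to the batch contiguously and in order.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical planner: decides which operations in a batch should be yield points.
class YieldPointPlanner {
    /**
     * @param opsPerContact number of operations each raw contact contributes, in batch order
     * @return one flag per operation; true marks the first operation of a raw contact
     */
    static List<Boolean> yieldFlags(int[] opsPerContact) {
        List<Boolean> flags = new ArrayList<>();
        for (int count : opsPerContact) {
            for (int i = 0; i < count; i++) {
                flags.add(i == 0); // only the first operation of a contact may yield
            }
        }
        return flags;
    }
}
```

In real code each flag would be passed to ContentProviderOperation.Builder.withYieldAllowed(boolean) while building the corresponding operation. Placing the yield at a contact boundary means that, if the provider does commit there, no contact is left half-written.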

I hope this is useful to you.

score 1 · Accepted Answer

Here is an example that inserts the same amount of data in about 30 seconds.

public void testBatchInsertion() throws RemoteException, OperationApplicationException {
    final SimpleDateFormat FORMATTER = new SimpleDateFormat("mm:ss.SSS");
    long startTime = System.currentTimeMillis();
    Log.d("BatchInsertionTest", "Starting batch insertion on: " + new Date(startTime));

    final int MAX_OPERATIONS_FOR_INSERTION = 200;
    ArrayList<ContentProviderOperation> ops = new ArrayList<>();
    for(int i = 0; i < 600; i++){
        generateSampleProviderOperation(ops);
        if(ops.size() >= MAX_OPERATIONS_FOR_INSERTION){
            getContext().getContentResolver().applyBatch(ContactsContract.AUTHORITY,ops);
            ops.clear();
        }
    }
    if(ops.size() > 0)
        getContext().getContentResolver().applyBatch(ContactsContract.AUTHORITY,ops);
    Log.d("BatchInsertionTest", "End of batch insertion, elapsed: " + FORMATTER.format(new Date(System.currentTimeMillis() - startTime)));

}
private void generateSampleProviderOperation(ArrayList<ContentProviderOperation> ops){
    int backReference = ops.size();
    ops.add(ContentProviderOperation.newInsert(ContactsContract.RawContacts.CONTENT_URI)
            .withValue(ContactsContract.RawContacts.ACCOUNT_NAME, null)
            .withValue(ContactsContract.RawContacts.ACCOUNT_TYPE, null)
            .withValue(ContactsContract.RawContacts.AGGREGATION_MODE, ContactsContract.RawContacts.AGGREGATION_MODE_DISABLED)
            .build()
    );
    ops.add(ContentProviderOperation.newInsert(ContactsContract.Data.CONTENT_URI)
                    .withValueBackReference(ContactsContract.Data.RAW_CONTACT_ID, backReference)
                    .withValue(ContactsContract.Data.MIMETYPE, ContactsContract.CommonDataKinds.StructuredName.CONTENT_ITEM_TYPE)
                    .withValue(ContactsContract.CommonDataKinds.StructuredName.GIVEN_NAME, "GIVEN_NAME " + (backReference + 1))
                    .withValue(ContactsContract.CommonDataKinds.StructuredName.FAMILY_NAME, "FAMILY_NAME")
                    .build()
    );
    for(int i = 0; i < 10; i++)
        ops.add(ContentProviderOperation.newInsert(ContactsContract.Data.CONTENT_URI)
                        .withValueBackReference(ContactsContract.Data.RAW_CONTACT_ID, backReference)
                        .withValue(ContactsContract.Data.MIMETYPE, ContactsContract.CommonDataKinds.Phone.CONTENT_ITEM_TYPE)
                        .withValue(ContactsContract.CommonDataKinds.Phone.TYPE, ContactsContract.CommonDataKinds.Phone.TYPE_MAIN)
                        .withValue(ContactsContract.CommonDataKinds.Phone.NUMBER, Integer.toString((backReference + 1) * 10 + i))
                        .build()
        );
}

Log:
02-17 12:48:45.496 2073-2090/com.vayosoft.mlab D/BatchInsertionTest﹕ Starting batch insertion on: Wed Feb 17 12:48:45 GMT+02:00 2016
02-17 12:49:16.446 2073-2090/com.vayosoft.mlab D/BatchInsertionTest﹕ End of batch insertion, elapsed: 00:30.951
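The batching logic in the code above flushes only after a whole contact has been added, so every back reference points at an index inside the same batch. A plain-Java sketch of that bookkeeping follows. `BatchPlanner` and `plan` are hypothetical names, and the model tracks only indices, not real ContentProviderOperation objects.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the flush logic: contacts are never split across batches.
class BatchPlanner {
    /**
     * @param opsPerContact  number of operations each contact adds (raw contact row plus data rows)
     * @param flushThreshold flush once the batch holds at least this many operations
     * @return for each batch, the index inside that batch of every contact's first operation,
     *         which is the value a back reference to the raw contact would use
     */
    static List<List<Integer>> plan(int[] opsPerContact, int flushThreshold) {
        List<List<Integer>> batches = new ArrayList<>();
        List<Integer> current = new ArrayList<>();
        int size = 0;
        for (int count : opsPerContact) {
            current.add(size); // same role as "backReference = ops.size()" in the code above
            size += count;
            if (size >= flushThreshold) {
                batches.add(current); // flush only at a contact boundary
                current = new ArrayList<>();
                size = 0;             // indices restart at 0 in the next batch
            }
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }
}
```

With 12 operations per contact (one raw contact, one name, ten numbers) and a threshold of 200, each batch holds 17 contacts (204 operations), so 600 contacts are sent in 36 applyBatch calls.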

score 0 · Accepted Answer

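Just for the information of readers of this thread.

I was facing a performance issue even when using applyBatch(). In my case there was a database trigger defined on one of the tables. I deleted the table's triggers and boom: now my app inserts rows at high speed.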
