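I am using php-resque to parse and validate data from large files, and then import that data into a MySQL database.
I already know that LOAD DATA INFILE can be used to read rows from a text file into a table, but it does not perform any validation.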
My database structure:
ItemsFile table:
id filename filepath valid_items invalid_items processed_items processed
Item table:
id uid item file_id created_at
My Resque worker class is shown below.
php-resque forks a child process and instantiates the ItemsFileProcessor class, then:
- setUp() is called
- perform() is called
/**
 * Read and validate items from a file, and store them in a database.
 */
class ItemsFileProcessor {
    // ItemsFile model instance
    private $items_file = null;
    // Item model instance
    private $item = null;
    // Record retrieved from the ItemsFile table.
    private $file = null;

    public function __construct() {
        $this->items_file = new ItemsFile();
        $this->item = new Item();
    }

    public function setUp() {
        if (isset($this->args['file_id'])) {
            // Get the file from the ItemsFile table by id.
            $this->file = $this->items_file->getFile($this->args['file_id']);
            if (empty($this->file)) {
                // End job processing if the file does not exist.
                exit(-1);
            }
        }
    }

    function perform() {
        // NodeJs, socket.io, redis, broadcasting system
        EventBroadcaster::broadcast('app-jobs-channel', 'file_processing_started');
        $processed_items = 0;
        $valid_items = 0;
        $invalid_items = 0;
        // Item validation class instance
        $item_validator = new ItemValidator();
        try {
            $tmp_file = new SplFileObject($this->file->filepath);
            // Read items from the file, and validate each item.
            while ($tmp_file->valid()) {
                $line = trim($tmp_file->fgets());
                if ($line !== '') {
                    if ($item_validator->isValid($line, new ItemValidationRule())) {
                        // Store the item in the Item table.
                        $this->item->create([
                            'uid' => 'foo',
                            'item' => $line,
                            'file_id' => $this->file->id,
                        ]);
                        $valid_items++;
                    } else {
                        $invalid_items++;
                    }
                    $processed_items++;
                }
            }
            // Update the ItemsFile table record.
            $this->items_file->update(
                $this->file->id,
                [
                    'processed_items' => $processed_items,
                    'valid_items' => $valid_items,
                    'invalid_items' => $invalid_items,
                    'processed' => 'Processed',
                ]
            );
            EventBroadcaster::broadcast('app-jobs-channel', 'file_processing_completed');
        } catch (LogicException $exception) {
            // Broadcast failure.
            EventBroadcaster::broadcast('app-jobs-channel', 'file_processing_failed');
            Logger::getInstance()->log('ProcessContactFile Exception: '.$exception->getMessage(), Logger::LOGTYPE_ERROR);
            exit(-1);
        }
    }
}
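For anyone unfamiliar with php-resque, the lifecycle mentioned before the class can be mimicked with a small stand-alone simulation. This is only a sketch of the calling order as I understand it, not php-resque's actual code: `DemoJob` and `runJobLikeWorker()` are hypothetical names. The point it illustrates is that the job arguments arrive as an `args` property assigned by the worker, not through the constructor, which is why `$this->args` is readable in setUp().

```php
<?php
// Hypothetical stand-in for a job class; only its shape matters here.
class DemoJob
{
    public $args = [];   // assigned by the worker before setUp() runs
    public $calls = [];  // records the order of lifecycle calls

    public function setUp()
    {
        $this->calls[] = 'setUp';
    }

    public function perform()
    {
        $this->calls[] = 'perform:' . $this->args['file_id'];
    }
}

// Simplified imitation of what happens in the forked child process.
// Assumption: this mirrors the order of calls only; it is NOT library code.
function runJobLikeWorker(string $class, array $args)
{
    $job = new $class();   // the constructor receives no arguments
    $job->args = $args;    // payload arguments are attached as a property
    if (method_exists($job, 'setUp')) {
        $job->setUp();
    }
    $job->perform();
    if (method_exists($job, 'tearDown')) {
        $job->tearDown();
    }
    return $job;
}

$job = runJobLikeWorker('DemoJob', ['file_id' => 42]);
```

After the call, `$job->calls` holds the two lifecycle steps in the order they ran.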
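My problems:
- Processing a file takes too long.
- MySQL has to handle every insert request one at a time. LOAD DATA INFILE is much faster.
My question:
Is there a way to optimize this, or perhaps to bring LOAD DATA INFILE into the process somehow?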
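One direction I can imagine, sketched here under assumptions and not as a tested solution: keep the validation in PHP, but write the valid rows to a staging file and hand that file to LOAD DATA INFILE in a single statement. In the sketch below, `stageValidItems()`, `escapeLoadDataField()`, `buildLoadDataSql()`, the `$isValid` callable (standing in for `ItemValidator::isValid()`), and the table name `items` are all hypothetical; the `'foo'` uid simply mirrors the placeholder in my class.

```php
<?php
// Escape one field for MySQL's default LOAD DATA text format
// (tab-separated fields, backslash as the escape character).
function escapeLoadDataField(string $value): string
{
    return strtr($value, ["\\" => "\\\\", "\t" => "\\t", "\n" => "\\n", "\r" => "\\r"]);
}

// Stream the source file, validate each non-empty line, and write the valid
// ones to the staging file as "uid<TAB>item<TAB>file_id" rows.
// $isValid is a hypothetical callable: fn(string $line): bool.
function stageValidItems(string $sourcePath, string $stagingPath, int $fileId, callable $isValid): array
{
    $counts = ['processed' => 0, 'valid' => 0, 'invalid' => 0];
    $source = fopen($sourcePath, 'r');
    $staging = fopen($stagingPath, 'w');

    while (($raw = fgets($source)) !== false) {
        $line = trim($raw);
        if ($line === '') {
            continue; // blank lines are skipped and not counted
        }
        $counts['processed']++;
        if ($isValid($line)) {
            $fields = ['foo', $line, (string) $fileId];
            fwrite($staging, implode("\t", array_map('escapeLoadDataField', $fields)) . "\n");
            $counts['valid']++;
        } else {
            $counts['invalid']++;
        }
    }

    fclose($source);
    fclose($staging);
    return $counts;
}

// Build the bulk-load statement for the staging file.
// Assumption: the table is called `items`; adjust to the real schema.
function buildLoadDataSql(string $stagingPath): string
{
    return "LOAD DATA LOCAL INFILE '" . addslashes($stagingPath) . "'"
        . " INTO TABLE items"
        . " FIELDS TERMINATED BY '\\t' ESCAPED BY '\\\\'"
        . " LINES TERMINATED BY '\\n'"
        . " (uid, item, file_id)";
}
```

The returned counts could feed the same ItemsFile update as before, and the statement would be executed once per file, for example through a PDO connection created with `PDO::MYSQL_ATTR_LOCAL_INFILE => true`. This assumes the server permits LOCAL loads and that `created_at` has a column default; otherwise a `SET created_at = NOW()` clause could be appended to the statement.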