I am saving the headers from a CSV file to a database.
Viewed with less on Ubuntu, the start of the file looks like this:
Date,Supermarket,Speciality,Takeaway,Caf<E9>/restaurant
1/06/2019,0.039175903,-0.01496395,0.03603785,0.029072835
1/07/2019,0.039399919,-0.008250166,0.022385733,0.015478668
The header data ($csvHeader) is:
Array
(
[0] => Date
[1] => Supermarket
[2] => Speciality
[3] => Takeaway
[4] => Caf�/restaurant
)
ord(substr($csvHeader[4],3,1)) === 233
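For reference, here is a minimal sketch of that byte check. It assumes the file is ISO-8859-1 (Latin-1) encoded, which is only a guess based on byte 233 (0xE9) being é in that encoding, and it assumes the mbstring extension is available. The `$cell` value is a hand-written stand-in for the fifth header cell, not something read from the real file.

```php
<?php
// Stand-in for the fifth header cell: "Caf" + byte 0xE9 + "/restaurant".
$cell = "Caf\xE9/restaurant";

// The fourth byte (offset 3) is 233 (0xE9), i.e. é in ISO-8859-1.
$byte = ord(substr($cell, 3, 1));

// A lone 0xE9 followed by "/" is not a valid UTF-8 sequence.
$isUtf8 = mb_check_encoding($cell, 'UTF-8');

// Assumption: the source encoding is ISO-8859-1. Convert to UTF-8,
// where é becomes the two bytes 0xC3 0xA9.
$converted = mb_convert_encoding($cell, 'UTF-8', 'ISO-8859-1');
$convertedIsUtf8 = mb_check_encoding($converted, 'UTF-8');

var_dump($byte, $isUtf8, $convertedIsUtf8, bin2hex($converted));
```

If the check reports that the cell is not valid UTF-8, the column charset and the Doctrine charset may be fine and the input bytes themselves may be the thing to look at.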
It is read using the following function:
protected function getCsvHeaders()
{
    $fh = fopen( $this->getCsvPath(), 'r+' );
    $firstrow = fgetcsv( $fh );
    fclose( $fh );
    return $firstrow;
}
This is saved to the DataConfiguration table:
$dataConf
->setColumns(serialize($csvHeader));
The table is set to utf8mb4:
show create table data_configuration;
+--------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+--------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| data_configuration | CREATE TABLE `data_configuration` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`data_set_id` int(11) NOT NULL,
`file_type_id` int(11) NOT NULL,
`columns` varchar(7500) COLLATE utf8mb4_unicode_ci NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `unique_idx` (`data_set_id`,`file_type_id`),
KEY `IDX_54A0B1FD70053C01` (`data_set_id`),
KEY `IDX_54A0B1FD9E2A35A8` (`file_type_id`),
CONSTRAINT `FK_54A0B1FD70053C01` FOREIGN KEY (`data_set_id`) REFERENCES `data_set` (`id`),
CONSTRAINT `FK_54A0B1FD9E2A35A8` FOREIGN KEY (`file_type_id`) REFERENCES `file_type` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=13176 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci |
Doctrine also appears to be configured for utf8mb4:
doctrine:
    dbal:
        # configure these for your database server
        driver: 'pdo_mysql'
        # server_version: '5.7'
        charset: utf8mb4
        default_table_options:
            charset: utf8mb4
            collate: utf8mb4_unicode_ci
        url: '%env(resolve:DATABASE_URL)%'
        options:
            1001: true
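However, the data is truncated at the é character, and the subsequent unserialize fails. I can reproduce this on my Ubuntu 18/AWS RDS environment as well as my local macOS/Brew environment.
What other avenues can I explore to resolve this?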
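The unserialize failure can be reproduced without the database. The sketch below assumes the stored value ends just before the 0xE9 byte and simulates that with `substr`; the `$headers` array is a shortened, hand-written stand-in for the real header row.

```php
<?php
// Shortened stand-in for the header row, with the raw 0xE9 byte.
$headers = ['Date', "Caf\xE9/restaurant"];

// serialize() records the byte length of every string.
$serialized = serialize($headers);

// Simulated truncation (assumption): keep only the bytes before 0xE9.
$truncated = substr($serialized, 0, strpos($serialized, "\xE9"));

// The payload is now shorter than its declared lengths, so
// unserialize() returns false; @ hides the notice/warning it raises.
$result = @unserialize($truncated);

// The untouched string still round-trips, so serialize() itself
// is not the step that loses data.
$roundTrip = unserialize($serialized);

var_dump($result, $roundTrip === $headers);
```

This suggests the loss happens between serialize() and the stored column value, not in the serialization step.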