
I'm parsing a CSV file with fgetcsv. The CSV is an export from a Magento installation, but it cannot be parsed. Here is a problematic line from such an export:

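200000,Samsung Galaxy S2,$399.00,8806085359376,null,Free ground shipping,New,In Stock,Samsung,"Vivid‧Fast‧Slim The new GALAXY SII Plus makes your life even smarter! 4.3" SUPER AMOLED Plus The 4.3" SUPER AMOLED Plus display goes a step beyond the already remarkable SUPER AMOLED to provide enhanced readability, a slimmer design, and better battery consumption for the best viewing value of any smartphone. Full-Touch Display Size: 4.3" Resolution: 480 x 800pixel Platform Operation Platform: Android v4.1 (Jelly Bean) TOUCHWiZ v4.0 User Interface (upto 7 pages widget desktop) Band^ UMTS(850 / 900 / 1900 / 2100MHz)+ Battery Capacity: 1650mAh",Mobile > Manufacturer > Samsung,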

The problem is that " is used inside the file as shorthand for inch, among other things.

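I'm looking for a regular expression to use with preg_replace on every double quote that is not preceded by a comma or not followed by a comma. However, my RegEx knowledge is poor and I can't come up with a working expression. This is what I believe is very close to a solution, but I can't get it to work: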

private static function _fixQuotesInString($string)
{
    return preg_replace('/(?<!,)"|"(?!,)/', '&quot;', $string);
}

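Reading it with my limited knowledge, I would say: if you find a double quote, check whether it is not preceded by a comma or not followed by a comma, and if so, replace it with &quot;. However, experience shows that this is not what happens.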

When you post a solution, it would be great if you could add a "verbal explanation" of the RegEx so that I can get the hang of it.


2 Answers


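Your regex also replaces the delimiting quotes in both ," and ", because an alternation matches as soon as either condition holds: the quote in ," is not followed by a comma, and the quote in ", is not preceded by one. Instead, you can use (?<!,)"(?!,) which requires that the quote be neither preceded nor followed by a comma.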
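To make the difference between the two patterns concrete, here is a minimal sketch. The function names replaceEither and replaceBoth and the shortened sample line are made up for illustration; only preg_replace and the two patterns come from the discussion above.

```php
<?php
// Question's pattern: a quote that is not preceded by a comma
// OR not followed by a comma (hypothetical helper name).
function replaceEither(string $line): string
{
    return preg_replace('/(?<!,)"|"(?!,)/', '&quot;', $line);
}

// Suggested pattern: a quote that is neither preceded
// NOR followed by a comma (hypothetical helper name).
function replaceBoth(string $line): string
{
    return preg_replace('/(?<!,)"(?!,)/', '&quot;', $line);
}

// Shortened, made-up sample in the style of the export line.
$line = '1,Samsung,"The 4.3" SUPER AMOLED Plus display",Mobile';

echo replaceEither($line), PHP_EOL;
// 1,Samsung,&quot;The 4.3&quot; SUPER AMOLED Plus display&quot;,Mobile

echo replaceBoth($line), PHP_EOL;
// 1,Samsung,"The 4.3&quot; SUPER AMOLED Plus display",Mobile
```

The first call rewrites the field delimiters as well, which is why the original attempt appears to break the line; the second leaves them alone and only touches the inch mark.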

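Note that this solution still has a potential problem if a " inside a field is legitimately followed by a comma, so you should consider fixing this at the source.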
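A small sketch of that caveat, using a made-up line in which the inch mark is directly followed by a comma. The helper name fixQuotes is hypothetical; the pattern is the one suggested above.

```php
<?php
// Hypothetical helper wrapping the suggested pattern.
function fixQuotes(string $line): string
{
    return preg_replace('/(?<!,)"(?!,)/', '&quot;', $line);
}

// Made-up line: the stray quote after 4.3 is followed by a comma,
// so it looks exactly like a closing field delimiter.
$line = '1,"Display size: 4.3", very slim",Mobile';

// Nothing is replaced: every quote here is either preceded
// or followed by a comma, so the line stays malformed.
var_dump(fixQuotes($line) === $line); // bool(true)
```

In a case like this the regex cannot tell the stray quote from a real delimiter, which is the reason for fixing the export itself where possible.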

answered 2013-06-03T12:51:26.020

Description

If you simply want to parse each comma-separated field, where a field may or may not be surrounded by double quotes, you can use this regex:

(?:^|,)("?)(.*?)\1(?=,(?!\s)|$)


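Group 2 captures each comma-separated value. If the value opens with a quote, then a closing quote followed by either a comma that is not followed by whitespace, or the end of the line, is required to close the string.

PHP code example: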

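<?php
$sourcestring = "your source string";
// Group 1 captures an optional opening quote; group 2 captures the field value.
// A field ends at a comma not followed by whitespace, or at the end of a line.
preg_match_all('/(?:^|,)("?)(.*?)\1(?=,(?!\s)|$)/ims', $sourcestring, $matches);
echo "<pre>" . print_r($matches, true);
?>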

$matches Array:
(
    [0] => Array
        (
            [0] => 200000
            [1] => ,Samsung Galaxy S2
            [2] => ,$399.00
            [3] => ,8806085359376
            [4] => ,null
            [5] => ,Free ground shipping
            [6] => ,New
            [7] => ,In Stock
            [8] => ,Samsung
            [9] => ,"Vivid‧Fast‧Slim The new GALAXY SII Plus makes your life even smarter! 4.3" SUPER AMOLED Plus The 4.3" SUPER AMOLED Plus display goes a step beyond the already remarkable SUPER AMOLED to provide enhanced readability, a slimmer design, and better battery consumption for the best viewing value of any smartphone. Full-Touch Display Size: 4.3" Resolution: 480 x 800pixel Platform Operation Platform: Android v4.1 (Jelly Bean) TOUCHWiZ v4.0 User Interface (upto 7 pages widget desktop) Band^ UMTS(850 / 900 / 1900 / 2100MHz)+ Battery Capacity: 1650mAh"
            [10] => ,Mobile > Manufacturer > Samsung
            [11] => ,
        )

    [1] => Array
        (
            [0] => 
            [1] => 
            [2] => 
            [3] => 
            [4] => 
            [5] => 
            [6] => 
            [7] => 
            [8] => 
            [9] => "
            [10] => 
            [11] => 
        )

    [2] => Array
        (
            [0] => 200000
            [1] => Samsung Galaxy S2
            [2] => $399.00
            [3] => 8806085359376
            [4] => null
            [5] => Free ground shipping
            [6] => New
            [7] => In Stock
            [8] => Samsung
            [9] => Vivid‧Fast‧Slim The new GALAXY SII Plus makes your life even smarter! 4.3" SUPER AMOLED Plus The 4.3" SUPER AMOLED Plus display goes a step beyond the already remarkable SUPER AMOLED to provide enhanced readability, a slimmer design, and better battery consumption for the best viewing value of any smartphone. Full-Touch Display Size: 4.3" Resolution: 480 x 800pixel Platform Operation Platform: Android v4.1 (Jelly Bean) TOUCHWiZ v4.0 User Interface (upto 7 pages widget desktop) Band^ UMTS(850 / 900 / 1900 / 2100MHz)+ Battery Capacity: 1650mAh
            [10] => Mobile > Manufacturer > Samsung
            [11] => 
        )

)
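If only the field values are needed, the pattern can be wrapped in a small helper that returns group 2. This is just a sketch: splitCsvLine is a hypothetical name, the sample line is a shortened made-up version of the export line, and it assumes one record per string with no spaces after the delimiting commas.

```php
<?php
/**
 * Split one CSV line into field values using the pattern above.
 * splitCsvLine is a hypothetical helper, not part of PHP or Magento.
 */
function splitCsvLine(string $line): array
{
    // Group 1: optional opening quote. Group 2: the field value.
    // A field closes at a comma not followed by whitespace, or at the end.
    preg_match_all('/(?:^|,)("?)(.*?)\1(?=,(?!\s)|$)/s', $line, $matches);

    return $matches[2];
}

$fields = splitCsvLine('200000,Samsung,"The 4.3" display, a slimmer design",Mobile,');
print_r($fields);
// The third field keeps its inner quote and comma:
// The 4.3" display, a slimmer design
```

The inner quote after 4.3 does not close the field because it is followed by a space rather than a comma, and the comma after "display" is skipped because the field was opened with a quote.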

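Simple replacement

Because your source text is comma-delimited and the delimiting commas have no surrounding spaces, you can handle cases such as "excellent occasion, 4.3", samsung" with the following:

Regex: (?<!,)(")(?!,\S) Replace with: nothing

It matches a double quote that is not preceded by a comma and is not followed by a comma plus a non-whitespace character, so only the stray quotes inside a field are removed.

PHP code example: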

<?php
$sourcestring="your source string";
echo preg_replace('/(?<!,)(")(?!,\S)/ims','',$sourcestring);
?>

$sourcestring after replacement:
200000,Samsung Galaxy S2,$399.00,8806085359376,null,Free ground shipping,New,In Stock,Samsung,"Vivid‧Fast‧Slim The new GALAXY SII Plus makes your life even smarter! 4.3 SUPER AMOLED Plus The 4.3 SUPER AMOLED Plus display goes a step beyond the already remarkable SUPER AMOLED to provide enhanced readability, a slimmer design, and better battery consumption for the best viewing value of any smartphone. Full-Touch Display Size: 4.3 Resolution: 480 x 800pixel Platform Operation Platform: Android v4.1 (Jelly Bean) TOUCHWiZ v4.0 User Interface (upto 7 pages widget desktop) Band^ UMTS(850 / 900 / 1900 / 2100MHz)+ Battery Capacity: 1650mAh",Mobile > Manufacturer > Samsung,
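Since the original goal was to read the file with PHP's CSV functions, the replacement can be combined with str_getcsv, roughly as sketched below. parseDirtyCsvLine is a hypothetical name and the sample line is made up. The sketch assumes that the first and last fields of a line are never quoted, because the pattern would also strip a quote at the very start or very end of the line.

```php
<?php
/**
 * Remove stray quotes from one line, then parse it as CSV.
 * parseDirtyCsvLine is a hypothetical helper for illustration only.
 */
function parseDirtyCsvLine(string $line): array
{
    // Drop quotes that are not preceded by a comma and not followed
    // by a comma plus a non-whitespace character.
    $clean = preg_replace('/(?<!,)"(?!,\S)/', '', $line);

    // Delimiter, enclosure and escape are passed explicitly.
    return str_getcsv($clean, ',', '"', '\\');
}

$fields = parseDirtyCsvLine('200000,Samsung,"The 4.3" display, a slimmer design",Mobile,');
print_r($fields);
// The third field comes back as: The 4.3 display, a slimmer design
```

Note that the inch marks are dropped rather than preserved; replacing them with &quot; or a doubled quote instead would keep the information if that matters for the import.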
answered 2013-06-03T15:11:16.350