php - Parsing a CSV file, extracting some of the values but not all

Question

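Good day,

I have a local csv file called DailyValues.csv whose values change daily.
I need to extract the value field for category2 and category4.
Then combine the extracted values, sort them, and remove duplicates (if any).
Then save the result to a new local file, NewValues.txt.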

Here is an example of the DailyValues.csv file:

category,date,value  
category1,2010-05-18,value01  
category1,2010-05-18,value02  
category1,2010-05-18,value03  
category1,2010-05-18,value04  
category1,2010-05-18,value05  
category1,2010-05-18,value06  
category1,2010-05-18,value07  
category2,2010-05-18,value08  
category2,2010-05-18,value09  
category2,2010-05-18,value10  
category2,2010-05-18,value11  
category2,2010-05-18,value12  
category2,2010-05-18,value13  
category2,2010-05-18,value14  
category2,2010-05-18,value30  
category3,2010-05-18,value16  
category3,2010-05-18,value17  
category3,2010-05-18,value18  
category3,2010-05-18,value19  
category3,2010-05-18,value20  
category3,2010-05-18,value21  
category3,2010-05-18,value22  
category3,2010-05-18,value23  
category3,2010-05-18,value24  
category4,2010-05-18,value25  
category4,2010-05-18,value26  
category4,2010-05-18,value10  
category4,2010-05-18,value28  
category4,2010-05-18,value11  
category4,2010-05-18,value30  
category2,2010-05-18,value31  
category2,2010-05-18,value32  
category2,2010-05-18,value33  
category2,2010-05-18,value34  
category2,2010-05-18,value35  
category2,2010-05-18,value07

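I found some helpful parsing examples at http://www.php.net/manual/en/function.fgetcsv.php and managed to extract all the values of the value column, but I don't know how to restrict it to only the values of category2/4 and then sort them and remove duplicates.

The solution needs to be in php, perl or shell script.

Any help would be much appreciated.
Thank you in advance.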

score 0 · Accepted Answer

Here is a shell script solution.

egrep 'category4|category2' input.file | cut -d"," -f1,3 | sort -u > output.file

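I used the cut command just to show you that you can extract only certain columns: the -f switch of cut selects the columns you want to extract (here columns 1 and 3, the category and the value).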

The -u switch of sort makes the output unique.
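Since the question asks for the value field only, the same idea can be narrowed to column 3. The following is a minimal sketch rather than a definitive solution: it assumes the layout shown in the sample (category in the first column, value in the third) and that no field contains a quoted comma. It uses awk instead of egrep so that the category is compared against the first column exactly. The file names come from the question; the data written here is a shortened copy of the sample, used only for illustration.

```shell
#!/bin/sh
# Work in a scratch directory so the sketch does not touch real files.
cd "$(mktemp -d)" || exit 1

# Shortened sample in the same layout as the question's DailyValues.csv.
cat > DailyValues.csv <<'EOF'
category,date,value
category1,2010-05-18,value07
category2,2010-05-18,value10
category2,2010-05-18,value30
category3,2010-05-18,value16
category4,2010-05-18,value25
category4,2010-05-18,value10
category4,2010-05-18,value30
category2,2010-05-18,value07
EOF

# Keep rows whose first field is exactly category2 or category4,
# print only the third field (the value), then sort and drop duplicates.
awk -F',' '$1 == "category2" || $1 == "category4" { print $3 }' DailyValues.csv |
    sort -u > NewValues.txt

cat NewValues.txt
```

For this shortened sample, NewValues.txt ends up with value07, value10, value25 and value30, one per line. Comparing `$1` exactly also avoids picking up a row that merely contains the text category2 somewhere else on the line.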

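EDIT: It is important to use egrep and not grep, because grep by default uses a more restricted set of regular expressions (basic regular expressions, where a bare | is a literal character), while egrep has some further facilities (extended regular expressions, including alternation with |).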
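The difference can be checked on a few rows. This is a small sketch assuming a POSIX-style grep; `grep -E` is used as the standard spelling of egrep, and the three sample rows are made up in the question's layout.

```shell
#!/bin/sh
# Three sample rows in the question's layout.
sample='category1,2010-05-18,value01
category2,2010-05-18,value08
category4,2010-05-18,value25'

# Basic regular expressions: the bare | is a literal character, so no line
# matches. grep -c prints the number of matching lines; "|| true" hides the
# non-zero exit status that grep returns when nothing matches.
basic=$(printf '%s\n' "$sample" | grep -c 'category4|category2' || true)

# Extended regular expressions: | means alternation, so two lines match.
extended=$(printf '%s\n' "$sample" | grep -E -c 'category4|category2')

echo "basic: $basic, extended: $extended"
```

With plain grep the pattern is searched for as the literal text `category4|category2`, which never occurs in the data.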

EDIT (for people who only have grep available):

grep 'category2' input.file > temp.file && grep 'category4' input.file >> temp.file && cut -d"," -f1,3 temp.file | sort -u > output.file && rm temp.file

It creates quite a bit of overhead, but it still works...
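If the grep at hand accepts several `-e` options (POSIX specifies this), the temporary file can probably be avoided altogether. A minimal sketch under that assumption, using a made-up five-line input.file in the question's layout:

```shell
#!/bin/sh
# Work in a scratch directory so the sketch does not touch real files.
cd "$(mktemp -d)" || exit 1

# Small stand-in for the real input file.
cat > input.file <<'EOF'
category,date,value
category1,2010-05-18,value01
category2,2010-05-18,value08
category4,2010-05-18,value25
category2,2010-05-18,value08
EOF

# Each -e adds one pattern; a line is kept if it matches any of them,
# so one pass over the file replaces the two greps and the temp file.
grep -e 'category2' -e 'category4' input.file | cut -d"," -f1,3 | sort -u > output.file

cat output.file
```

Here output.file contains `category2,value08` and `category4,value25`; the duplicate category2 row is dropped by `sort -u`.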
