matlab - How to import a complex csv file into numeric vectors in Matlab

Question

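I would like to know how to read data from a complex csv file that contains strings, doubles, characters, and so on.

For example, could you provide a command that successfully extracts the numeric values from this csv file (linked in the original post)?

For example: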

yield curve data 2013-10-04     
Yields in percentages per annum.        


Parameters - AAA-rated bonds        
Series key   Parameters  Description
YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA0  2.03555 Euro area (changing composition) - Government bond, nominal, all issuers whose rating is triple A - Svensson model - continuous compounding - yield error minimisation - Yield curve parameters, Beta 0 - Euro, provided by ECB
YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA1  -2.009068   Euro area (changing composition) - Government bond, nominal, all issuers whose rating is triple A - Svensson model - continuous compounding - yield error minimisation - Yield curve parameters, Beta 1 - Euro, provided by ECB
YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA2  24.54184    Euro area (changing composition) - Government bond, nominal, all issuers whose rating is triple A - Svensson model - continuous compounding - yield error minimisation - Yield curve parameters, Beta 2 - Euro, provided by ECB
YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA3  -21.80556   Euro area (changing composition) - Government bond, nominal, all issuers whose rating is triple A - Svensson model - continuous compounding - yield error minimisation - Yield curve parameters, Beta 3 - Euro, provided by ECB
YC.B.U2.EUR.4F.G_N_A.SV_C_YM.TAU1   5.351378    Euro area (changing composition) - Government bond, nominal, all issuers whose rating is triple A - Svensson model - continuous compounding - yield error minimisation - Yield curve parameters, Tau 1 - Euro, provided by ECB
YC.B.U2.EUR.4F.G_N_A.SV_C_YM.TAU2   4.321162    Euro area (changing composition) - Government bond, nominal, all issuers whose rating is triple A - Svensson model - continuous compounding - yield error minimisation - Yield curve parameters, Tau 2 - Euro, provided by ECB

This is part of the information in the file. I tried csvread('yc_latest.csv', 6, 1, [6,1,6,1]) to get the value 2.03555, but it gave me the following error:

   Error using dlmread (line 139)
    Mismatch between file and format string.
    Trouble reading number from file (row 1u, field 3u) ==> "Euro area (changing composition) -
    Government bond, nominal, all issuers whose rating is triple A - Svensson model - continuous
    compounding - yield error minimisation - Yield curve parameters, Beta 0

    Error in csvread (line 50)
        m=dlmread(filename, ',', r, c, rng);

score 5 · Accepted Answer

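I strongly recommend the "Import Data" feature in Matlab (it is on the "HOME" toolbar).

Note in particular that it can also generate code for you (the screenshot in the original answer shows this), so that you can automate the import in the future.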

score 2 · Accepted Answer

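This is a very hacky solution. Unfortunately, Matlab is pretty bad at reading csv files, which makes this kind of hack an unfortunate necessity. On the bright side, you will probably only need to write this kind of code once.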

fid = fopen('yc_latest.csv');   %// open the file

%// parse as csv, skipping the first six lines
contents = textscan(fid, '%s %f %[^\n]', 'HeaderLines', 6); 

%// unpack the fields and give them meaningful names
[seriesKey, parameters, description]   = contents{:};

fclose(fid);                    %// don't forget this!
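
To try this approach without the original file, the following minimal sketch writes a few of the rows shown in the question to a scratch file and reads them back with the same format string. It assumes the fields are separated by tabs (as the sample in the question suggests) and shortens the descriptions; tmpFile and beta0 are names made up for this illustration.

```matlab
% Sketch only: assumes tab-separated fields; descriptions are shortened.
tmpFile = [tempname '.csv'];          % scratch file standing in for yc_latest.csv
fid = fopen(tmpFile, 'w');
fprintf(fid, 'yield curve data 2013-10-04\n');
fprintf(fid, 'Yields in percentages per annum.\n\n\n');   % line 2 plus two blank lines
fprintf(fid, 'Parameters - AAA-rated bonds\n');
fprintf(fid, 'Series key\tParameters\tDescription\n');
fprintf(fid, 'YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA0\t2.03555\tEuro area - Beta 0\n');
fprintf(fid, 'YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA1\t-2.009068\tEuro area - Beta 1\n');
fclose(fid);

% read it back: skip the six header lines, then key, number, rest of line
fid = fopen(tmpFile);
contents = textscan(fid, '%s %f %[^\n]', 'HeaderLines', 6);
fclose(fid);
delete(tmpFile);                      % remove the scratch file

[seriesKey, parameters, description] = contents{:};

% look up one value by its series key instead of by row position
beta0 = parameters(strcmp(seriesKey, 'YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA0'));
```

Looking the value up by series key rather than by row index keeps the code working if the order of the rows in the file changes.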

score 0 · Accepted Answer

An alternative to Chris's solution:

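fid = fopen('yc_latest.csv');
Rows = textscan(fid, '%s', 'Delimiter', '\n'); % temporary cell array holding the rows
fclose(fid);
Rows = Rows{1};   % textscan wraps its output in a 1x1 cell; unwrap it

% look for the lines with a Euro value:
value = strfind(Rows, 'Euro');
Idx = find(~cellfun('isempty', value));

% on each matching row: skip the series key, read the number, skip the description
Columns = cellfun(@(x) textscan(x, '%*s %f %*[^\n]'), Rows(Idx), 'UniformOutput', false);
Values = cellfun(@(c) c{1}, Columns);   % numeric column vector, one value per matching row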

The indices of all rows that contain an actual Euro value are stored in Idx.

score 0 · Accepted Answer

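You may want to use textscan this way.

Each line is parsed with the regular delimiters (tab, space). In the format, %*s uses the asterisk to skip the first element (YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA0), then %f reads the value of interest, and finally %*[^\n] skips the rest of the line.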

fid = fopen(filename);                                
C = textscan(fid, '%*s%f%*[^\n]', 'HeaderLines', 6); 
fclose(fid);

values   = C{1};
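
Because textscan also accepts a character vector in place of a file identifier, the format can be tried on a couple of rows directly. This is a minimal sketch: the two rows are copied from the question with shortened descriptions, tabs are assumed as separators, and sample is a name made up for the illustration.

```matlab
% Two data rows from the question (descriptions shortened, tab-separated by assumption).
sample = sprintf(['YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA0\t2.03555\tEuro area - Beta 0\n' ...
                  'YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA1\t-2.009068\tEuro area - Beta 1']);

% skip the series key, read the number, skip the rest of the line
C = textscan(sample, '%*s%f%*[^\n]');
values = C{1};      % numeric column vector, one entry per row
```

No 'HeaderLines' option is needed here because the sample contains only data rows.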
