2

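I would like to know how to read data from a complex csv file that contains strings, doubles, characters, and so on.

For example, could you provide a working command that extracts the numeric values from this csv file?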

For example:

yield curve data 2013-10-04     
Yields in percentages per annum.        


Parameters - AAA-rated bonds        
Series key   Parameters  Description
YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA0  2.03555 Euro area (changing composition) - Government bond, nominal, all issuers whose rating is triple A - Svensson model - continuous compounding - yield error minimisation - Yield curve parameters, Beta 0 - Euro, provided by ECB
YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA1  -2.009068   Euro area (changing composition) - Government bond, nominal, all issuers whose rating is triple A - Svensson model - continuous compounding - yield error minimisation - Yield curve parameters, Beta 1 - Euro, provided by ECB
YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA2  24.54184    Euro area (changing composition) - Government bond, nominal, all issuers whose rating is triple A - Svensson model - continuous compounding - yield error minimisation - Yield curve parameters, Beta 2 - Euro, provided by ECB
YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA3  -21.80556   Euro area (changing composition) - Government bond, nominal, all issuers whose rating is triple A - Svensson model - continuous compounding - yield error minimisation - Yield curve parameters, Beta 3 - Euro, provided by ECB
YC.B.U2.EUR.4F.G_N_A.SV_C_YM.TAU1   5.351378    Euro area (changing composition) - Government bond, nominal, all issuers whose rating is triple A - Svensson model - continuous compounding - yield error minimisation - Yield curve parameters, Tau 1 - Euro, provided by ECB
YC.B.U2.EUR.4F.G_N_A.SV_C_YM.TAU2   4.321162    Euro area (changing composition) - Government bond, nominal, all issuers whose rating is triple A - Svensson model - continuous compounding - yield error minimisation - Yield curve parameters, Tau 2 - Euro, provided by ECB

This is part of the information in the file. I tried csvread('yc_latest.csv', 6, 1, [6,1,6,1]) to get the value 2.03555, but it gave me the following error:

   Error using dlmread (line 139)
    Mismatch between file and format string.
    Trouble reading number from file (row 1u, field 3u) ==> "Euro area (changing composition) -
    Government bond, nominal, all issuers whose rating is triple A - Svensson model - continuous
    compounding - yield error minimisation - Yield curve parameters, Beta 0

    Error in csvread (line 50)
        m=dlmread(filename, ',', r, c, rng);

4 Answers

5

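I strongly recommend using the "Import Data" feature in MATLAB (it is on the "HOME" toolbar).

Note in particular that it can also generate code for you, so you can automate the import in the future.

answered 2013-10-08T08:25:44.880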
2

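This is a very hacky solution. Unfortunately, MATLAB is quite bad at reading csv files, which makes this kind of hack an unfortunate necessity. On the bright side, you will probably only need to write this kind of code once.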

fid = fopen('yc_latest.csv');   %// open the file

%// parse as csv, skipping the first six lines
contents = textscan(fid, '%s %f %[^\n]', 'HeaderLines', 6); 

%// unpack the fields and give them meaningful names
[seriesKey, parameters, description]   = contents{:};

fclose(fid);                    %// don't forget this!
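
As a rough sketch of how the unpacked columns might then be used to pull out a single value such as 2.03555: the snippet below parses a small in-memory sample instead of the real file, so the `sample` string (with shortened descriptions and assumed tab separators) and the `isBeta0` and `beta0` names are illustrative assumptions, not part of the original data or code.

```matlab
% Hypothetical in-memory sample mimicking two data rows of yc_latest.csv
% (descriptions shortened; tab separators are an assumption)
sample = sprintf(['YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA0\t2.03555\tEuro area - Beta 0\n', ...
                  'YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA1\t-2.009068\tEuro area - Beta 1\n']);

% same format string as above, applied to the sample text instead of a file
contents = textscan(sample, '%s %f %[^\n]');
[seriesKey, parameters, description] = contents{:};

% select the numeric value whose series key matches exactly
isBeta0 = strcmp(seriesKey, 'YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA0');
beta0 = parameters(isBeta0);
```

Looking the value up by its key rather than by row position should keep working if the order of the rows in the file ever changes.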
answered 2013-10-07T15:03:55.413
0

An alternative to Chris's solution:

fid = fopen('yc_latest.csv');
Rows = textscan(fid, '%s', 'Delimiter', '\n'); % temporary cell array holding the lines of the file
fclose(fid);
Rows = Rows{1}; % textscan returns a 1x1 cell; unwrap it to get the cell array of lines

% look for the lines that contain 'Euro':
value = strfind(Rows, 'Euro');
Idx = find(~cellfun('isempty', value));

% split each matching line into series key, numeric value, and description
Columns = cellfun(@(x) textscan(x, '%s %f %[^\n]'), Rows(Idx), 'UniformOutput', false);

The indices of all rows that contain an actual Euro value are stored in Idx.

answered 2013-10-07T16:05:57.420
0

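You may want to use textscan this way.

Each line is parsed with the usual delimiters (tab, space). The format uses %*s to skip the first element (YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA0), where the asterisk means "skip", then %f to read the value of interest, and finally %*[^\n] to skip the rest of the line.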

fid = fopen(filename);                                
C = textscan(fid, '%*s%f%*[^\n]', 'HeaderLines', 6); 
fclose(fid);

values   = C{1};
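
To try the format string without the real file, here is a minimal self-contained sketch. It writes a tiny made-up stand-in file to a temporary location, so the file contents, the shortened keys, and the header count of 2 are all assumptions for illustration; the real file needs 'HeaderLines', 6 as above.

```matlab
% Write a small, made-up stand-in for yc_latest.csv to a temporary file
tmpFile = [tempname '.csv'];
fid = fopen(tmpFile, 'w');
fprintf(fid, 'yield curve data\n');
fprintf(fid, 'Series key\tParameters\tDescription\n');
fprintf(fid, 'KEY.BETA0\t2.03555\tGovernment bond, nominal - Beta 0\n');
fprintf(fid, 'KEY.BETA1\t-2.009068\tGovernment bond, nominal - Beta 1\n');
fclose(fid);

% Skip the two header lines, then keep only the numeric column
fid = fopen(tmpFile);
C = textscan(fid, '%*s%f%*[^\n]', 'HeaderLines', 2);
fclose(fid);
delete(tmpFile);   % clean up the temporary file

values = C{1};     % column vector with one entry per data row
```

Because the description is skipped rather than parsed, the commas inside it (which tripped up csvread) never reach the numeric conversion.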
answered 2013-10-08T07:37:16.320