Data

Assume the following data format (the first row is a header row; 500+ rows):

1, "<LastName> ,<Title>. <FirstName>", <Gender>, 99.9

My code

I tried this (ignore it: see the edit below):

[flag, name, gender, age] = textread('file.csv', '%d %q %s %f', 'headerlines', 1);

Error

...and received the following error message:

error: textread: A(I): index out of bounds; value 1 out of bound 0
error: called from: 
error:   C:\Program Files\Octave\Octave3.6.2_gcc4.6.2\share\octave\3.6.2\m\io\textread.m at line 75, column 3

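Questions:

  • Given the text qualifier (and the comma embedded in the "name" string), is my format string incorrect?
  • Am I even using the right approach to load a CSV into MATLAB\Octave?

Edit

I had forgotten the delimiter (with it added, the call still fails, but the error message now points to a different line, in strread.m):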

[flag, name, gender, age] = textread('file.csv', '%d %q %s %f', 'headerlines', 1, 'delimiter', ',');

1 Answer

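This is what I ended up doing. Note that it splits the text-qualified name string into two separate fields, so any text-qualified field that contains the field delimiter produces an extra output column (I would still like to know why the %q format does not work for that field; perhaps because of the spaces?):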

% Begin CSV Import ============================================================================

    % strrep is used to strip the text qualifier out of each row. This is wrapped around the
    % call to textread, which brings the comma delimited data in row-by-row, and skips the 1st row,
    % which holds column field names.
    tic;
    data = strrep(
                    textread(
                                'file.csv'          % File name within current working directory
                                ,'%s'               % Each row is a single string
                                ,'delimiter', '\n'  % Each new row is delimited by the newline character
                                ,'headerlines', 1   % Skip importing the first n rows
                            )
                    ,'"'
                    ,''
                );

    for i = 1:length(data)
        delimpos = strfind(data{i}, ',');   % strfind replaces the deprecated findstr

        start = 1;
        for j = 1:length(delimpos) + 1,

            if j < length(delimpos) + 1,
                csvfile{i,j} = data{i}(start:delimpos(j) - 1);
                start = delimpos(j) + 1;
            else
                csvfile{i,j} = data{i}(start:end);
            end

        end
    end

    % Return summary information to user
    printf('\nCSV load completed in -> %f seconds\nm rows returned = %d\nn columns = %d\n', toc, size(csvfile)(1), size(csvfile)(2));
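
The extra column described above appears because the loop splits on every comma, including the one inside the quoted name. One possible workaround, offered as a minimal sketch rather than a tested replacement, is to treat a comma as a delimiter only when it lies outside double quotes. The helper name split_csv_line is hypothetical, and the sketch assumes the quotes on each line are balanced; it also strips every double quote, so escaped quotes ("") inside a field are not preserved.

```octave
1;  % Octave convention: marks this file as a script so the function below can be defined inline

% Hypothetical helper: split one CSV line on the commas that lie outside double quotes.
function fields = split_csv_line(line)
  % A character is inside a quoted field when an odd number of quotes precedes or includes it
  in_quotes = mod(cumsum(line == '"'), 2) == 1;
  % Only commas outside quotes separate fields
  delimpos = find(line == ',' & ~in_quotes);
  starts = [1, delimpos + 1];
  stops  = [delimpos - 1, length(line)];
  fields = cell(1, numel(starts));
  for k = 1:numel(starts)
    % Drop the text qualifier and surrounding whitespace from each field
    fields{k} = strtrim(strrep(line(starts(k):stops(k)), '"', ''));
  end
end

% Usage on the template row from the question (placeholders left as-is)
fields = split_csv_line('1, "<LastName> ,<Title>. <FirstName>", <Gender>, 99.9');
printf('%d fields; name = %s\n', numel(fields), fields{2});
```

With this approach the textread call would keep the quotes (no outer strrep), and each row could be stored with something like csvfile(i, 1:numel(fields)) = fields; inside the outer loop.
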
answered 2013-05-09T06:44:27.073