0

我在阅读文件时遇到了麻烦,基本上,我想以某种方式摆脱不必要的文本,只打印出一个只包含数字的矩阵。

1 1 -1 1 1 -1 -1 1 1 1 -1 1

1 -1 1 -1 -1 1 1 1 1 -1 1 1

sgfgdf

1 1 1 -1 1 -1 1 -1 -1 -1 -1 1

rtydsfdsfds

1 -1 1 -1 -1 -1 1 1 -1 -1 -1 1

1 1 -1 1 1 -1 -1 -1 1 -1 1 -1

1 -1 1 1 1 1 1 -1 -1 -1 1 -1

到目前为止我尝试的是:

d = fopen('transmission_data.txt')

R = textscan(d, '%f %f', 'headerLines', 3:5)

fclose(d)

但这不起作用,因为我只需要为 textscan 输入一个数字,例如“3”,这将摆脱前 3 行,但我想特别摆脱第三和第五行。也许还有其他方法可以读取数据?帮助将不胜感激:)

*请注意,第一行文本和第二行数字之间有一个空行

4

2 回答 2

0

这是一种方法:

  1. 使用导入向导
  2. 根据需要选择单元阵列或矩阵
  3. 选择删除具有无效值的行
  4. 点击导入

如果您必须自动化它,您可以让 importwizard 为您生成代码。


这是导入向导生成的(相当冗长的)代码,当我在名为 test.txt 的文件中对您的数据使用它时

%% Import data from text file.
% Script for importing data from the following text file:
%
%    \\invol-vs-fp1\Users$\d.jaheruddin\MATLAB\test.txt
%
% To extend the code to different selected data or a different text file,
% generate a function instead of a script.

% Auto-generated by MATLAB on 2013/10/08 14:39:30

%% Initialize variables.
filename = 'test.txt';
delimiter = ' ';

%% Read columns of data as strings:
% For more information, see the TEXTSCAN documentation.
formatSpec = '%s%s%s%s%s%s%s%s%s%s%s%s%[^\n\r]';

%% Open the text file.
fileID = fopen(filename,'r');

%% Read columns of data according to format string.
% This call is based on the structure of the file used to generate this
% code. If an error occurs for a different file, try regenerating the code
% from the Import Tool.
dataArray = textscan(fileID, formatSpec, 'Delimiter', delimiter, 'MultipleDelimsAsOne', true,  'ReturnOnError', false);

%% Close the text file.
fclose(fileID);

%% Convert the contents of columns containing numeric strings to numbers.
% Replace non-numeric strings with NaN.
raw = [dataArray{:,1:end-1}];
numericData = NaN(size(dataArray{1},1),size(dataArray,2));

for col=[1,2,3,4,5,6,7,8,9,10,11,12]
    % Converts strings in the input cell array to numbers. Replaced non-numeric
    % strings with NaN.
    rawData = dataArray{col};
    for row=1:size(rawData, 1);
        % Create a regular expression to detect and remove non-numeric prefixes and
        % suffixes.
        regexstr = '(?<prefix>.*?)(?<numbers>([-]*(\d+[\,]*)+[\.]{0,1}\d*[eEdD]{0,1}[-+]*\d*[i]{0,1})|([-]*(\d+[\,]*)*[\.]{1,1}\d+[eEdD]{0,1}[-+]*\d*[i]{0,1}))(?<suffix>.*)';
        try
            result = regexp(rawData{row}, regexstr, 'names');
            numbers = result.numbers;

            % Detected commas in non-thousand locations.
            invalidThousandsSeparator = false;
            if any(numbers==',');
                thousandsRegExp = '^\d+?(\,\d{3})*\.{0,1}\d*$';
                if isempty(regexp(thousandsRegExp, ',', 'once'));
                    numbers = NaN;
                    invalidThousandsSeparator = true;
                end
            end
            % Convert numeric strings to numbers.
            if ~invalidThousandsSeparator;
                numbers = textscan(strrep(numbers, ',', ''), '%f');
                numericData(row, col) = numbers{1};
                raw{row, col} = numbers{1};
            end
        catch me
        end
    end
end


%% Exclude rows with non-numeric cells
J = ~all(cellfun(@(x) (isnumeric(x) || islogical(x)) && ~isnan(x),raw),2); % Find rows with non-numeric cells
raw(J,:) = [];

%% Create output variable
test = cell2mat(raw);
%% Clear temporary variables
clearvars filename delimiter formatSpec fileID dataArray ans raw numericData col rawData row regexstr result numbers invalidThousandsSeparator thousandsRegExp me J;
于 2013-10-08T11:49:45.640 回答
0

通过 读取文件fileread,将其拆分regexp并要求textscan查找所有数字:

C = regexp(fileread('transmission_data.txt'), '(\n|\r)*', 'split');
C = C(~cellfun('isempty', C));
D = cellfun(@(c) textscan(c, '%f'), C);
R = [D{:}].';

这里重要的一点是,当textscan遇到不包含数字的行时,它会返回一个空矩阵,因此当您连接结果向量时,您只会得到来自非字符串行的向量。对于您的示例,此代码返回

>> R
R =
     1     1    -1     1     1    -1    -1     1     1     1    -1     1
     1    -1     1    -1    -1     1     1     1     1    -1     1     1
     1     1     1    -1     1    -1     1    -1    -1    -1    -1     1
     1    -1     1    -1    -1    -1     1     1    -1    -1    -1     1
     1     1    -1     1     1    -1    -1    -1     1    -1     1    -1
     1    -1     1     1     1     1     1    -1    -1    -1     1    -1
于 2013-10-08T12:25:22.027 回答