I am trying to read a file that looks like this:

Data Sampling Rate: 256 Hz
*************************

Channels in EDF Files:
**********************
Channel 1: FP1-F7
Channel 2: F7-T7
Channel 3: T7-P7
Channel 4: P7-O1

File Name: chb01_02.edf

File Start Time: 12:42:57

File End Time: 13:42:57 

Number of Seizures in File: 0

File Name: chb01_03.edf

File Start Time: 13:43:04

File End Time: 14:43:04

Number of Seizures in File: 1

Seizure Start Time: 2996 seconds

Seizure End Time: 3036 seconds

So far, I have this code:

fid1= fopen('chb01-summary.txt')
data=struct('id',{},'stime',{},'etime',{},'seizenum',{},'sseize',{},'eseize',{});
if fid1 ==-1
    error('File cannot be opened ')
end
tline= fgetl(fid1);
while ischar(tline)
    i=1;
    disp(tline);
end

I want to use regexp to find the expressions, so I did:

line1 = '(.*\d{2} (\.edf)' 
data{1} = regexp(tline, line1);
tline=fgetl(fid1);
time = '^Time: .*\d{2]}: \d{2} :\d{2}' ;
data{2}= regexp(tline,time);
tline=getl(fid1);
seizure = '^File: .*\d';
data{4}= regexp(tline,seizure);
if data{4}>0
    stime = '^Time: .*\d{5}'; 
    tline=getl(fid1);
    data{5}= regexp(tline,seizure);
    tline= getl(fid1);
    data{6}= regexp(tline,seizure);
end

I also tried using a loop to find the line where the file name starts:

for (firstline<1) || (firstline>1 )
    firstline= strfind(tline, 'File Name') 
    tline=fgetl(fid1);
end 

Now I'm stuck.

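Assuming I am on the line that holds the information, how do I store that information using regexp? After running the code once, I got an empty array for data...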

Thanks in advance.

1 Answer

I find it easiest to read the lines into a cell array first using textscan:

%// Read lines as strings
fid = fopen('input.txt', 'r');
C = textscan(fid, '%s', 'Delimiter', '\n');
fclose(fid);

and then apply regexp on it to do the rest of the manipulations:

%// Parse field names and values
C = regexp(C{:}, '^\s*([^:]+)\s*:\s*(.+)\s*', 'tokens');
C = [C{:}];                          %// Flatten the cell array
C = reshape([C{:}], 2, []);          %// Reshape into name-value pairs
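To see what the pattern does, here is a small sketch on single lines taken from the sample file. The variable names pat, tok, name, value and noMatch are only for illustration and are not part of the code above. Because the name group is `[^:]+`, the split happens at the first colon, so the colons inside a time value stay in the value:

```matlab
%// Same pattern as above, applied to one line at a time
pat = '^\s*([^:]+)\s*:\s*(.+)\s*';

%// A "name: value" line yields one token pair
tok   = regexp('File Start Time: 12:42:57', pat, 'tokens');
name  = tok{1}{1};    %// 'File Start Time'
value = tok{1}{2};    %// '12:42:57'

%// A line without a "name: value" shape yields no tokens at all
noMatch = isempty(regexp('*************************', pat, 'tokens'));  %// true
```

Lines that produce no tokens simply drop out when the cell array is flattened, which is why the separator rows and blank lines do not show up in the result.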

Now you have a cell array C of field names and their corresponding (string) values, and all you have to do is pass it to struct with the correct syntax (a comma-separated list in this case). Note that the field names contain spaces, which are not allowed in struct field names, so this needs to be taken care of first (e.g., by replacing them with underscores):

C(1, :) = strrep(C(1, :), ' ', '_'); %// Replace spaces with underscores
data = struct(C{:});

Here's what I get for your input file:

data =

            Data_Sampling_Rate: '256 Hz'
                     Channel_1: 'FP1-F7'
                     Channel_2: 'F7-T7'
                     Channel_3: 'T7-P7'
                     Channel_4: 'P7-O1'
                     File_Name: 'chb01_03.edf'
               File_Start_Time: '13:43:04'
                 File_End_Time: '14:43:04'
    Number_of_Seizures_in_File: '1'
            Seizure_Start_Time: '2996 seconds'
              Seizure_End_Time: '3036 seconds'

Of course, it is possible to prettify it even more by converting all relevant numbers to numerical values, grouping the 'channel' fields together and such, but I'll leave this to you. Good luck!
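As one possible way to do the numeric conversion mentioned above, here is a minimal sketch. It rebuilds only a few fields of the struct by hand so that it runs on its own, and it assumes each value starts with the number followed by an optional unit, as in the sample file:

```matlab
%// Stand-in for the parsed struct (only the fields converted here)
data = struct('Data_Sampling_Rate',         '256 Hz', ...
              'Number_of_Seizures_in_File', '1', ...
              'Seizure_Start_Time',         '2996 seconds', ...
              'Seizure_End_Time',           '3036 seconds');

%// sscanf reads the leading number and stops at the unit text
data.Data_Sampling_Rate = sscanf(data.Data_Sampling_Rate, '%f');
data.Seizure_Start_Time = sscanf(data.Seizure_Start_Time, '%f');
data.Seizure_End_Time   = sscanf(data.Seizure_End_Time,   '%f');

%// A plain numeric string can go through str2double
data.Number_of_Seizures_in_File = str2double(data.Number_of_Seizures_in_File);

%// Seizure duration in seconds, now that both times are numeric
duration = data.Seizure_End_Time - data.Seizure_Start_Time;
```

The start and end times of the file ('13:43:04' and so on) are left as strings here, since how to represent them depends on what you want to do with them.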

Answered 2013-07-23T19:18:03.070