I'll start by saying that I'm a Matlab newbie, so I apologise if this has been asked before; I haven't been able to find the answer I need.

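I'm writing code to process some large rainfall radar files (5-20 GB). I've arranged the code to read the data in 24-hour blocks and then remove some unnecessary points, as follows: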

clear all; close all; clc;

fileID=fopen('Selkirk_15min_2008.csv');

% Open relevant Hyrad Rainfall Radar Output File.

C_Header=textscan(fileID,'%s %s %s %s %s %s %s %s',12,'delimiter',',');

% Read the file's header section - first 12 rows in standard output file.

C_GridDataCheck=textscan(fileID,'%s %s %d',1230,'delimiter',',');

% Read the data coverage section of the file - number of rows eg 1230 is
% equal to the number of grid points used.

C_DataHeader=textscan(fileID,'%s %s %s %s %s %s %s %s',1,'delimiter',',');

% Read the column headers of the main data array.

formatSpec='%s %s %s %s %s %s %s %s';
N=118080;
C_Read=textscan(fileID,formatSpec,N,'delimiter',',');

% Load the first 118,080 rows of data. 1230 rows = one 15-minute timestep
% across the whole catchment; 1230 x 4 = 4,920 rows per hour;
% 4,920 x 24 = 118,080. Hence this sample size is equal to one 24-hour period.

C_Day1=horzcat(C_Read{1,1},C_Read{1,2},C_Read{1,3},C_Read{1,4},C_Read{1,5},C_Read{1,6},C_Read{1,7},C_Read{1,8});

% Combine data arrays created by textscan into one matrix

load('SelkirkNRP.mat')

% % Load list of non-relevant grid numbers for use.

C_Day1(any(ismember(C_Day1(:,1),SelkirkNRP),2),:)=[];

% % Check first column against list of non-relevant grid points and remove
% % all non-relevant rows

disp('24 Hour period loaded and non-relevant grid points removed')

% ------------------SECOND 24 HOUR PERIOD--------------------------------

C_Read=textscan(fileID,formatSpec,N,'delimiter',',');
% NB - 2000 Days between March 1st, 2008 and August 22nd, 2013
% No of Obs Per day = 118,080
% No of Obs in Dataset = 236,160,000 (rows - each row is 8 cells)
% Load the next 118,080 lines of data 

C_Day2=horzcat(C_Read{1,1},C_Read{1,2},C_Read{1,3},C_Read{1,4},C_Read{1,5},C_Read{1,6},C_Read{1,7},C_Read{1,8});

%Combine data arrays created by textscan into one matrix

C_Day2(any(ismember(C_Day2(:,1),SelkirkNRP),2),:)=[];

% % Check first column against list of non-relevant grid points and remove
% % all non-relevant rows

disp('24 Hour period loaded and non-relevant grid points removed')

%------------------THIRD 24 HOUR PERIOD-----------------------------------

C_Read=textscan(fileID,formatSpec,N,'delimiter',',');

% Load the next 118,080 lines of data 

C_Day3=horzcat(C_Read{1,1},C_Read{1,2},C_Read{1,3},C_Read{1,4},C_Read{1,5},C_Read{1,6},C_Read{1,7},C_Read{1,8});

% Combine data arrays created by textscan into one matrix

C_Day3(any(ismember(C_Day3(:,1),SelkirkNRP),2),:)=[];

% % Check first column against list of non-relevant grid points and remove
% % all non-relevant rows

disp('24 Hour period loaded and non-relevant grid points removed')

I hope this code makes clear what I'm trying to achieve. If not, please feel free to ask.

Essentially, what I need is to make this part of the code:

% ------------------SECOND 24 HOUR PERIOD--------------------------------

C_Read=textscan(fileID,formatSpec,N,'delimiter',',');
% NB - 2000 Days between March 1st, 2008 and August 22nd, 2013
% No of Obs Per day = 118,080
% No of Obs in Dataset = 236,160,000 (rows - each row is 8 cells)
% Load the next 118,080 lines of data 

C_Day2=horzcat(C_Read{1,1},C_Read{1,2},C_Read{1,3},C_Read{1,4},C_Read{1,5},C_Read{1,6},C_Read{1,7},C_Read{1,8});

%Combine data arrays created by textscan into one matrix

C_Day2(any(ismember(C_Day2(:,1),SelkirkNRP),2),:)=[];

% % Check first column against list of non-relevant grid points and remove
% % all non-relevant rows

disp('24 Hour period loaded and non-relevant grid points removed')

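repeat until the whole file has been processed. I think that should be about 2000 iterations. Also, I know this code is currently rough and ready and not very elegant, so any comments that would be useful to a newbie would be much appreciated.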
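One way the repetition described above might be sketched is a `while ~feof(fileID)` loop that keeps calling `textscan` for `N` rows at a time, so the number of days does not have to be known in advance. This is only a sketch under stated assumptions: the tiny three-column file, `demoFile`, `nonRelevant` (standing in for `SelkirkNRP`, assumed here to be a cell array of strings), `blockCount` and `rowsKept` are all hypothetical and exist only so the example runs on its own.

```matlab
% Build a tiny stand-in file so the sketch runs on its own (hypothetical data).
demoFile = [tempname '.csv'];
fid = fopen(demoFile, 'w');
for k = 1:10
    fprintf(fid, 'G%d,%d,%d\n', mod(k-1, 5) + 1, k, 10*k);
end
fclose(fid);

formatSpec = '%s %s %s';       % every column read as text, as in the question
N = 4;                         % rows per block (118080 for the real file)
nonRelevant = {'G2', 'G4'};    % hypothetical stand-in for the SelkirkNRP list

fileID = fopen(demoFile);
blockCount = 0;
rowsKept = 0;
while ~feof(fileID)
    % Read the next block of up to N rows.
    C_Read = textscan(fileID, formatSpec, N, 'delimiter', ',');
    if isempty(C_Read{1})
        break                  % nothing left to read
    end
    % Combine the columns into one cell array of strings.
    C_Block = horzcat(C_Read{:});
    % Remove rows whose first column is in the non-relevant list.
    C_Block(ismember(C_Block(:,1), nonRelevant), :) = [];
    blockCount = blockCount + 1;
    rowsKept = rowsKept + size(C_Block, 1);
    fprintf('Block %d: %d rows kept\n', blockCount, size(C_Block, 1));
end
fclose(fileID);
delete(demoFile);
```

For the real file, the header reads from the question would come before the loop, and each filtered block would be saved or summarised inside the loop rather than kept in separately named variables such as `C_Day1`, `C_Day2` and so on.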

Hope you can help.

Best,

Sam

Update:

After some digging, this problem was solved and simplified with a for loop; example code is below.

clear all; close all; clc;

fileID=fopen('Selkirk_15min_2008.csv');

% Open relevant Hyrad Rainfall Radar Output File.

C_Header=textscan(fileID,'%s %s %s %s %s %s %s %s',12,'delimiter',',');

% Read the file's header section - first 12 rows in standard output file.

C_GridDataCheck=textscan(fileID,'%s %s %d',1230,'delimiter',',');

% Read the data coverage section of the file - number of rows eg 1230 is
% equal to the number of grid points used.

C_DataHeader=textscan(fileID,'%s %s %s %s %s %s %s %s',1,'delimiter',',');

% Read the column headers of the main data array.

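formatSpec='%s %s %s %s %s %s %s %s';
% Read every column as text so that horzcat and str2double below operate on
% a single cell array of strings.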
N=118080;

load('SelkirkNRP.mat')

% Load list of non-relevant grid numbers for use.

for i=1:1
% Raise the upper limit (about 2000 for the full dataset) to process more days.

C_Read=textscan(fileID,formatSpec,N,'delimiter',',');

% Load the next 118,080 rows of data. 1230 rows = one 15-minute timestep
% across the whole catchment; 1230 x 4 = 4,920 rows per hour;
% 4,920 x 24 = 118,080. Hence this sample size is equal to one 24-hour period.

C_Data=horzcat(C_Read{1,1},C_Read{1,2},C_Read{1,3},C_Read{1,4},C_Read{1,5},C_Read{1,6},C_Read{1,7},C_Read{1,8});

% Combine data arrays created by textscan into one matrix

C_Data(any(ismember(C_Data(:,1),SelkirkNRP),2),:)=[];

% Check first column against list of non-relevant grid points and remove
% all non-relevant rows

C_DataMatrix=str2double(C_Data);

% Convert Cell Array to Matrix for writing to CSV

csvwrite(['Day ' num2str(i) ' Data'], C_DataMatrix)

%Write to CSV

end

fclose(fileID);

% Close the file once all blocks have been read.
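If some columns are read as numbers with `%f` rather than as text, `textscan` returns numeric vectors for those columns and cell arrays of strings for the `%s` columns, and the two kinds cannot simply be joined with `horzcat`. A minimal sketch of one alternative, assuming a hypothetical three-column layout (grid ID, easting, rainfall), keeps the text column and the numeric columns in separate variables and filters both with the same logical index. The names `demoFile`, `gridID`, `numeric`, `keep` and `outFile`, and the `{'G2'}` list standing in for `SelkirkNRP`, are illustrative only.

```matlab
% Hypothetical three-column sample: grid ID (text), easting, rainfall.
demoFile = [tempname '.csv'];
fid = fopen(demoFile, 'w');
fprintf(fid, 'G1,100,0.5\nG2,200,1.5\nG3,300,2.5\n');
fclose(fid);

% Read the text column with %s and the numeric columns with %f.
fileID = fopen(demoFile);
C_Read = textscan(fileID, '%s %f %f', 'delimiter', ',');
fclose(fileID);
delete(demoFile);

gridID  = C_Read{1};            % cell array of strings, one per row
numeric = [C_Read{2:3}];        % double matrix, one row per data row

% Build one logical index and apply it to both variables.
keep = ~ismember(gridID, {'G2'});   % {'G2'} stands in for SelkirkNRP
gridID  = gridID(keep);
numeric = numeric(keep, :);

% Write the numeric columns to a per-day file with a .csv extension.
outFile = fullfile(tempdir, sprintf('Day_%04d_Data.csv', 1));
csvwrite(outFile, numeric);
```

Because the numeric columns never pass through strings, no `str2double` conversion is needed before writing, and the zero-padded file name keeps the daily files in order when sorted by name.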