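I'll start by saying that I'm a MATLAB novice, so I apologize if this has been asked before; I haven't been able to find the answer I need.
I'm writing code to process some large rainfall radar files (5-20 GB). I've set the code up to read the data in 24-hour blocks and then remove some unnecessary points, as follows: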
clear all; close all; clc;
fileID=fopen('Selkirk_15min_2008.csv');
% Open relevant Hyrad Rainfall Radar Output File.
C_Header=textscan(fileID,'%s %s %s %s %s %s %s %s',12,'delimiter',',');
% Read the file's header section - first 12 rows in standard output file.
C_GridDataCheck=textscan(fileID,'%s %s %d',1230,'delimiter',',');
% Read the data coverage section of the file - number of rows eg 1230 is
% equal to the number of grid points used.
C_DataHeader=textscan(fileID,'%s %s %s %s %s %s %s %s',1,'delimiter',',');
% Read the column headers of the main data array.
formatSpec='%s %s %s %s %s %s %s %s';
N=118080;
C_Read=textscan(fileID,formatSpec,N,'delimiter',',');
% Load the first 118,080 rows of data (1230 rows = one 15-minute timestep
% across the whole catchment; 1230 x 4 = 4,920 rows per hour;
% 4,920 x 24 = 118,080). Hence this sample size is equal to one 24 hour period.
C_Day1=horzcat(C_Read{1,1},C_Read{1,2},C_Read{1,3},C_Read{1,4},C_Read{1,5},C_Read{1,6},C_Read{1,7},C_Read{1,8});
% Combine data arrays created by textscan into one matrix
load('SelkirkNRP.mat')
% % Load list of non-relevant grid numbers for use.
C_Day1(any(ismember(C_Day1(:,1),SelkirkNRP),2),:)=[];
% % Check first column against list of non-relevant grid points and remove
% % all non-relevant rows
disp('24 Hour period loaded and non-relevant grid points removed')
% ------------------SECOND 24 HOUR PERIOD--------------------------------
C_Read=textscan(fileID,formatSpec,N,'delimiter',',');
% NB - 2000 Days between March 1st, 2008 and August 22nd, 2013
% No of Obs Per day = 118,080
% No of Obs in Dataset = 236,160,000 (rows - each row is 8 cells)
% Load the next 118,080 lines of data
C_Day2=horzcat(C_Read{1,1},C_Read{1,2},C_Read{1,3},C_Read{1,4},C_Read{1,5},C_Read{1,6},C_Read{1,7},C_Read{1,8});
%Combine data arrays created by textscan into one matrix
C_Day2(any(ismember(C_Day2(:,1),SelkirkNRP),2),:)=[];
% % Check first column against list of non-relevant grid points and remove
% % all non-relevant rows
disp('24 Hour period loaded and non-relevant grid points removed')
%------------------THIRD 24 HOUR PERIOD-----------------------------------
C_Read=textscan(fileID,formatSpec,N,'delimiter',',');
% Load the next 118,080 lines of data
C_Day3=horzcat(C_Read{1,1},C_Read{1,2},C_Read{1,3},C_Read{1,4},C_Read{1,5},C_Read{1,6},C_Read{1,7},C_Read{1,8});
% Combine data arrays created by textscan into one matrix
C_Day3(any(ismember(C_Day3(:,1),SelkirkNRP),2),:)=[];
% % Check first column against list of non-relevant grid points and remove
% % all non-relevant rows
disp('24 Hour period loaded and non-relevant grid points removed')
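I hope this code makes clear what I'm trying to achieve. If not, please feel free to ask.
Essentially, what I need is to make this section of the code: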
% ------------------SECOND 24 HOUR PERIOD--------------------------------
C_Read=textscan(fileID,formatSpec,N,'delimiter',',');
% NB - 2000 Days between March 1st, 2008 and August 22nd, 2013
% No of Obs Per day = 118,080
% No of Obs in Dataset = 236,160,000 (rows - each row is 8 cells)
% Load the next 118,080 lines of data
C_Day2=horzcat(C_Read{1,1},C_Read{1,2},C_Read{1,3},C_Read{1,4},C_Read{1,5},C_Read{1,6},C_Read{1,7},C_Read{1,8});
%Combine data arrays created by textscan into one matrix
C_Day2(any(ismember(C_Day2(:,1),SelkirkNRP),2),:)=[];
% % Check first column against list of non-relevant grid points and remove
% % all non-relevant rows
disp('24 Hour period loaded and non-relevant grid points removed')
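repeat until the whole file has been processed. I think that should be about 2000 iterations. Also, I know this code is currently very rough and ready and not that elegant, so any comments that would be useful to a novice would be much appreciated.
Hope you can help.
Best,
Sam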
UPDATE:
After some digging, this problem was solved and simplified with a for loop; example code is below.
clear all; close all; clc;
fileID=fopen('Selkirk_15min_2008.csv');
% Open relevant Hyrad Rainfall Radar Output File.
C_Header=textscan(fileID,'%s %s %s %s %s %s %s %s',12,'delimiter',',');
% Read the file's header section - first 12 rows in standard output file.
C_GridDataCheck=textscan(fileID,'%s %s %d',1230,'delimiter',',');
% Read the data coverage section of the file - number of rows eg 1230 is
% equal to the number of grid points used.
C_DataHeader=textscan(fileID,'%s %s %s %s %s %s %s %s',1,'delimiter',',');
% Read the column headers of the main data array.
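formatSpec='%s %s %s %s %s %s %s %s';
% Read every column as text: horzcat cannot join cell-array (%s) columns with
% numeric (%f) columns, and str2double below expects a cell array of strings.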
N=118080;
load('SelkirkNRP.mat')
% Load list of non-relevant grid numbers for use.
for i=1:1
C_Read=textscan(fileID,formatSpec,N,'delimiter',',');
% Load the next 118,080 rows of data (1230 rows = one 15-minute timestep
% across the whole catchment; 1230 x 4 = 4,920 rows per hour;
% 4,920 x 24 = 118,080). Hence this sample size is equal to one 24 hour period.
C_Data=horzcat(C_Read{1,1},C_Read{1,2},C_Read{1,3},C_Read{1,4},C_Read{1,5},C_Read{1,6},C_Read{1,7},C_Read{1,8});
% Combine data arrays created by textscan into one matrix
C_Data(any(ismember(C_Data(:,1),SelkirkNRP),2),:)=[];
% Check first column against list of non-relevant grid points and remove
% all non-relevant rows
C_DataMatrix=str2double(C_Data);
% Convert Cell Array to Matrix for writing to CSV
csvwrite(['Day ' num2str(i) ' Data'], C_DataMatrix)
%Write to CSV
end
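
The loop above runs for a single day (i=1:1). One way to carry on until the file is exhausted, without knowing the number of days in advance, is to loop on feof. The following is only a minimal sketch under some assumptions: the function name processRadarFile, its arguments and the output file naming are hypothetical; every data row is assumed to have exactly 8 comma-separated fields; and skipIDs (for example the contents of SelkirkNRP) is assumed to be a cell array of strings matching the first column. Unlike the code above, it skips the header sections with fgetl rather than textscan, and it writes the rows back out as text with fprintf instead of converting them with str2double.

```matlab
function nDays = processRadarFile(inFile, nGrid, rowsPerDay, skipIDs, outPrefix)
% Hypothetical helper (not from the original post): read a Hyrad-style CSV
% in day-sized blocks until end of file, drop rows whose grid ID (column 1)
% is listed in skipIDs, and write one CSV per day. Returns the number of
% day blocks read.
fileID = fopen(inFile, 'r');
if fileID == -1
    error('Could not open %s', inFile);
end
closeFile = onCleanup(@() fclose(fileID));  % close the input even on error

% Skip the 12 header rows, nGrid coverage rows and 1 column-header row.
for k = 1:(12 + nGrid + 1)
    fgetl(fileID);
end

formatSpec = '%s %s %s %s %s %s %s %s';     % read every column as text
nDays = 0;
while ~feof(fileID)
    C_Read = textscan(fileID, formatSpec, rowsPerDay, 'Delimiter', ',');
    if isempty(C_Read{1})
        break                               % nothing left to read
    end
    C_Data = [C_Read{:}];                   % rows-by-8 cell array of strings
    C_Data(ismember(C_Data(:,1), skipIDs), :) = [];  % drop non-relevant rows
    nDays = nDays + 1;
    if ~isempty(C_Data)
        outID = fopen(sprintf('%s%d.csv', outPrefix, nDays), 'w');
        rows = C_Data.';                    % transpose so rows{:} expands row by row
        fprintf(outID, '%s,%s,%s,%s,%s,%s,%s,%s\n', rows{:});
        fclose(outID);
    end
end
end
```

With the file described above, a call might look like processRadarFile('Selkirk_15min_2008.csv', 1230, 118080, SelkirkNRP, 'Day'), which would write Day1.csv, Day2.csv and so on. Only one day's block is held in memory at a time, which matters for files in the 5-20 GB range.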