I am trying to use mne-python's 'visual_92_categories' dataset, but when I try to filter and extract epochs, I get a memory error! I have 7 GB of RAM. I wonder if anyone can help me. Do Python or Jupyter Notebook have a memory limit? Thanks.

import os.path as op

import mne
from mne import concatenate_raws
from mne.io import read_raw_fif
from mne.datasets import visual_92_categories
from pandas import read_csv

data_path = visual_92_categories.data_path()
# Define stimulus - trigger mapping
fname = op.join(data_path, 'visual_stimuli.csv')
conds = read_csv(fname)
max_trigger = 92
conds = conds[:max_trigger]  
conditions = []
for c in conds.values:
    cond_tags = list(c[:2])
    cond_tags += [('not-' if i == 0 else '') + conds.columns[k]
                  for k, i in enumerate(c[2:], 2)]
    conditions.append('/'.join(map(str, cond_tags)))
print(conditions[24])
event_id = dict(zip(conditions, conds.trigger + 1))
n_runs = 4  # 4 for full data (use less to speed up computations)
fname = op.join(data_path, 'sample_subject_%i_tsss_mc.fif')
raws = [read_raw_fif(fname % block) for block in range(n_runs)]
raw = concatenate_raws(raws)    
events = mne.find_events(raw, min_duration=.002)    
events = events[events[:, 2] <= max_trigger]       
picks = mne.pick_types(raw.info, meg=True)
epochs = mne.Epochs(raw, events=events, event_id=event_id, baseline=None,
                    picks=picks, tmin=-.1, tmax=.500, preload=True)
y = epochs.events[:, 2]           
X1 = epochs.copy().get_data()

1 Answer

Running this code takes more than 7 GB of memory on my machine. The X1 array alone is about 4 GB. Its dtype is float64, though, so if you cannot get more memory, try storing it as float32, which halves the memory consumption. In most cases the loss of precision is acceptable.
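
For a rough sense of the numbers, you can estimate the array size from the epochs object before materializing the data (a minimal sketch, assuming the epochs object from the code above already exists):

import numpy as np

# Estimate the size of the epochs array before calling get_data().
n_epochs = len(epochs.events)
n_channels = len(epochs.ch_names)
n_times = len(epochs.times)
bytes_f64 = n_epochs * n_channels * n_times * np.dtype('float64').itemsize
print('float64: %.1f GB, float32: %.1f GB'
      % (bytes_f64 / 1e9, bytes_f64 / 2 / 1e9))

# Casting at extraction time keeps only the float32 copy around:
X1 = epochs.get_data().astype('float32')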

You can also try processing the data block by block, saving each piece to disk as a numpy array, and, once all blocks are done, loading and concatenating the arrays:

# leaving the initial part intact
import pickle  # needed to save the data
import numpy as np  # needed to concatenate the blocks later

for block in range(n_runs):
    raw = mne.io.read_raw_fif(fname % block)
    events = mne.find_events(raw, min_duration=.002)
    events = events[events[:, 2] <= max_trigger]
    picks = mne.pick_types(raw.info, meg=True)
    try:
        epochs = mne.Epochs(raw, events=events, event_id=event_id,
                            baseline=None, picks=picks,
                            tmin=-.1, tmax=.500, preload=True)
    except ValueError:  # there's no correct data in some blocks, catch the exception
        continue
    y = epochs.events[:, 2].astype('float32')
    X1 = epochs.copy().get_data().astype('float32')
    pickle.dump(y, open('y_block_{}.pkl'.format(block), 'wb'))  # use convenient names
    pickle.dump(X1, open('x_block_{}.pkl'.format(block), 'wb'))

# remove unnecessary objects from memory
del y
del X1
del raw
del epochs

X1 = None  # store the x arrays
y = None  # store the ys
for block in range(n_runs):
    try:
        if X1 is None:
            X1 = pickle.load(open('x_block_{}.pkl'.format(block), 'rb'))
            y = pickle.load(open('y_block_{}.pkl'.format(block), 'rb'))
        else:
            X1 = np.concatenate((X1, pickle.load(open('x_block_{}.pkl'.format(block), 'rb'))))
            y = np.concatenate((y, pickle.load(open('y_block_{}.pkl'.format(block), 'rb'))))
    except FileNotFoundError:  # no file for this block from the previous stage
        pass
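
As a side note (an alternative sketch, not part of the original answer): the block files could also be written with numpy's native np.save instead of pickle. np.load then supports mmap_mode, which maps each block lazily from disk, so only the final concatenated array has to fit in RAM. This reuses the n_runs and block naming from above:

import numpy as np

# In the saving loop, instead of pickle.dump:
#     np.save('x_block_{}.npy'.format(block), X1)
#     np.save('y_block_{}.npy'.format(block), y)

parts = []
for block in range(n_runs):
    try:
        # mmap_mode='r' reads the block lazily from disk instead of
        # loading it into RAM up front.
        parts.append(np.load('x_block_{}.npy'.format(block), mmap_mode='r'))
    except FileNotFoundError:  # block was skipped in the saving stage
        pass
X1 = np.concatenate(parts)  # only the result is held fully in memory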

This code works for me without running out of memory (i.e. < 7 GB), but I am not sure whether this mne code handling all the blocks independently is equivalent to the original. At least it produces an array that is missing roughly 0.5% of the samples; someone with more mne experience than me may be able to fix that.

Answered 2018-09-21T11:59:18.513