我有来自眼球追踪的数据(.edf 文件 - 来自 SR-research 的 Eyelink)。我想对其进行分析并获得各种测量值,例如注视、扫视、持续时间等。是否有现有的软件包来分析眼动追踪数据?谢谢!
3 回答
至少要将 .edf 文件导入 pandas DF,您可以使用 Niklas Wilming 提供的以下软件包:https
://github.com/nwilming/pyedfread/tree/master/pyedfread
这应该已经处理了扫视和注视- 看看自述文件。一旦它们在数据框中,您就可以对其应用任何您想要的分析。
pyeparse似乎是另一个可用于 eyelink 数据分析的库(但目前似乎未维护)。
以下是他们示例的简短摘录:
import numpy as np
import matplotlib.pyplot as plt
import pyeparse as pp
fname = '../pyeparse/tests/data/test_raw.edf'
raw = pp.read_raw(fname)
# visualize initial calibration
raw.plot_calibration(title='5-Point Calibration')
# create heatmap
raw.plot_heatmap(start=3., stop=60.)
编辑:在我发布我的答案后,我发现了一个很好的列表,其中包含许多用于 eyelink edf 数据分析的潜在工具:https ://github.com/davebraze/FDBeye/wiki/Researcher-Contributed-Eye-Tracking-Tools
嘿,这个问题似乎很老,但也许我可以重新激活它,因为我目前面临同样的情况。首先,我建议您将 .edf 转换为 .asc 文件。这样更容易阅读以获得第一印象。为此,存在许多工具,但我使用了 SR-Research Eyelink Developers Kit(此处)。
我不知道您的设置,但 Eyelink 1000 本身可以检测扫视和注视。我在 .asc 文件中的情况如下所示:
SFIX L 10350642
10350642 864.3 542.7 2317.0
...
...
10350962 863.2 540.4 2354.0
EFIX L 10350642 10350962 322 863.1 541.2 2339
SSACC L 10350964
10350964 863.4 539.8 2359.0
...
...
10351004 683.4 511.2 2363.0
ESACC L 10350964 10351004 42 863.4 539.8 683.4 511.2 5.79 221
第一个数字对应于时间戳,第二个和第三个对应于 xy 坐标,最后一个是你的瞳孔直径(ESACC 之后的最后一个数字是什么,我不知道)。
SFIX -> start fixation
EFIX -> end fixation
SSACC -> start saccade
ESACC -> end saccade
您也可以查看 PyGaze,我没有使用它,但是搜索一个工具箱,这个总是弹出。
编辑我在这里 找到了这个工具箱。它看起来很酷,并且适用于示例数据,但遗憾的是不适用于我的
编辑第 2 号 在处理我自己的眼动追踪数据后重新审视这个问题,我想我可能会分享一个编写的函数来处理我的数据:
def eyedata2pandasframe(directory):
'''
This function takes a directory from which it tries to read in ASCII files containing eyetracking data
It returns eye_data: A pandas dataframe containing data from fixations AND saccades fix_data: A pandas dataframe containing only data from fixations
sac_data: pandas dataframe containing only data from saccades
fixation: numpy array containing information about fixation onsets and offsets
saccades: numpy array containing information about saccade onsets and offsets
blinks: numpy array containing information about blink onsets and offsets
trials: numpy array containing information about trial onsets
'''
eye_data= []
fix_data = []
sac_data = []
data_header = {0: 'TimeStamp',1: 'X_Coord',2: 'Y_Coord',3: 'Diameter'}
event_header = {0: 'Start', 1: 'End'}
start_reading = False
in_blink = False
in_saccade = False
fix_timestamps = []
sac_timestamps = []
blink_timestamps = []
trials = []
sample_rate_info = []
sample_rate = 0
# read the file and store, depending on the messages the data
# we have the following structure:
# a header -- every line starts with a '**'
# a bunch of messages containing information about callibration/validation and so on all starting with 'MSG'
# followed by:
# START 10350638 LEFT SAMPLES EVENTS
# PRESCALER 1
# VPRESCALER 1
# PUPIL AREA
# EVENTS GAZE LEFT RATE 500.00 TRACKING CR FILTER 2
# SAMPLES GAZE LEFT RATE 500.00 TRACKING CR FILTER 2
# followed by the actual data:
# normal data --> [TIMESTAMP]\t [X-Coords]\t [Y-Coords]\t [Diameter]
# Start of EVENTS [BLINKS FIXATION SACCADES] --> S[EVENTNAME] [EYE] [TIMESTAMP]
# End of EVENTS --> E[EVENT] [EYE] [TIMESTAMP_START]\t [TIMESTAMP_END]\t [TIME OF EVENT]\t [X-Coords start]\t [Y-Coords start]\t [X_Coords end]\t [Y-Coords end]\t [?]\t [?]
# Trial messages --> MSG timestamp\t TRIAL [TRIALNUMBER]
try:
with open(directory) as f:
csv_reader = csv.reader(f, delimiter ='\t')
for i, row in enumerate (csv_reader):
if any ('RATE' in item for item in row):
sample_rate_info = row
if any('SYNCTIME' in item for item in row): # only start reading after this message
start_reading = True
elif any('SFIX' in item for item in row): pass
#fix_timestamps[0].append (row)
elif any('EFIX' in item for item in row):
fix_timestamps.append ([row[0].split(' ')[4],row[1]])
#fix_timestamps[1].append (row)
elif any('SSACC' in item for item in row):
#sac_timestamps[0].append (row)
in_saccade = True
elif any('ESACC' in item for item in row):
sac_timestamps.append ([row[0].split(' ')[3],row[1]])
in_saccade = False
elif any('SBLINK' in item for item in row): # stop reading here because the blinks contain NaN
# blink_timestamps[0].append (row)
in_blink = True
elif any('EBLINK' in item for item in row): # start reading again. the blink ended
blink_timestamps.append ([row[0].split(' ')[2],row[1]])
in_blink = False
elif any('TRIAL' in item for item in row):
# the first element is 'MSG', we don't need it, then we split the second element to seperate the timestamp and only keep it as an integer
trials.append (int(row[1].split(' ')[0]))
elif start_reading and not in_blink:
eye_data.append(row)
if in_saccade:
sac_data.append(row)
else:
fix_data.append(row)
# drop the last data point, because it is the 'END' message
eye_data.pop(-1)
sac_data.pop(-1)
fix_data.pop(-1)
# convert every item in list into a float, substract the start of the first trial to set the start of the first video to t0=0
# then devide by 1000 to convert from milliseconds to seconds
for row in eye_data:
for i, item in enumerate (row):
row[i] = float (item)
for row in fix_data:
for i, item in enumerate (row):
row[i] = float (item)
for row in sac_data:
for i, item in enumerate (row):
row[i] = float (item)
for row in fix_timestamps:
for i, item in enumerate (row):
row [i] = (float(item)-trials[0])/1000
for row in sac_timestamps:
for i, item in enumerate (row):
row [i] = (float(item)-trials[0])/1000
for row in blink_timestamps:
for i, item in enumerate (row):
row [i] = (float(item)-trials[0])/1000
sample_rate = float (sample_rate_info[4])
# convert into pandas fix_data Frames for a better overview
eye_data = pd.DataFrame(eye_data)
fix_data = pd.DataFrame(fix_data)
sac_data = pd.DataFrame(sac_data)
fix_timestamps = pd.DataFrame(fix_timestamps)
sac_timestamps = pd.DataFrame(sac_timestamps)
trials = np.array(trials)
blink_timestamps = pd.DataFrame(blink_timestamps)
# rename header for an even better overview
eye_data = eye_data.rename(columns=data_header)
fix_data = fix_data.rename(columns=data_header)
sac_data = sac_data.rename(columns=data_header)
fix_timestamps = fix_timestamps.rename(columns=event_header)
sac_timestamps = sac_timestamps.rename(columns=event_header)
blink_timestamps = blink_timestamps.rename(columns=event_header)
# substract the first timestamp of trials to set the start of the first video to t0=0
eye_data.TimeStamp -= trials[0]
fix_data.TimeStamp -= trials[0]
sac_data.TimeStamp -= trials[0]
trials -= trials[0]
trials = trials /1000 # does not work with trials/=1000
# devide TimeStamp to get time in seconds
eye_data.TimeStamp /=1000
fix_data.TimeStamp /=1000
sac_data.TimeStamp /=1000
return eye_data, fix_data, sac_data, fix_timestamps, sac_timestamps, blink_timestamps, trials, sample_rate
except:
print ('Could not read ' + str(directory) + ' properly!!! Returned empty data')
return eye_data, fix_data, sac_data, fix_timestamps, sac_timestamps, blink_timestamps, trials, sample_rate
希望对大家有帮助。您可能需要更改代码的某些部分,例如拆分字符串以获取有关事件开启/偏移的重要信息的索引。或者您不想将时间戳转换为秒,或者不想将第一次试用的开始时间设置为 0。这取决于您。此外,在我的数据中,我们发送了一条消息以了解我们何时开始测量('SYNCTIME'),而我的实验中只有一个条件,所以只有一个'TRIAL'消息
干杯