Qualtrics is a fairly popular survey platform. You can download your survey data as CSV files. There are a couple of quirks about Qualtrics CSV files:

  1. They begin with the BOM character.
  2. They include an extra row of information to explain what the variables are.
  3. They frequently include parentheses and periods in column names.

I've been able to deal with #1 and #2 with the following code:

import pandas as pd
df = pd.read_csv('qualtrics_survey.csv', skiprows=[1], encoding='utf-8-sig')

When I run the following code, I see a list of all columns, including the ones with parentheses and periods.

list(df.columns.values)

There is a column called turk.1. However, I cannot run:

df.turk.1

I'm not sure what the best way to load these files is. I'd be fine with removing all parentheses and replacing periods with dashes or something similar.

1 Answer

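You can simply use the df['col'] notation instead of df.col to select a column. That notation works for any column name, which is why it is actually the preferred one.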
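
To see why df.turk.1 cannot work while df['turk.1'] can, here is a small sketch in plain Python (no pandas needed). Attribute access requires the name to be a valid Python identifier, and turk.1 is not one. The helper is_attribute_accessible and every column name other than turk.1 are made up for illustration.

```python
import keyword

def is_attribute_accessible(name: str) -> bool:
    """Return True if `name` could be written after a dot, as in df.name.

    Hypothetical helper: it only checks Python's syntax rules, not
    whether the name clashes with an existing DataFrame attribute.
    """
    return name.isidentifier() and not keyword.iskeyword(name)

# 'turk.1' comes from the question; the other names are invented examples.
columns = ["turk", "turk.1", "Q1 (text)", "class"]
accessible = {name: is_attribute_accessible(name) for name in columns}
print(accessible)
```

Any name that fails this check can still be selected with the bracket notation, since a string key has no such restriction.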

If you would rather not do that, you can also rename the columns with the rename method after reading the data. You can do this manually:

df = df.rename(columns={'turk.1': 'other_name'})

Or you can pass a function that, for example, replaces all periods with underscores:

df = df.rename(columns=lambda x: x.replace('.', '_'))
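
Since the question also asks about removing parentheses, the same idea can be extended with a named function. This is only a sketch: clean_column_name is a hypothetical helper, and the sample column names other than turk.1 are invented, not taken from a real Qualtrics export.

```python
import re

def clean_column_name(name: str) -> str:
    """Drop parentheses and replace periods with underscores."""
    name = re.sub(r"[()]", "", name)   # remove every '(' and ')'
    return name.replace(".", "_")      # 'turk.1' becomes 'turk_1'

# 'turk.1' is from the question; the rest are illustrative only.
raw_columns = ["turk.1", "Q3(other)", "Q5.2(text)"]
cleaned = [clean_column_name(c) for c in raw_columns]
print(cleaned)
```

Passing the function as df.rename(columns=clean_column_name) should apply it to every column, the same way the lambda does. Note that two different original names could end up identical after cleaning, so it is worth checking the result for duplicates.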
answered 2014-04-04T19:44:29.273