I need help writing a for loop that counts how many times each element appears in a dataset and stores that count as the value in a dictionary created with a dictionary comprehension.

Here is the sample dataset:

salary_data = [
    {'Age': '39', 'Education': 'E - Bachelors', 'Occupation': 'Adm-clerical', 'Relationship': 'Not-in-family', 'Race': 'White', 'Sex': 'Male', 'Target': '<=50K'},
    {'Age': '50', 'Education': 'E - Bachelors', 'Occupation': 'Exec-managerial', 'Relationship': 'Husband', 'Race': 'White', 'Sex': 'Male', 'Target': '<=50K'},
    {'Age': '38', 'Education': 'B - HS Diploma', 'Occupation': 'Handlers-cleaners', 'Relationship': 'Not-in-family', 'Race': 'White', 'Sex': 'Male', 'Target': '<=50K'},
    {'Age': '53', 'Education': 'A - No HS Diploma', 'Occupation': 'Handlers-cleaners', 'Relationship': 'Husband', 'Race': 'Black', 'Sex': 'Male', 'Target': '<=50K'},
    {'Age': '28', 'Education': 'E - Bachelors', 'Occupation': 'Prof-specialty', 'Relationship': 'Wife', 'Race': 'Black', 'Sex': 'Female', 'Target': '<=50K'},
    {'Age': '37', 'Education': 'F - Graduate Degree', 'Occupation': 'Exec-managerial', 'Relationship': 'Wife', 'Race': 'White', 'Sex': 'Female', 'Target': '<=50K'},
    {'Age': '49', 'Education': 'A - No HS Diploma', 'Occupation': 'Other-service', 'Relationship': 'Not-in-family', 'Race': 'Black', 'Sex': 'Female', 'Target': '<=50K'},
    {'Age': '52', 'Education': 'B - HS Diploma', 'Occupation': 'Exec-managerial', 'Relationship': 'Husband', 'Race': 'White', 'Sex': 'Male', 'Target': '>50K'},
    {'Age': '31', 'Education': 'F - Graduate Degree', 'Occupation': 'Prof-specialty', 'Relationship': 'Not-in-family', 'Race': 'White', 'Sex': 'Female', 'Target': '>50K'},
    {'Age': '42', 'Education': 'E - Bachelors', 'Occupation': 'Exec-managerial', 'Relationship': 'Husband', 'Race': 'White', 'Sex': 'Male', 'Target': '>50K'},
]

A list of unique education levels was also given:

unique_education_levels = [
    'A - No HS Diploma',
    'B - HS Diploma',
    'C - Some College',
    'D - Associates',
    'E - Bachelors',
    'F - Graduate Degree',
]

I need to create a dictionary called education_level_frequencies where the keys are the unique education levels and the values are the number of times the education level appears in the dataset.

So far I used a dictionary comprehension to create the dictionary with values of 0.

education_level_frequencies = [{level: 0} for level in unique_education_levels]

I'm trying to use a for loop to iterate through the dataset and add 1 to the matching value in education_level_frequencies, but to no avail.

for entry in salary_data:
    if entry['Education'] == education_level_frequencies:
        education_level_frequencies[entry] += 1

3 Answers

With a for loop, what you probably meant to write is:

for entry in salary_data:
    if entry['Education'] in education_level_frequencies:
        education_level_frequencies[entry['Education']] += 1
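
Note that this loop only works if education_level_frequencies is a single dictionary. The expression in the question, [{level: 0} for level in unique_education_levels], builds a list of one-key dictionaries instead. A minimal sketch of the whole thing, assuming a shortened dataset that keeps only the 'Education' field (the three records and three levels below are a trimmed-down illustration, not the full data):

```python
# Trimmed-down illustration: only the 'Education' field is kept.
salary_data = [
    {'Education': 'E - Bachelors'},
    {'Education': 'B - HS Diploma'},
    {'Education': 'E - Bachelors'},
]
unique_education_levels = ['A - No HS Diploma', 'B - HS Diploma', 'E - Bachelors']

# Braces around the whole expression make this a dict comprehension:
# one dictionary with every level as a key and 0 as the starting count.
education_level_frequencies = {level: 0 for level in unique_education_levels}

for entry in salary_data:
    level = entry['Education']
    # Membership test against the dictionary's keys.
    if level in education_level_frequencies:
        education_level_frequencies[level] += 1

print(education_level_frequencies)
# {'A - No HS Diploma': 0, 'B - HS Diploma': 1, 'E - Bachelors': 2}
```

Starting from zeros this way keeps levels that never occur in the data in the result, with a count of 0.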
Answered 2021-12-02T06:17:36.433

It looks like unique_education_levels is redundant, since the keys of a dictionary must be unique anyway.

You can use collections.Counter or collections.defaultdict:

from collections import Counter, defaultdict

salary_data = [
    {'Age': '39', 'Education': 'E - Bachelors', 'Occupation': 'Adm-clerical', 'Relationship': 'Not-in-family', 'Race': 'White', 'Sex': 'Male', 'Target': '<=50K'},
    {'Age': '50', 'Education': 'E - Bachelors', 'Occupation': 'Exec-managerial', 'Relationship': 'Husband', 'Race': 'White', 'Sex': 'Male', 'Target': '<=50K'},
    {'Age': '38', 'Education': 'B - HS Diploma', 'Occupation': 'Handlers-cleaners', 'Relationship': 'Not-in-family', 'Race': 'White', 'Sex': 'Male', 'Target': '<=50K'},
    {'Age': '53', 'Education': 'A - No HS Diploma', 'Occupation': 'Handlers-cleaners', 'Relationship': 'Husband', 'Race': 'Black', 'Sex': 'Male', 'Target': '<=50K'},
    {'Age': '28', 'Education': 'E - Bachelors', 'Occupation': 'Prof-specialty', 'Relationship': 'Wife', 'Race': 'Black', 'Sex': 'Female', 'Target': '<=50K'},
    {'Age': '37', 'Education': 'F - Graduate Degree', 'Occupation': 'Exec-managerial', 'Relationship': 'Wife', 'Race': 'White', 'Sex': 'Female', 'Target': '<=50K'},
    {'Age': '49', 'Education': 'A - No HS Diploma', 'Occupation': 'Other-service', 'Relationship': 'Not-in-family', 'Race': 'Black', 'Sex': 'Female', 'Target': '<=50K'},
    {'Age': '52', 'Education': 'B - HS Diploma', 'Occupation': 'Exec-managerial', 'Relationship': 'Husband', 'Race': 'White', 'Sex': 'Male', 'Target': '>50K'},
    {'Age': '31', 'Education': 'F - Graduate Degree', 'Occupation': 'Prof-specialty', 'Relationship': 'Not-in-family', 'Race': 'White', 'Sex': 'Female', 'Target': '>50K'},
    {'Age': '42', 'Education': 'E - Bachelors', 'Occupation': 'Exec-managerial', 'Relationship': 'Husband', 'Race': 'White', 'Sex': 'Male', 'Target': '>50K'},
]

education_level_frequencies = Counter() # or defaultdict(int)
for entry in salary_data:
    education_level_frequencies[entry['Education']] += 1
education_level_frequencies = dict(education_level_frequencies)

# Equivalent one liner to above:
# education_level_frequencies = dict(Counter(entry['Education'] for entry in salary_data))

print(education_level_frequencies)

Alternatively, if you only want to use a standard Python dictionary, you can also use the get() method:

education_level_frequencies = {}
for entry in salary_data:
    education_val = entry['Education']
    education_level_frequencies[education_val] = education_level_frequencies.get(
            education_val, 0) + 1

print(education_level_frequencies)

Output:

{'E - Bachelors': 4, 'B - HS Diploma': 2, 'A - No HS Diploma': 2, 'F - Graduate Degree': 2}
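
The output above contains only the levels that actually occur in the data. If the levels with no occurrences should also appear, as the question's description suggests, one possible sketch is to combine the Counter with unique_education_levels. The education_values list below is just the 'Education' column of the sample data, written out for brevity:

```python
from collections import Counter

# The 'Education' values of the ten sample records, in order.
education_values = [
    'E - Bachelors', 'E - Bachelors', 'B - HS Diploma', 'A - No HS Diploma',
    'E - Bachelors', 'F - Graduate Degree', 'A - No HS Diploma',
    'B - HS Diploma', 'F - Graduate Degree', 'E - Bachelors',
]
salary_data = [{'Education': value} for value in education_values]

unique_education_levels = [
    'A - No HS Diploma', 'B - HS Diploma', 'C - Some College',
    'D - Associates', 'E - Bachelors', 'F - Graduate Degree',
]

counts = Counter(entry['Education'] for entry in salary_data)

# Indexing a Counter with a missing key returns 0 instead of raising KeyError,
# so levels that never occur end up with a count of 0.
education_level_frequencies = {level: counts[level] for level in unique_education_levels}

print(education_level_frequencies)
# {'A - No HS Diploma': 2, 'B - HS Diploma': 2, 'C - Some College': 0,
#  'D - Associates': 0, 'E - Bachelors': 4, 'F - Graduate Degree': 2}
```

The keys also come out in the order of unique_education_levels rather than in order of first appearance.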
Answered 2021-12-02T06:24:50.673

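You can do something like this.

from collections import defaultdict

education_level_frequencies = defaultdict(int)
for data in salary_data:
    education = data['Education']
    # int(True) is 1 and int(False) is 0, so only known levels are counted
    education_level_frequencies[education] += int(education in unique_education_levels)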

This gives us the frequency of every education level.
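
One detail of this approach is worth seeing in action: a level that is in the data but not in unique_education_levels still becomes a key (with a count of 0), while a listed level that never occurs gets no key at all. A small sketch, where 'G - Doctorate' is a hypothetical level invented for illustration and not part of the original dataset:

```python
from collections import defaultdict

unique_education_levels = ['A - No HS Diploma', 'E - Bachelors']

# 'G - Doctorate' is a hypothetical value used only to show the behaviour.
salary_data = [
    {'Education': 'E - Bachelors'},
    {'Education': 'G - Doctorate'},
    {'Education': 'E - Bachelors'},
]

education_level_frequencies = defaultdict(int)
for data in salary_data:
    education = data['Education']
    # Adds 1 for a known level and 0 for an unknown one; either way the
    # key is created, because += first reads the default value 0.
    education_level_frequencies[education] += int(education in unique_education_levels)

print(dict(education_level_frequencies))
# {'E - Bachelors': 2, 'G - Doctorate': 0}
```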

Answered 2021-12-02T07:13:08.743