
I'm trying to make a simple filter that takes a DataFrame and drops every row whose genre list doesn't contain the target genre. It'll be easier to explain with the code:

import pandas as pd

test = [
    {"genre": ["RPG", "Shooter"]},
    {"genre": ["RPG"]},
    {"genre": ["Shooter"]},
]

data = pd.DataFrame(test)

fil = data.genre.isin(['RPG'])

I want the filter to return a dataframe with the following elements:

[{"genre":["RPG"]},
{"genre":["RPG", "Shooter"]}]

This is the error I'm getting when I try my code:

SystemError: <built-in method view of numpy.ndarray object at 0x00000180D1DF2760> returned a result with an error set

2 Answers


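The problem is that the elements of genre are lists, so isin does not work: it compares each cell as a whole against the given values (and lists are not hashable) instead of looking inside the list. Use: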

mask = data['genre'].apply(frozenset(['RPG']).issubset)
print(data[mask])

Output

            genre
0  [RPG, Shooter]
1           [RPG]

The expression:

frozenset(['RPG']).issubset

checks that every element of the frozenset is contained in each row's list. From the documentation:

Test whether every element in the set is in other.
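
A quick way to see this outside pandas is to call the bound method directly on plain lists. This is only a small sketch, and `required` and the result names are illustrative:

```python
# issubset accepts any iterable, so a list from a DataFrame cell works as-is.
required = frozenset(["RPG"])

in_both = required.issubset(["RPG", "Shooter"])   # "RPG" is in the list
in_shooter = required.issubset(["Shooter"])       # "RPG" is missing

# With two required genres, a row listing only one of them does not match.
both_in_rpg = frozenset(["RPG", "Shooter"]).issubset(["RPG"])

print(in_both, in_shooter, both_in_rpg)
```

Passing `required.issubset` to `apply` simply runs this check once per row.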

So you could also check for multiple values easily, for example:

mask = data['genre'].apply(frozenset(['RPG', "Shooter"]).issubset)
print(data[mask])

Output

            genre
0  [RPG, Shooter]
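
If you need this in several places, the same idea can be wrapped in a small helper. This is only a sketch: `filter_by_genres` and its parameters are hypothetical names, not part of pandas, and it assumes every cell in the column is a list of strings.

```python
import pandas as pd


def filter_by_genres(df, required, column="genre"):
    """Keep rows whose list in `column` contains every genre in `required`.

    Hypothetical helper; assumes each cell in `column` is a list of strings.
    """
    required = frozenset(required)
    # issubset is called once per row with that row's list as the argument.
    mask = df[column].apply(required.issubset)
    return df[mask]


data = pd.DataFrame(
    [{"genre": ["RPG", "Shooter"]}, {"genre": ["RPG"]}, {"genre": ["Shooter"]}]
)

rpg_rows = filter_by_genres(data, ["RPG"])
both_rows = filter_by_genres(data, ["RPG", "Shooter"])
print(rpg_rows)
print(both_rows)
```

Converting `required` to a frozenset once, outside `apply`, avoids rebuilding the set for every row.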
Answered 2020-12-11T21:19:43.007

You want:

data[data.genre.apply(lambda x: 'RPG' in x)]

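Or, with explode (using groupby(level=0).any(), since any(level=0) was deprecated in pandas 1.3 and removed in 2.0):

data[data.genre.explode().eq('RPG').groupby(level=0).any()]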

Output:

            genre
0  [RPG, Shooter]
1           [RPG]
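
To check that the two approaches agree, here is a self-contained sketch. It assumes the explode variant is written with `groupby(level=0).any()`, which also works on pandas versions where `any` no longer accepts `level`. The variable names are illustrative.

```python
import pandas as pd

data = pd.DataFrame(
    [{"genre": ["RPG", "Shooter"]}, {"genre": ["RPG"]}, {"genre": ["Shooter"]}]
)

# Approach 1: membership test on each row's list.
mask_apply = data["genre"].apply(lambda genres: "RPG" in genres)

# Approach 2: one row per genre (the original index is repeated), compare,
# then collapse back to one boolean per original row.
mask_explode = data["genre"].explode().eq("RPG").groupby(level=0).any()

same = mask_apply.tolist() == mask_explode.tolist()
result = data[mask_explode]
print(same)
print(result)
```

The `apply` version is usually easier to read. The `explode` version keeps the comparison vectorized but builds a longer intermediate Series.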
Answered 2020-12-11T21:19:45.757