I'm trying to apply the Shapiro-Wilk test to my DataFrame, which is split into groups based on two categorical variables:
df.groupby(['category 1', 'category 2']).apply(stats.shapiro)
This raises an error saying that it could not convert string to float. The only non-numeric columns in the DataFrame are the two category columns that I'm using to split it.
How do I fix it?
EDIT:
example data:
cat1  cat2  purchases  sales
A     B     20         25
C     A     30         45
B     B     35         20
A     A     40         50
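To make this reproducible, here is a minimal sketch that builds the example data as a DataFrame. The column names are taken from the table above (my real columns are named differently), and `numeric_cols` is just an illustrative name for the columns I want tested.

```python
import pandas as pd

# Example data from the table above; the real data has more rows per group.
df = pd.DataFrame({
    "cat1": ["A", "C", "B", "A"],
    "cat2": ["B", "A", "B", "A"],
    "purchases": [20, 30, 35, 40],
    "sales": [25, 45, 20, 50],
})

# Columns that should be tested: everything numeric, which excludes
# the two string-valued category columns used for grouping.
numeric_cols = df.select_dtypes(include="number").columns.tolist()
print(numeric_cols)
```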
I want to get the Shapiro-Wilk statistic and a p-value for each of the numeric columns within each group, without having to write out every combination of the two categories by hand.