My problem is as follows:
I am given a number of chi-squared values for the same collection of data sets, each fitted with different models. For example, for 5 collections of points, each fitted with either a single binomial distribution or with both binomial and normal distributions, I would have 10 chi-squared values.
I would like to use machine learning classification to sort the data sets into "models":
e.g. data sets (1, 2, 5 and 7) are best fitted using only binomial distributions, whereas sets (3, 4, 6, 8, 9 and 10) are best fitted using a normal distribution as well.
Notably, the number of degrees of freedom is likely to differ between the two chi-squared distributions and is always known, as is the number of models.
My (probably) naive guess for a solution would be as follows:
1. Randomly distribute the points (10 chi-squared values in this case) into the number of categories (2).
2. Fit each of the categories using the particular chi-squared distributions (in this case with different numbers of degrees of freedom).
3. Move outlying points from one distribution to the next.
4. Repeat steps 2 and 3 until happy with the result.
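For concreteness, here is a rough sketch of how I picture that loop in Python. It rests on guesses of mine rather than on any established method: "outlying" is taken to mean that a point has a higher likelihood under another category's chi-squared distribution, and, since the degrees of freedom are known, the "fit" in step 2 only estimates the fraction of points in each category. The function names, the add-one smoothing and the example values are all made up for illustration.

```python
import math
import random


def chi2_logpdf(x, k):
    """Log-density of a chi-squared distribution with k degrees of freedom at x > 0."""
    half_k = k / 2.0
    return ((half_k - 1.0) * math.log(x) - x / 2.0
            - half_k * math.log(2.0) - math.lgamma(half_k))


def categorize(values, dofs, max_iter=100, seed=0):
    """Assign each chi-squared value to one of len(dofs) categories."""
    rng = random.Random(seed)
    m = len(dofs)
    # Step 1: random initial assignment of points to categories.
    labels = [rng.randrange(m) for _ in values]
    for _ in range(max_iter):
        # Step 2: the degrees of freedom are known, so only the category
        # fractions are estimated. Add-one smoothing (an assumption) keeps
        # an empty category from getting zero weight.
        weights = [(labels.count(j) + 1.0) / (len(values) + m) for j in range(m)]
        # Step 3: move each point to the category under which it is most likely.
        new_labels = [
            max(range(m), key=lambda j: math.log(weights[j]) + chi2_logpdf(x, dofs[j]))
            for x in values
        ]
        # Step 4: stop once no point changes category.
        if new_labels == labels:
            break
        labels = new_labels
    return labels


# Made-up chi-squared values, with 3 and 30 degrees of freedom assumed.
example_values = [2.1, 3.5, 1.8, 4.0, 28.0, 33.5, 25.1, 30.2]
print(categorize(example_values, dofs=(3, 30)))
```

This uses hard assignments; a softer variant would keep a probability per category for each point instead of a single label.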
However, I don't know how I would select the outlying points or, for that matter, whether there is already an algorithm that does this.
I am extremely new to machine learning and fairly new to statistics, so any relevant keywords would be appreciated too.