4

I'm trying to implement an algorithm to get all combinations of k elements out of a set of n elements where the difference between two consecutive combinations are maximized (so kind of reverse Gray codes). In other words, the combinations should be ordered to avoid elements from appearing twice in a row, and so that no element is unnecessarily discriminated.

Ideally, the algorithm would also NOT pre-calculate all combinations and store them into memory, but rather deliver combinations on demand. I have searched extensively for this and found a few detailed answers such as https://stackoverflow.com/a/127856/1226020, but I can't seem to apply this. Also, many of the articles linked in that answer are paid content.

To illustrate what I mean:

From a set of [0, 1, 2, 3, 4], find all combinations of two elements. Using a simple algorithm that tries to increment the right-most element until no longer possible, then moving left, incrementing the previous digit etc, I get the following results:

[0, 1]
[0, 2]
[0, 3]
[0, 4]
[1, 2]
[1, 3]
[1, 4]
[2, 3]
[2, 4]
[3, 4]

I produce this result with the following Java code:

public class CombinationGenerator {
    private final int mNrElements;
    private final int[] mCurrentCombination;

    public CombinationGenerator(int n, int k) {
        mNrElements = n;
        mCurrentCombination = new int[k];

        initElements(0, 0);
        // fake initial state in order not to miss first combination below
        mCurrentCombination[mCurrentCombination.length - 1]--;
    }

    private void initElements(int startPos, int startValue) {
        for (int i = startPos; i < mCurrentCombination.length; i++) {
            mCurrentCombination[i] = i + startValue - startPos;
        }
    }

    public int[] getNextCombination() {
        for (int i = 0; i < mCurrentCombination.length; i++) {
            int pos = mCurrentCombination.length - 1 - i;

            if (mCurrentCombination[pos] < mNrElements - 1 - i) {
                initElements(pos, mCurrentCombination[pos] + 1);
                return mCurrentCombination;
            }
        }

        return null;
    }

    public static void main(String[] args) {
        CombinationGenerator cg = new CombinationGenerator(5, 2);
        int[] c;

        while ((c = cg.getNextCombination()) != null) {
            System.out.println(Arrays.toString(c));
        }
    }

}

This is not what I want, because I want each consecutive combination to be as different as possible from the previous one. Currently, element "1" appears four times in a row, and then never again. For this particular example, one solution would be:

[0, 1]
[2, 3]
[0, 4]
[1, 2]
[3, 4]
[0, 2]
[1, 3]
[2, 4]
[0, 3]
[1, 4]

I have indeed managed to accomplish this result for this particular (7,4) case by applying a sorting algoritm after the combinations are generated, but this does not fulfill my requirement of on-demand combination generation as the entire set of combinations has to be generated at once, then sorted and kept in memory. I'm not sure it works for arbitrary k and n values either. And finally, I'm pretty sure it's not the most efficient way as the sorting algorithm basically loops through the set of combinations trying to find one sharing no elements with the previous combination. I also considered keeping a table of "hit counts" for each element and use that to always get the next combination containing the lowest combined hit count. My somewhat empirical conclusion is that it will be possible to avoid elements from completely appearing in two consecutive combinations if n > 2k. Otherwise, it should at least be possible to avoid elements from appearing more than twice in a row etc.

You could compare this problem to what is achieved for k = 2 using a standard round-robin scheme for football games etc, but I need a solution for arbitrary values of k. We can imagine this to be a tournament of some sort of game where we have n players that are to play against all other players in a set of games, each game holding k players. Players should as far as possible not have to play two games in a row, but should also not have to wait unnecessarily long between two game appearances.

Any pointers on how to solve this either with a reliable sorting algorithm post generation, or - preferably - on-demand, would be awesome!

Note: Let's typically assume that n <= 50, k <= 5

Thanks

4

2 回答 2

2

根据@DavidEisenstat 的建议工作的快速而肮脏的工作代码:

public static void main(String[] args) {
    ArrayList<int[]> all = new ArrayList<int[]>();
    // output is 0 if distance(i, j) != max, and 1 otherwise
    int[][] m = buildGraph(7, 4, all);
    HamiltonianCycle hc = new HamiltonianCycle();
    int path[] = hc.findHamiltonianCycle(m);
    if (path != null) {
        // I have no proof that such a path will always exist
        for (int i : path) {
            System.out.println(Arrays.toString(all.get(i)));
        }
    }
}

上述代码的输出(7,4);距离(长度 - size_of_intersection)始终为 3;尝试使用 4 会导致图表断开:

    [0, 1, 2, 3]
    [0, 4, 5, 6]
    [1, 2, 3, 4]
    [0, 1, 5, 6]
    [0, 2, 3, 4]
    [1, 2, 5, 6]
    [0, 1, 3, 4]
    [0, 2, 5, 6]
    [1, 3, 4, 5]
    [0, 1, 2, 6]
    [0, 3, 4, 5]
    [1, 2, 3, 6]
    [0, 1, 4, 5]
    [0, 2, 3, 6]
    [1, 4, 5, 6]
    [0, 2, 3, 5]
    [1, 2, 4, 6]
    [0, 3, 5, 6]
    [1, 2, 4, 5]
    [0, 3, 4, 6]
    [1, 2, 3, 5]
    [0, 2, 4, 6]
    [1, 3, 5, 6]
    [0, 2, 4, 5]
    [1, 3, 4, 6]
    [0, 1, 2, 5]
    [2, 3, 4, 6]
    [0, 1, 3, 5]
    [2, 4, 5, 6]
    [0, 1, 3, 6]
    [2, 3, 4, 5]
    [0, 1, 4, 6]
    [2, 3, 5, 6]
    [0, 1, 2, 4]
    [3, 4, 5, 6]

缺少代码位:

// uses JHH's code to build sequences, stores it in 'all'
public static int[][] buildGraph(int n, int k, ArrayList<int[]> all) {
    SequenceGenerator sg = new SequenceGenerator(n, k);
    int[] c;
    while ((c = sg.getNextCombination()) != null) {
        all.add(c.clone());         
    }
    int best = Math.min(n-k, k);
    System.out.println("Best is " + best);
    int matrix[][] = new int[all.size()][];
    for (int i=0; i<matrix.length; i++) {
        matrix[i] = new int[all.size()];
        for (int j=0; j<i; j++) {
            int d = distance(all.get(j), all.get(i));
            matrix[i][j] = matrix[j][i] = (d != best)? 0 : 1;
        }           
    }
    return matrix;
}

距离:(根本没有效率,但与汉密尔顿计算的成本相比相形见绌)

public static int distance(int[] a, int[] b) {
        HashSet<Integer> ha = new HashSet<Integer>();
        HashSet<Integer> hb = new HashSet<Integer>();
        for (int i=0; i<a.length; i++) {
                ha.add(a[i]);
                hb.add(b[i]);
        }
        ha.retainAll(hb);
        return a.length - ha.size();
}

为了找到汉密尔顿,我从http://www.sanfoundry.com/java-program-find-hamiltonian-cycle-unweighted-graph/修改了代码:

import java.util.Arrays;

public class HamiltonianCycle {

    private int V, pathCount;
    private int[] path;
    private int[] answer;
    private int[][] graph;

    public int[] findHamiltonianCycle(int[][] g) {
        V = g.length;
        path = new int[V];

        Arrays.fill(path, -1);
        graph = g;
        path[0] = 0;
        pathCount = 1;
        if (solve(0)) {
            return path;
        } else {
            return null;
        }
    }

    public boolean solve(int vertex) {
        if (graph[vertex][0] == 1 && pathCount == V) {
            return true;
        }
        if (pathCount == V) {
            return false;
        }

        for (int v = 0; v < V; v++) {
            if (graph[vertex][v] == 1) {
                path[pathCount++] = v;
                graph[vertex][v] = 0;
                graph[v][vertex] = 0;

                if (!isPresent(v)) {
                    if (solve(v)) {
                        answer = path.clone();
                        return true;
                    }
                }

                graph[vertex][v] = 1;
                graph[v][vertex] = 1;
                path[--pathCount] = -1;
            }
        }
        return false;
    }

    public boolean isPresent(int v) {
        for (int i = 0; i < pathCount - 1; i++) {
            if (path[i] == v) {
                return true;
            }
        }
        return false;
    }
}

请注意:对于大量组合,这将非常慢......

于 2015-03-10T15:36:29.743 回答
1

虽然我非常感谢 @tucuxi 和 @David Eisenstadt 的努力,但我发现哈密顿方法存在一些问题,即它不能解决 n 和 k 的某些值,并且某些元素也被不必要地区分了。

我决定试一试我的问题中列出的想法(命中计数表),它似乎给出了很好的结果。然而,该解决方案还需要不满足按需奖励要求的生成后排序。对于合理的 n 和 k,这应该是可行的。

诚然,我发现我的算法有时似乎更喜欢导致一个元素连续出现的组合,这可能是可以调整的。但截至目前,我的算法可能不如 tucuxi 的(7,4)具体。然而,它确实为每个 n、k 提供了一个解决方案,并且似乎更少区分元素。

我的代码在下面列出。

再次感谢。

public class CombinationGenerator {
    private final int N;
    private final int K;
    private final int[] mCurrentCombination;

    public CombinationGenerator(int n, int k) {
        N = n;
        K = k;
        mCurrentCombination = new int[k];

        setElementSequence(0, 0);
        mCurrentCombination[K - 1]--; // fool the first iteration
    }

    private void setElementSequence(int startPos, int startValue) {
        for (int i = startPos; i < K; i++) {
            mCurrentCombination[i] = i + startValue - startPos;
        }
    }

    public int[] getNextCombination() {
        for (int i = K - 1; i >= 0; i--) {
            if (mCurrentCombination[i] < i + N - K) {
                setElementSequence(i, mCurrentCombination[i] + 1);
                return mCurrentCombination.clone();
            }
        }

        return null;
    }   
}

public class CombinationSorter {
    private final int N;
    private final int K;

    public CombinationSorter(int n, int k) {
        N = n;
        K = k;
    }

    public List<int[]> getSortedCombinations(List<int[]> combinations) {
        List<int[]> sortedCombinations = new LinkedList<int[]>();
        int[] combination = null;
        int[] hitCounts = new int[N]; // how many times each element has been
                                      // used so far

        // Note that this modifies (empties) the input list
        while (!combinations.isEmpty()) {
            int index = findNextCombination(combinations, hitCounts, combination);
            combination = combinations.remove(index);
            sortedCombinations.add(combination);

            addHitCounts(combination, hitCounts);
        }

        return sortedCombinations;
    }

    private int findNextCombination(List<int[]> combinations, int[] hitCounts,
            int[] previousCombination) {
        int lowestHitCount = Integer.MAX_VALUE;
        int foundIndex = 0;

        // From the remaining combinations, find the one with the least used
        // elements
        for (int i = 0; i < combinations.size(); i++) {
            int[] comb = combinations.get(i);
            int hitCount = getHitCount(comb, hitCounts);

            if (hitCount < lowestHitCount) {
                lowestHitCount = hitCount;
                foundIndex = i;
            } else if (hitCount == lowestHitCount
                    && previousCombination != null
                    && getNrCommonElements(comb, previousCombination) < getNrCommonElements(
                            combinations.get(foundIndex), previousCombination)) {
                // prefer this combination if hit count is equal but it's more
                // different from the previous combination in our sorted list
                // than what's been found so far (avoids consecutive element
                // appearances)
                foundIndex = i;
            }
        }

        return foundIndex;
    }

    private int getHitCount(int[] combination, int[] hitCounts) {
        int hitCount = 0;

        for (int i = 0; i < K; i++) {
            hitCount += hitCounts[combination[i]];
        }

        return hitCount;
    }

    private void addHitCounts(int[] combination, int[] hitCounts) {
        for (int i = 0; i < K; i++) {
            hitCounts[combination[i]]++;
        }
    }

    private int getNrCommonElements(int[] c1, int[] c2) {
        int count = 0;

        for (int i = 0; i < K; i++) {
            for (int j = 0; j < K; j++) {
                if (c1[i] == c2[j]) {
                    count++;
                }
            }
        }
        return count;
    }
}

public class Test {
    public static void main(String[] args) {
        final int n = 7;
        final int k = 3;

        CombinationGenerator cg = new CombinationGenerator(n, k);
        List<int[]> combinations = new LinkedList<int[]>();
        int[] nc;

        while ((nc = cg.getNextCombination()) != null) {
            combinations.add(nc);
        }

        for (int[] c : combinations) {
            System.out.println(Arrays.toString(c));
        }

        System.out.println("Sorting...");

        CombinationSorter cs = new CombinationSorter(n, k);
        List<int[]> sortedCombinations = cs.getSortedCombinations(combinations);

        for (int[] sc : sortedCombinations) {
            System.out.println(Arrays.toString(sc));
        }
    }

}

结果(7,4)

[0, 1, 2, 3]
[0, 4, 5, 6]
[1, 2, 3, 4]
[0, 1, 5, 6]
[2, 3, 4, 5]
[0, 1, 2, 6]
[3, 4, 5, 6]
[0, 1, 2, 4]
[0, 3, 5, 6]
[1, 2, 4, 5]
[0, 1, 3, 6]
[2, 4, 5, 6]
[0, 1, 3, 4]
[2, 3, 5, 6]
[0, 1, 4, 5]
[0, 2, 3, 6]
[1, 3, 4, 5]
[0, 2, 4, 6]
[1, 2, 3, 5]
[0, 1, 4, 6]
[0, 2, 3, 5]
[1, 2, 4, 6]
[1, 3, 5, 6]
[0, 2, 3, 4]
[1, 2, 5, 6]
[0, 3, 4, 5]
[1, 2, 3, 6]
[0, 2, 4, 5]
[1, 3, 4, 6]
[0, 2, 5, 6]
[0, 1, 3, 5]
[2, 3, 4, 6]
[1, 4, 5, 6]
[0, 1, 2, 5]
[0, 3, 4, 6]

结果(5,2)

[0, 1]
[2, 3]
[0, 4]
[1, 2]
[3, 4]
[0, 2]
[1, 3]
[2, 4]
[0, 3]
[1, 4]
于 2015-03-11T22:07:05.540 回答