algorithm - 对一串括号进行分区

Question

谁能帮我解决这个问题？我将输入问题，然后给出我的一些想法/替代解决方案。

所以问题几乎是，给定一串这样的括号：

[[]]

我们要为每个括号分配一个组号（组 1 或组 2）。一个有效的分配意味着如果你只看第一组中的括号，它会形成一个有效的、平衡的括号字符串（这几乎是像 [ ] [ [ ] ] 和不像 ]]]][ ] 这样的东西。同样有对第二组也是如此。组不必是连续的。我们想计算将这些括号分成两组的方法。

在 [ [ ] ] 上面的示例字符串中，答案是 6，以下是枚举：(1 = group 1, 2 = group 2)

  [[]]
1.1111
2.2222
3.1221
4.2112
5.1212
6.2121

安排不必包括所有组（如安排 1. 和 2.）

想法

一个明显的蛮力解决方案可以很快地使用多达 32 个括号，它是使用一个 32 位整数来表示哪些括号是单个组的一部分。或者我们可以使用一个数组。运行时间是 O(2^N) （我认为），这太慢了？

从问题的角度来看，我认为给你的原始括号串必须是预先平衡的，否则没有办法选择一个子集，使第 1 组和第 2 组是平衡的。

我还注意到您可以分离组件 - 字符串“[]”有 2 个排列，所以字符串“[][]”有 4 个排列。（您可以找到每个组件中的方式数并将它们相乘）。

不过，我对如何将这些想法融入算法感到困惑。我写了蛮力程序，我检查了字符串“[]”、“[[]]”、“[[[]]]”和“[[[[]]]]”，但我真的没有看到一个模式。

通过将这些字符串插入我的蛮力程序，我得到：

"[]" = 2
"[[]]" = 6
"[[]]" = 20
"[[[[]]]]" = 70

代码：

char buf[1000];
int N;
bool isValid(int mask)
{
    int lv = 0;
    for (int i = 0; i < N; i++)
    {
        if (mask & (1 << i))
        {
            if (buf[i] == '(')
            {
                lv++;
            }
            else
            {
                lv--;
            }
            if (lv<0)
            {
                return false;
            }
        }
    }
    return lv==0;
}

int main() 
{
    scanf("%s", buf);
    N = strlen(buf);
    int ways = 0;
    for (int i = 0; i < (1 << N); i++)
    {
        if (isValid(i) && isValid(~i))
        {
            ways++;
        }
    }
    printf("Number of ways is %d\n", ways);
    return 0;
}

score 3 · Accepted Answer

我在下面的 C++ 中给出了一个 O(n^3) 时间、O(n^2) 空间的动态规划解决方案。 但是这个算法的理由首先需要一些解释。在下文中，我使用“子串”来表示必须连续的元素的有序子集，而“子序列”则表示不必连续的有序子集。

生成我们知道有效的字符串

将字符串的深度定义为[它包含的 s 数减去]s 的数量。

让我们建立一些所有有效（“平衡”）括号字符串必须遵守的规则：

[s 和s 的数量必须相等]。
字符串的任何前缀都不能有负深度，即]s 多于[s。

这些显然是必要的条件——如果一个字符串违反了任一规则，它就不能是有效的。但是为了能够方便地生成我们知道是有效的字符串，我们需要证明这些条件也是充分的：任何遵守这些规则的字符串都必须是有效的。为了解决这个问题，让我们引入一个引理：

引理：如果非空字符串满足条件 (1) 和 (2)，则它必须包含[]为子字符串。

证明：它必须以 a 开头[，否则长度为 1 的前缀将包含]比s 更多的[s 并违反 (2)。因此它必须至少包含一个]，否则会有 i >= 1[和 0 ]，并且有效的字符串必须包含相等数量的每个 (1)。因此，在某个位置 j > 1 处必须第一次出现 a ]，并且其左侧的字符必须是 a [。

假设我们有一个x符合条件 (1) 和 (2) 的非空字符串。根据引理，它必须包含一个[]. 删除这对不会导致违反这些条件中的任何一个，因此结果字符串，如果非空，则必须仍遵守条件 (1) 和 (2)，因此仍必须包含[]某处。因此我们可以继续删除[]s 直到剩下空字符串。

将 a 插入[]到任何位置的有效字符串中都必须产生一个新的有效字符串，因为新的括号对总是相互匹配并且不会干扰任何其他匹配的对。x观察到可以通过以与我们在上一段中删除它们相反的顺序将 s重复插入空字符串（这是平凡有效的）来构建我们的原始字符串[]：所以我们现在已经证明了x（即任何遵守条件 (1) 和 (2)) 有效。

正确的递归

表达 OP 问题的等效方式是：“我们可以选择多少种方法来选择字符位置的有效子序列，以便剩余的子序列也有效？” 如果我们首先将其概括为：

假设到目前为止我们选择的子序列有 depth d，到目前为止我们未选择的子序列有 depth e，我们有多少种方法可以从 position 开始的后缀中选择一个有效的子序列，k以便剩余的子序列也是有效的？

调用这个函数count(d, e, k)。原来问题的答案是现在count(0, 0, 0)。

事实上，我们可以通过注意d+e必须等于字符后的总深度来进一步简化问题k，因此我们可以e从d和确定k，并且count()只需要 2 个参数。

此外，在测试是否可以选择空子序列时，我们只需要测试d== 0。我们不需要费心测试那个e加上剩余的后缀在不低于它的情况下下降到 0，因为 if d== 0然后我们从原始字符串中减去了 0 的净深度（假设它是有效的，它必须以 0 的深度结束，并且不能低于 0）。

为了递归地解决这个问题，我们需要从搜索所有可能的子序列的过程中分离出第一个决策点。长度为 n 的字符串的子序列必须属于以下 n+1 种情况之一：要么为空，要么具有最左边的元素，该元素可以是字符串中的 n 个字符中的任何一个。在做出第一个决定之后通过递归产生的子序列都是不同的。

在递归正常工作的情况下，记住它很简单：只需在 2D 向量中记录任何给定调用的正确答案，该向量memo[][]最初用 -1 值填充。由于该函数count(d, k)有 2 个参数，对于长度为 n 的字符串，参数范围分别从 0 到 n/2 和从 0 到 n，因此需要 O(n^2) 空间，并且if (memo[d][k] == -1) {块内部最多会执行O(n^2) 次。每次发生这种情况时，都会运行一个 O(n) 循环，时间复杂度高达 O(n^3)。

实际代码

#include <iostream>
#include <vector>
#include <string>

using namespace std;

class PartitionCounter {
    // Return the number of subsequences of the suffix of v beginning at position k
    // that are (a) valid, given that the initial depth of the subsequence is d (on
    // account of it being the suffix of some larger subsequence), and (b)
    // leave behind a remainder subsequence that is also valid, given that
    // the remainder sequence has initial depth depths[k]-d.
    int count(int d, int k) {
        // If a prefix of either sequence (selected or remaining) has more ']'s
        // than '['s then there can't be any completing subsequences.
        if (d < 0 || depths[k] - d < 0) {
            return 0;
        }

        // Only compute the answer if we haven't already.
        if (memo[d][k] == -1) {
            // A subsequence must either contain no elements, or a leftmost element
            // at some position.  All subsequences produced by recursion after this
            // initial choice are distinct (when considering the sequence of
            // character indices included, though not necessarily when considering
            // the sequence of characters themselves).

            // Try including no elements.  This effectively terminates the larger
            // subsequence that the selected subsequence is part of, so it can be
            // legal only if its depth is 0.  It also effectively includes all
            // remaining characters in the remainder sequence, but if the selected
            // subsequence has depth 0 and the original string does too, then it's
            // implied that the remainder must also have total depth 0, so we don't
            // need to check it.
            int n = (d == 0);

            // Try including a leftmost element at each remaining position.
            // If this would cause a remainder subsequence that has negative
            // depth, stop: any later loop iterations would also create illegal
            // remainder subsequences.
            for (int i = k; i < v.size() && depths[i] - d >= 0; ++i) {
                n += count(d + v[i], i + 1);
            }

            memo[d][k] = n;
        }

        return memo[d][k];
    }

    vector<int> v;          // 1 for '[', -1 for ']'
    vector<int> depths;     // depths[i] is the sum of the 1st i elements
    vector<vector<int> > memo;  // DP matrix.  -1 => not computed yet

public:
    PartitionCounter(string s) : memo(s.size() / 2 + 1, vector<int>(s.size() + 1, -1)) {
        depths.push_back(0);
        int total = 0;
        for (int i = 0; i < s.size(); ++i) {
            v.push_back(1 - 2 * (s[i] == ']')); // Map '[' to 1 and ']' to -1
            depths.push_back(total += v[i]);
        }
    }

    int count() {
        if (depths.back() == 0) {
            return count(0, 0);
        } else {
            return 0;       // Need to handle invalid strings specially
        }
    }
};

int main(int argc, char **argv) {
    PartitionCounter c(argv[1]);
    cout << c.count() << '\n';
}

结果

C:\>partitioncounter []
2

C:\>partitioncounter [[]]
6

C:\>partitioncounter [[[]]]
20

C:\>partitioncounter [[[[]]]]
70

C:\>stopwatch partitioncounter [][[[[[][][][][[][][]]]]]][]
10001208
stopwatch: Terminated. Elapsed time: 15ms
stopwatch: Process completed with exit code 0.

C:\>stopwatch partitioncounter [][[[[[][[[]][][][[]]][[][]]]]]][]
562547776
stopwatch: Terminated. Elapsed time: 0ms
stopwatch: Process completed with exit code 0.

如果您需要额外的位，您当然可以使用long long或其他方式来代替。int

编辑：修正了 ishi 指出的错误。当我们跳过要从选定子序列中排除的字符时，剩余子序列会累积它们。发生的事情是，到目前为止，我们实际上只排除了在整个子序列上]s 多于s 的剩余子序列——但为了避免违反条件 (2)，我们需要检查对于细绳。我们现在通过提前停止循环来做到这一点，这样这些违规的剩余子序列就不会首先生成。作为奖励，算法变得更快！:)[

score 2 · Accepted Answer

如果有任何帮助，对于像 [[[]]] 这样的“中心”字符串，您可以使用ways(1) = 2和ways(n) = ways(n-1)*(4*n-2)/n（或C(2n,n)，如果您愿意）计算您的方法数，其中 n 是嵌套的深度。

嵌套但不是“中心”组（如 [[][]]）似乎遵循类似的模式，但我无法找出正确的公式。

编辑

我们的符号能力已经用完了，所以我将使用 texify 来表达数学公式。我想出了这样的事情：

周围组（您可以通过此更改公式）。

score 2 · Accepted Answer

扩展 ishi 的答案，我认为它可以在 O(N^2) 中完成，因为 d + e 等于前缀的深度。下面的代码得到相同的结果。

/*
How many ways are there to split a string
of brackets into two such that both are balanced.
*/

#include <iostream>
#include <string>
using namespace std;

#define MAXN 1000
#define M 1000000007

int N, dp[MAXN][MAXN];
string s;

int recurse(int k, int i, int p) {
  if (i >= N) {
    return k == 0 ? 1 : 0;
  }
  if (k < 0 || p-k < 0) {
    return 0;
  }

  int &ans = dp[k][i];
  if (ans != -1) {
    return ans;
  }

  ans = 0;
  if (s[i] == '[') {
    ans += recurse(k+1,i+1,p+1)+recurse(k,i+1,p+1);
    return ans;
  }
  if (s[i] == ']') {
    ans += recurse(k-1,i+1,p-1)+recurse(k,i+1,p-1);
    return ans;
  }
  return 0;
}

int main() {
  cin >> s;
  N = s.size();
  for (int k = 0; k < N; k++) {
    for (int i = 0; i < N; i++) {
      dp[k][i] = -1;
    }
  }
  cout << recurse(0,0,0) << endl;
}

score 1 · Accepted Answer

更容易理解解决方案。

  public int splitString(String s){
    int gOneC=0;
    int gTwoC=0;

    Map<String, Integer> memo = new HashMap<>();
    return splitStringRecur(s,0, gOneC, gTwoC, memo);
}

private int splitStringRecur(String s, int i, int gOneC, int gTwoC, Map<String, Integer> memo) {
    if(i == s.length()){
        if(gOneC==0 || gTwoC==0) return 1;
    }
    String t = i+"-"+gOneC+"-"+gTwoC+"";
    if(memo.containsKey(t)) return memo.get(t);

    int gc =0;
    int gs =0;
    if(s.charAt(i)=='('){
        gc = splitStringRecur(s, i+1, gOneC+1, gTwoC,memo);
        gs = splitStringRecur(s, i+1, gOneC, gTwoC+1,memo);
    }else {
        if(gOneC > 0){
          gc +=  splitStringRecur(s, i+1, gOneC-1, gTwoC,memo);
        }
        if( gTwoC >0){
           gs += splitStringRecur(s, i+1, gOneC, gTwoC-1,memo);
        }
    }
    memo.put(t, gc+gs);
    return gc+gs;
}

algorithm - 对一串括号进行分区

4 回答 4

生成我们知道有效的字符串

正确的递归

实际代码

结果

Related

Reference