I'm studying the Burrows-Wheeler transform, and so far I can compute it from a given text. Now it's time for the reverse process, and that's what I'm having trouble with.

Here's the input: TTCCTAACG$A.

That's my way of thinking:

1) compute the number of As, Cs, Gs, Ts in the input: A: 3, C: 3, G: 1, T: 3

2) let's write down the First and the Last columns of the Burrows-Wheeler matrix. The Last column is our input, and the First column is the same symbols in sorted order. So here it is:

      F    L

[0]   $    T
[1]   A    T
[2]   A    C
[3]   A    C
[4]   C    T
[5]   C    A
[6]   C    A
[7]   G    C
[8]   T    G
[9]   T    $
[10]  T    A
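
For reference, the counts from step 1 and the First column from step 2 can both be derived from the Last column alone. A minimal Python sketch, assuming `$` sorts before the letters (the variable names are only illustrative):

```python
from collections import Counter

last = "TTCCTAACG$A"           # the input, i.e. the Last column

counts = Counter(last)         # occurrences of each symbol, including '$'
first = "".join(sorted(last))  # First column: the same symbols, sorted

print(dict(sorted(counts.items())))  # {'$': 1, 'A': 3, 'C': 3, 'G': 1, 'T': 3}
print(first)                         # $AAACCCGTTT
```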

Here's my logic:

  1. Initially, output = '$'
  2. L[0] = 'T' => output = 'T$'
  3. The first T in F has the index 8 => we need L[8] => output = 'GT$'
  4. The first G in F has the index 7 => we need L[7] => output = 'CGT$'
  5. The first C in F has the index 4 => we need L[4] => output = 'TCGT$'
  6. It was our second T. The second T in F has the index 9, but L[9] = '$', thus
    we should stop.

Obviously, it's not over and something's wrong here. Could you please explain what?

2 Answers

My understanding of the method was too simplistic. In step 4, since that C is the third C in L, we need the third C in F, which is F[6].
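
The rule is that the k-th occurrence of a symbol in L corresponds to the k-th occurrence of the same symbol in F. A minimal Python sketch of the reversal under that rule, assuming `$` occurs exactly once and sorts before every other symbol (the function name `inverse_bwt` and its helpers are hypothetical, chosen only for this illustration):

```python
def inverse_bwt(last, end="$"):
    """Invert a BWT string by walking the LF-mapping backwards."""
    # rank[i]: how many times last[i] already occurred in last[:i]
    seen = {}
    rank = []
    for ch in last:
        rank.append(seen.get(ch, 0))
        seen[ch] = rank[-1] + 1

    # start[ch]: index of the first row of F that holds ch
    start = {}
    total = 0
    for ch in sorted(seen):
        start[ch] = total
        total += seen[ch]

    # Row 0 is the row whose F entry is the end marker.
    out = [end]
    row = 0
    while last[row] != end:
        ch = last[row]
        out.append(ch)
        # the k-th ch in L is the k-th ch in F
        row = start[ch] + rank[row]
    return "".join(reversed(out))


print(inverse_bwt("TTCCTAACG$A"))  # TACATCACGT$
```

With this input, L[7] is the third C, so the walk continues at F[6] rather than F[4], and it only stops once every symbol has been visited.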

Answered 2016-08-04T08:20:36.693

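The last column looks wrong: it should hold the symbols that precede those in the first column. You also don't use a special symbol for the BWT. That way the rule above is broken and you mess up your LF-mapping.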

D.

Answered 2016-09-15T14:12:48.733