我正在尝试在 Python 中实现一个 Reed-Solomon 编码器-解码器,支持对擦除和错误的解码,这让我发疯。
该实现目前支持仅解码错误或仅擦除,但不能同时解码两者(即使它低于 2*errors+erasures <= (nk) 的理论界限)。
从 Blahut 的论文(这里和这里)看来,我们只需要用擦除定位多项式初始化错误定位多项式,就可以隐式计算 Berlekamp-Massey 内的勘误定位多项式。
这种方法部分适用于我:当我有 2*errors+erasures < (nk)/2 时它可以工作,但实际上在调试之后它才有效,因为 BM 计算了一个错误定位器多项式,它得到与擦除定位器多项式完全相同的值(因为我们低于仅错误校正的限制),因此它通过 galois 域被截断,我们最终得到了擦除定位多项式的正确值(至少我是这样理解的,我可能是错的)。
但是,当我们超过 (nk)/2 时,例如如果 n = 20 且 k = 11,则我们可以纠正 (nk)=9 个擦除符号,如果我们输入 5 个擦除,那么 BM 就会出错。如果我们输入 4 个擦除 + 1 个错误(我们仍然远低于界限,因为我们有 2*errors+erasures = 2+4 = 6 < 9),BM 仍然会出错。
我实现的 Berlekamp-Massey 的确切算法可以在这个演示文稿中找到(第 15-17 页),但是可以在此处和此处找到非常相似的描述,在这里我附上数学描述的副本:
现在,我将这个数学算法几乎完全复制到了 Python 代码中。我想要扩展它以支持擦除,我尝试使用擦除定位器初始化错误定位器 sigma:
def _berlekamp_massey(self, s, k=None, erasures_loc=None):
'''Computes and returns the error locator polynomial (sigma) and the
error evaluator polynomial (omega).
If the erasures locator is specified, we will return an errors-and-erasures locator polynomial and an errors-and-erasures evaluator polynomial.
The parameter s is the syndrome polynomial (syndromes encoded in a
generator function) as returned by _syndromes. Don't be confused with
the other s = (n-k)/2
Notes:
The error polynomial:
E(x) = E_0 + E_1 x + ... + E_(n-1) x^(n-1)
j_1, j_2, ..., j_s are the error positions. (There are at most s
errors)
Error location X_i is defined: X_i = a^(j_i)
that is, the power of a corresponding to the error location
Error magnitude Y_i is defined: E_(j_i)
that is, the coefficient in the error polynomial at position j_i
Error locator polynomial:
sigma(z) = Product( 1 - X_i * z, i=1..s )
roots are the reciprocals of the error locations
( 1/X_1, 1/X_2, ...)
Error evaluator polynomial omega(z) is here computed at the same time as sigma, but it can also be constructed afterwards using the syndrome and sigma (see _find_error_evaluator() method).
'''
# For errors-and-erasures decoding, see: Blahut, Richard E. "Transform techniques for error control codes." IBM Journal of Research and development 23.3 (1979): 299-315. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.92.600&rep=rep1&type=pdf and also a MatLab implementation here: http://www.mathworks.com/matlabcentral/fileexchange/23567-reed-solomon-errors-and-erasures-decoder/content//RS_E_E_DEC.m
# also see: Blahut, Richard E. "A universal Reed-Solomon decoder." IBM Journal of Research and Development 28.2 (1984): 150-158. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.84.2084&rep=rep1&type=pdf
# or alternatively see the reference book by Blahut: Blahut, Richard E. Theory and practice of error control codes. Addison-Wesley, 1983.
# and another good alternative book with concrete programming examples: Jiang, Yuan. A practical guide to error-control coding using Matlab. Artech House, 2010.
n = self.n
if not k: k = self.k
# Initialize:
if erasures_loc:
sigma = [ Polynomial(erasures_loc.coefficients) ] # copy erasures_loc by creating a new Polynomial
B = [ Polynomial(erasures_loc.coefficients) ]
else:
sigma = [ Polynomial([GF256int(1)]) ] # error locator polynomial. Also called Lambda in other notations.
B = [ Polynomial([GF256int(1)]) ] # this is the error locator support/secondary polynomial, which is a funky way to say that it's just a temporary variable that will help us construct sigma, the error locator polynomial
omega = [ Polynomial([GF256int(1)]) ] # error evaluator polynomial. We don't need to initialize it with erasures_loc, it will still work, because Delta is computed using sigma, which itself is correctly initialized with erasures if needed.
A = [ Polynomial([GF256int(0)]) ] # this is the error evaluator support/secondary polynomial, to help us construct omega
L = [ 0 ] # necessary variable to check bounds (to avoid wrongly eliminating the higher order terms). For more infos, see https://www.cs.duke.edu/courses/spring11/cps296.3/decoding_rs.pdf
M = [ 0 ] # optional variable to check bounds (so that we do not mistakenly overwrite the higher order terms). This is not necessary, it's only an additional safe check. For more infos, see the presentation decoding_rs.pdf by Andrew Brown in the doc folder.
# Polynomial constants:
ONE = Polynomial(z0=GF256int(1))
ZERO = Polynomial(z0=GF256int(0))
Z = Polynomial(z1=GF256int(1)) # used to shift polynomials, simply multiply your poly * Z to shift
s2 = ONE + s
# Iteratively compute the polynomials 2s times. The last ones will be
# correct
for l in xrange(0, n-k):
K = l+1
# Goal for each iteration: Compute sigma[K] and omega[K] such that
# (1 + s)*sigma[l] == omega[l] in mod z^(K)
# For this particular loop iteration, we have sigma[l] and omega[l],
# and are computing sigma[K] and omega[K]
# First find Delta, the non-zero coefficient of z^(K) in
# (1 + s) * sigma[l]
# This delta is valid for l (this iteration) only
Delta = ( s2 * sigma[l] ).get_coefficient(l+1) # Delta is also known as the Discrepancy, and is always a scalar (not a polynomial).
# Make it a polynomial of degree 0, just for ease of computation with polynomials sigma and omega.
Delta = Polynomial(x0=Delta)
# Can now compute sigma[K] and omega[K] from
# sigma[l], omega[l], B[l], A[l], and Delta
sigma.append( sigma[l] - Delta * Z * B[l] )
omega.append( omega[l] - Delta * Z * A[l] )
# Now compute the next B and A
# There are two ways to do this
# This is based on a messy case analysis on the degrees of the four polynomials sigma, omega, A and B in order to minimize the degrees of A and B. For more infos, see https://www.cs.duke.edu/courses/spring10/cps296.3/decoding_rs_scribe.pdf
# In fact it ensures that the degree of the final polynomials aren't too large.
if Delta == ZERO or 2*L[l] > K \
or (2*L[l] == K and M[l] == 0):
# Rule A
B.append( Z * B[l] )
A.append( Z * A[l] )
L.append( L[l] )
M.append( M[l] )
elif (Delta != ZERO and 2*L[l] < K) \
or (2*L[l] == K and M[l] != 0):
# Rule B
B.append( sigma[l] // Delta )
A.append( omega[l] // Delta )
L.append( K - L[l] )
M.append( 1 - M[l] )
else:
raise Exception("Code shouldn't have gotten here")
return sigma[-1], omega[-1]
多项式和 GF256int 分别是 2^8 上的多项式和伽罗瓦域的通用实现。这些类是单元测试的,并且它们通常是错误证明的。Reed-Solomon 的其他编码/解码方法也是如此,例如 Forney 和 Chien 搜索。我在这里谈论的问题的快速测试用例的完整代码可以在这里找到:http ://codepad.org/l2Qi0y8o
这是一个示例输出:
Encoded message:
hello world�ꐙ�Ī`>
-------
Erasures decoding:
Erasure locator: 189x^5 + 88x^4 + 222x^3 + 33x^2 + 251x + 1
Syndrome: 149x^9 + 113x^8 + 29x^7 + 231x^6 + 210x^5 + 150x^4 + 192x^3 + 11x^2 + 41x
Sigma: 189x^5 + 88x^4 + 222x^3 + 33x^2 + 251x + 1
Symbols positions that were corrected: [19, 18, 17, 16, 15]
('Decoded message: ', 'hello world', '\xce\xea\x90\x99\x8d\xc4\xaa`>')
Correctly decoded: True
-------
Errors+Erasures decoding for the message with only erasures:
Erasure locator: 189x^5 + 88x^4 + 222x^3 + 33x^2 + 251x + 1
Syndrome: 149x^9 + 113x^8 + 29x^7 + 231x^6 + 210x^5 + 150x^4 + 192x^3 + 11x^2 + 41x
Sigma: 101x^10 + 139x^9 + 5x^8 + 14x^7 + 180x^6 + 148x^5 + 126x^4 + 135x^3 + 68x^2 + 155x + 1
Symbols positions that were corrected: [187, 141, 90, 19, 18, 17, 16, 15]
('Decoded message: ', '\xf4\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00.\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00P\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xe3\xe6\xffO> world', '\xce\xea\x90\x99\x8d\xc4\xaa`>')
Correctly decoded: False
-------
Errors+Erasures decoding for the message with erasures and one error:
Erasure locator: 77x^4 + 96x^3 + 6x^2 + 206x + 1
Syndrome: 49x^9 + 107x^8 + x^7 + 109x^6 + 236x^5 + 15x^4 + 8x^3 + 133x^2 + 243x
Sigma: 38x^9 + 98x^8 + 239x^7 + 85x^6 + 32x^5 + 168x^4 + 92x^3 + 225x^2 + 22x + 1
Symbols positions that were corrected: [19, 18, 17, 16]
('Decoded message: ', "\xda\xe1'\xccA world", '\xce\xea\x90\x99\x8d\xc4\xaa`>')
Correctly decoded: False
在这里,擦除解码总是正确的,因为它根本不使用 BM 来计算擦除定位器。通常,其他两个测试用例应该输出相同的 sigma,但它们根本不会。
当您比较前两个测试用例时,问题来自 BM 的事实在这里是公然的:校正子和擦除定位器是相同的,但得到的 sigma 完全不同(在第二个测试中,使用了 BM,而在不调用仅擦除 BM 的第一个测试用例)。
非常感谢您对我如何调试它的任何帮助或任何想法。请注意,您的答案可以是数学或代码,但请解释我的方法出了什么问题。
/EDIT:仍然没有找到如何正确实现勘误表BM解码器(请参阅下面的答案)。赏金提供给任何可以解决问题(或至少引导我找到解决方案)的人。
/ EDIT2:愚蠢的我,对不起,我刚刚重新阅读了架构,发现我错过了作业中的更改L = r - L - erasures_count
......我已经更新了代码来解决这个问题并重新接受了我的答案。