This notebook presents simple Python code for a binary input BAC (Block Arithmetic Code) encoder and decoder. This code demonstrates how BAC works. The code can be used as templates for more advanced code.
Charles Boncelet, boncelet@udel.edu, 3/31/2017
#python 3.6
import numpy as np
import matplotlib.pylab as plt
from numpy.random import rand
%matplotlib inline
#utility function
def entropy(p):
"""binary entropy function with parameter p"""
return -p*np.log2(p)-(1-p)*np.log2(1-p)
This implementation is for a simple binary encoder where the bits are IID with common probability $p=\Pr(x=1)$.
NOTE: a real encoder has to deal with the EOF problem (telling the decoder when the input has ended). There are straightforward ways of dealing with this, but we're striving for simplicity here and, so, omit them. In effect, we assume the encoder/decoder pair has another way to exchange the file length.
def BAC_Encoder(xs,p,k=2**16):
"""Binary BAC Encoder with k codewords
xs = list of bits
p = probability x=1
k = number of codewords
"""
first = 0
phrase = []
while k>1 and xs:
k1 = round(p*k)
if k1 == 0:
k1 = 1
if k1 == k:
k1 = k-1
k0 = k-k1
x = xs.pop()
phrase.append(x)
if x == 0:
k = k0
else:
first += k0
k = k1
#print(first, k)
return first, phrase, xs
def BAC_Decoder(codeword, p, k=2**16):
"""Decode BAC codeword back to bit string
p = probability x=1
k = size of codebook
"""
first = 0
phrase = []
while k>1:
k1 = round(p*k)
if k1 == 0:
k1 = 1
if k1 == k:
k1 = k-1
k0 = k-k1
if codeword < first + k0:
phrase.append(0)
k = k0
else:
phrase.append(1)
first += k0
k = k1
#print(first, k)
return phrase
p = 0.3
n = 100
xs = list((rand(n)<p).astype('int'))
codeword, phrase, xs = BAC_Encoder(xs,p)
print(codeword, len(phrase), len(xs)), phrase
BAC_Decoder(codeword,0.3)
Let's test the Encoder and Decoder. Obviously, the Decoder should reverse the Encoder. This isn't a proof, but a pretty good test.
ntrials = 10000
p = 0.95
n = 1000
for i in range(ntrials):
xs = list((rand(n)<p).astype('int'))
codeword, phrase, xs = BAC_Encoder(xs,p)
newphrase = BAC_Decoder(codeword,p)
if phrase != newphrase:
print('Ding DING DING!, i = ', i)
break
else:
print('Success')
Here's an example with a 48 bit codebook and $p=0.95$. An exact calculation shows EL($2^{48}$)=165.552 bits per input phrase.
ntrials = 10000
p = 0.95
nbits = 600
k = 2**48
lengths = []
for i in range(ntrials):
xs = list((rand(nbits)<p).astype('int'))
codeword, phrase, xs = BAC_Encoder(xs,p,k=k)
lengths.append(len(phrase))
np.average(lengths), np.var(lengths), np.min(lengths), np.max(lengths)
Boncelet, Charles G. "Block arithmetic coding for source compression." IEEE transactions on information theory 39.5 (1993): 1546-1554.
Boncelet, C.G., 1993, January. Block arithmetic coding for Markov sources. In Information Theory, 1993. Proceedings. 1993 IEEE International Symposium on (pp. 118-118). IEEE.
@article{boncelet1993block,
title={Block arithmetic coding for source compression},
author={Boncelet, Charles G},
journal={IEEE transactions on information theory},
volume={39},
number={5},
pages={1546--1554},
year={1993},
publisher={IEEE}
}
@inproceedings{boncelet1993block,
title={Block arithmetic coding for Markov sources},
author={Boncelet, CG},
booktitle={Information Theory, 1993. Proceedings. 1993 IEEE International Symposium on},
pages={118--118},
year={1993},
organization={IEEE}
}