We introduce numpy and show some of its properties. Then we solve the birthday problem and simulate Chuck-A-Luck.
%pylab inline
Beware when doing calculations involving division, especially using integers.
print 1/6, 1//6, 1./6, 1/6., 5/4, 5//4, 5./4, 5/4., 5.//4, 5.//4.
The import below isn't necessary as the pylab command above imports everything from both numpy and pylab. We include just to show the command.
from numpy import *
Vectors and matrices are numpy arrays.
l = [1,2,3] #python list
a = array(l)
a
print a.shape, a.size, a.dtype, a.ndim
z = zeros(3)
x = ones(3)
z,x
Arithmetic is elementwise.
a+1, a*2, a+x, a*a, exp(a), a>2, a*(a>2)
2D Arrays
A = arange(12).reshape(4,3)
print A
print A.shape, A.size, A.dtype, A.ndim
2*A
Broadcasting: if possible, smaller arrays extended to bigger array.
A+a, A*a
Multiplication is element-wise. To get the linear algebra multiplication, use the dot function.
A*A
dot(A,x)
dot(A.T,A)
X = eye(3)+dot(ones(3),ones(3).T)
X
inv(X), dot(X,inv(X))
n = 365
days = arange(1,n+1)
p = 1.0*(n+1-days)/n
probnopair = cumprod(p)
sum(probnopair>0.5)
xlim = 30
cutoff = 23
plot(days[:xlim],probnopair[:xlim],days[:xlim],probnopair[:xlim],'.')
axvline(x=cutoff,color='r')
axhline(y=0.5)
xlabel('Number of People in Group')
ylabel('Probability of No Match')
probnopair[21],probnopair[22] #python arrays start from 0, not 1.
Chuck-A-Luck is a dice game that is sometimes played at carnivals and charity events. The player pays 1 USD, selects a number between 1 and 6, and rolls 3 dice. If the player's number comes up once, the player wins 1 USD; if it comes up twice, the player wins 2 USD; and if thrice, wins 3 USD (in one variation, 10 USD).
(Sorry for the European notation, but there doesn't seem to be a way of getting dollar signs properly displayed. Ipython uses dollar signs to delimit inline math, e.g., \(x^2+y^2=z^2\).)
dice = 3
throws = 1000
p = 1./6
brv = stats.bernoulli.rvs
rolls = brv(p,size=dice*throws).reshape(throws,dice)
counts = sum(rolls,1)
We use what python calls "fancy indexing" to transform the number of correct dice in each throw to the amount won or lost on each throw. It uses an implied for loop over the entries in counts. If the i'th entry in counts is k, then payouts[k] is selected. (You can also use the numpy take function to get the same effect.)
payouts = array([-1,1,2,10])
#wins = take(payouts,counts) #alternative function
wins = payouts[counts]
totals = cumsum(wins)
plot(totals)
axhline(y=0.0)
averages = 1.0 * totals/arange(1,throws+1)
plot(averages)
axhline(y=0.0)
averages[-1]
Charles Boncelet, 25 Feb 2014