1. Test the Enviroment
1.1 Simulation of a Brownian Motion
The purpose of the first notebook entry is to check if matplotlib is correctly installed. We simulate 20 Brownian Motions at [0,1] evaluated at 500 points
import numpy
random_color = lambda: '#%02x%02x%02x' % tuple(np.random.randint(0,256,3))
fig = plt.figure()
ax = fig.add_subplot(111)
T=500
N=20
times=np.true_divide( numpy.arange(0, T) ,T)
for i in range(0, N):
t = ax.plot(times , cumsum(random.normal(0,sqrt(true_divide(1,T)),T)), lw=1, c=random_color())
The result should somewhat look like this
1.2 Validation of the Erdős–Kac theorem
I have a lifelong passion for prime numbers, therefore in this simple Spark Program we will try to validate the Erdős–Kac theorem in a finite sample setting. The theorem states that if
is the is the number of distinct prime factors, then for any fixed
,
(1) 
where
is the standard normal distribution.
def prime_factors(n):
i = 2
p=n
factors = []
while i * i <= n: if n % i: i += 1 else: n //= i factors.append(i) if n > 1:
factors.append(n)
dist= ( len( unique(factors) ) -log(log(p)))/sqrt(log(log(p)))
return dist
N=500000
bins=6
nums = sc.parallelize(xrange(3,N))
result=nums.map(prime_factors).histogram(bins)
binsize=mean( diff( result[0] ) )
axis2=np.linspace(-3, 3, num=128)
mu, sigma = 0, 1 # mean and standard deviation
plt.plot(axis2, 1/(sigma * np.sqrt(2 * np.pi)) *np.exp( - (axis2 - mu)**2 / (2 * sigma**2) ),linewidth=2, color='r')
plt.bar(result[0][0:bins],true_divide(result[1],N),binsize)
plt.show()

