cwave.eu5.org
Also see: http://www.angelfire.com/dragon/letstry
cwave04 at yahoo dot com

Last updated on Fri May 21 11:52:16 IST 2010.

More on KS procedure

Asymptotic distribution of KS statistics

Computing the exact distribution of the KS statistic is prohibitively difficult for large n. In this case we use the asymptotic distributions of D, D+ and D-.

Theorem Suppose we have computed D based on an iid sample of size n. Then, as n → ∞,
P(sqrt(n) D ≤ x) → H(x),
where
H(x) = 1 - 2 ∑_{j≥1} (-1)^(j-1) exp(-2 j^2 x^2) if x > 0,
H(x) = 0 otherwise.

Proof: Not to be done in this course.

H(x) is indeed a valid continuous distribution function, although this appears quite difficult to prove. The simplest proof that I know of is by Feller. It is a few pages long (however, it actually proves something stronger). H(x) is an example of what mathematicians call a theta function.
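
The series for H(x) in the theorem above converges very quickly for moderate x, so it can be evaluated by simple truncation. The following is a minimal Python sketch, not part of the original notes (which suggest Matlab later on). The function name ks_limit_cdf, the tolerance, and the shortcut for x < 0.1 (where H(x) is far below 1e-12 but the series converges slowly) are assumptions of this sketch.

```python
import math

def ks_limit_cdf(x, tol=1e-15, max_terms=100):
    """Approximate H(x) = 1 - 2 * sum_{j>=1} (-1)^(j-1) * exp(-2 j^2 x^2).

    Assumption of this sketch: for x < 0.1 the true value is numerically
    negligible, so 0.0 is returned instead of summing a slowly
    converging alternating series.
    """
    if x < 0.1:
        return 0.0
    total = 0.0
    for j in range(1, max_terms + 1):
        term = math.exp(-2.0 * j * j * x * x)
        total += term if j % 2 == 1 else -term
        if term < tol:
            break  # remaining terms are smaller still
    # Clamp to [0, 1] to absorb rounding noise near the ends.
    return min(1.0, max(0.0, 1.0 - 2.0 * total))

# Tabulate H on a small grid to see it rise from 0 towards 1.
for x in (0.5, 1.0, 1.36, 2.0):
    print(f"H({x}) = {ks_limit_cdf(x):.5f}")
```

For an observed value d of D, the asymptotic P-value of the two-sided KS test would then be approximated by 1 - ks_limit_cdf(math.sqrt(n) * d).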

Theorem Let D+ and D- be computed based on an iid sample of size n. Then, as n → ∞,
P(sqrt(n) D+ ≤ x) → 1 - exp(-2 x^2) if x ≥ 0,
and the limit is 0 otherwise.
P(sqrt(n) D- ≤ x) has the same limit.

Proof: Not to be done in this course.
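
Although the proof is omitted, the one-sided limit can be checked by simulation. The sketch below is not from the original notes. It assumes the usual definition D+ = sup_x (F_n(x) - F(x)), which for a Uniform(0,1) sample reduces to max_i (i/n - U_(i)), and it uses the fact that the distribution of D+ does not depend on the continuous F, so uniform data suffice. The function names, the sample size n = 200, the number of replications, and the seed are all illustrative choices.

```python
import math
import random

def d_plus_uniform(n, rng):
    """D+ = max_i (i/n - U_(i)) for a Uniform(0,1) sample of size n."""
    u = sorted(rng.random() for _ in range(n))
    return max((i + 1) / n - u[i] for i in range(n))

def simulate_scaled_d_plus(n=200, reps=2000, seed=0):
    """Draw reps independent copies of sqrt(n) * D+."""
    rng = random.Random(seed)
    return [math.sqrt(n) * d_plus_uniform(n, rng) for _ in range(reps)]

def limit_cdf_one_sided(x):
    """Limiting CDF from the theorem: 1 - exp(-2 x^2) for x >= 0, else 0."""
    return 1.0 - math.exp(-2.0 * x * x) if x >= 0 else 0.0

samples = simulate_scaled_d_plus()
for x in (0.5, 1.0, 1.5):
    # Fraction of simulated values at or below x, next to the limit.
    empirical = sum(s <= x for s in samples) / len(samples)
    print(f"x={x}: simulated {empirical:.3f}, limit {limit_cdf_one_sided(x):.3f}")
```

The simulated proportions should be close to the limiting values, with small differences due to the finite n and the Monte Carlo error.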

Exercise 6.4: Show that n (D+)^2 is asymptotically distributed as Expo(2).

Exercise 6.5: The file data.txt contains iid data from some unknown continuous distribution, F. We want to test
H0: F = N(0,1) vs. H1: F ≠ N(0,1).
Perform KS test using the asymptotic distribution. Report the P-value.

Confidence intervals using KS procedure

Suppose that X1,...,Xn are iid with unknown continuous distribution G. We want to get a CI for G with confidence level 1-α. For this consider the random variable
D = sup_x |F_n(x) - G(x)|,
where F_n is the empirical distribution function of the sample.

Exercise 6.6: Can you compute D if G is not known? Does the distribution of D depend on G?

Find a constant c such that
P(D ≤ c) = 1-α
i.e., P(|F_n(x) - G(x)| ≤ c for all x) = 1-α
i.e., P(F_n(x) - c ≤ G(x) ≤ F_n(x) + c for all x) = 1-α
This provides a CI for G.
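
The construction above can be sketched in Python as follows. This is an illustration under stated assumptions, not the course's own solution: it replaces the exact distribution of D by the asymptotic one, P(sqrt(n) D ≤ x) ≈ H(x), finds the cut-off by bisection, and uses simulated normal data as a hypothetical stand-in for a real sample. All function names are made up for this sketch, and ties in the data are ignored.

```python
import math
import random

def ks_limit_cdf(x, max_terms=100):
    """H(x) from the truncated series; returns 0 for x < 0.1 (negligible there)."""
    if x < 0.1:
        return 0.0
    s = sum((-1) ** (j - 1) * math.exp(-2.0 * j * j * x * x)
            for j in range(1, max_terms + 1))
    return min(1.0, max(0.0, 1.0 - 2.0 * s))

def ks_cutoff(alpha, n, lo=0.1, hi=5.0, iters=60):
    """Approximate c with P(D <= c) = 1 - alpha, via H(sqrt(n) c) = 1 - alpha."""
    target = 1.0 - alpha
    for _ in range(iters):  # bisection on the increasing function H
        mid = 0.5 * (lo + hi)
        if ks_limit_cdf(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi) / math.sqrt(n)

def confidence_band(data, alpha=0.05):
    """Return c and a list of (x, lower, upper) at the sorted data points."""
    xs = sorted(data)
    n = len(xs)
    c = ks_cutoff(alpha, n)
    band = []
    for i, x in enumerate(xs, start=1):
        fn = i / n  # value of F_n at the i-th order statistic
        # A distribution function lies in [0, 1], so clip the band there.
        band.append((x, max(0.0, fn - c), min(1.0, fn + c)))
    return c, band

# Hypothetical data: 100 simulated N(0,1) values standing in for a real sample.
rng = random.Random(1)
data = [rng.gauss(0.0, 1.0) for _ in range(100)]
c, band = confidence_band(data, alpha=0.05)
print(f"cut-off c = {c:.4f}")
```

Between consecutive data points F_n is constant, so the band is a pair of step functions. Plotting the lower and upper values as steps against x would give the picture of the CI.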

Exercise 6.7: The file data.txt contains iid data from some unknown continuous distribution, F. Obtain a 95% CI for F using the two-sided KS statistic. Plot the CI.
[Hint: You need to compute cut-off point c based on the infinite series given earlier. Use Matlab.]


© Arnab Chakraborty (2010)