You are given $n\in{\mathbb N}$ and $k\in\{0,\ldots,n\}.$ A coin with $P(H)=p$ is tossed
$n$ times independently. Let $f(p)$ denote the probability of getting a run of at least
$k$ consecutive heads. Show that $f(p)$ must be a nondecreasing function of $p.$
The result is very intuitive, but a formal proof is not. Here is a solution that I got from Prof. B V Rao.
Please try to solve the problem yourself first. That way you will relish Prof. Rao's solution all the more.
It is enough to show that if $p_1 < p_2$ then $f(p_1)\leq f(p_2).$
So take any $p_1 < p_2.$
Consider a different random experiment with two coins. The first coin has $P(H)=p_1$ and the second coin has $P(H)=x$
(to be determined soon). Toss them in parallel (independently) $n$ times. Create two
sequences of $H$'s and $T$'s out of the outcomes.
The first sequence consists of the outcomes of the first coin. The second sequence has an $H$ in the $i$-th position if at least one of the two coins shows a head in the $i$-th toss, and a $T$ otherwise.
Here are two events:
$A = $ the event that there are at least $k$ consecutive $H$'s in the first sequence.
$B = $ the event that there are at least $k$ consecutive $H$'s in the second sequence.
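Clearly, $A\subseteq B,$ because every $H$ in the first sequence is also an $H$ in the second. So $P(A)\leq P(B).$
The entries of the second sequence are independent, each being an $H$ with probability $1-(1-p_1)(1-x) = p_1+x-p_1x.$
So $P(A) = f(p_1)$ and $P(B) = f(p_1+x-p_1x).$ Just choose $x$ so that $p_1+x-p_1x = p_2,$ i.e., $x = \frac{p_2-p_1}{1-p_1},$ to complete
the proof. Such an $x\in(0,1]$ always exists because $p_1 < p_2\leq 1.$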
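The claim can also be checked numerically. The following is only a sketch, not part of Prof. Rao's argument: the function name `run_prob`, the values of `n`, `k`, `p1`, `p2` and the grid of $p$ values are all assumptions made for illustration. It computes $f(p)$ exactly by tracking the length of the current run of heads, and then checks monotonicity on the grid.

```r
# Exact probability of at least k consecutive heads in n independent tosses
# with P(H) = p. (Hypothetical helper, written for this illustration.)
run_prob <- function(n, k, p) {
  if (k == 0) return(1)            # a run of length 0 always occurs
  # state[j + 1] = P(current run has length j and no run of k has occurred yet)
  state <- c(1, numeric(k - 1))
  hit <- 0                         # P(a run of k has already occurred)
  for (i in seq_len(n)) {
    new_state <- numeric(k)
    new_state[1] <- sum(state) * (1 - p)                 # a tail resets the run
    if (k > 1) new_state[2:k] <- state[1:(k - 1)] * p    # a head extends the run
    hit <- hit + state[k] * p                            # run of k-1, then a head
    state <- new_state
  }
  hit
}

n <- 10; k <- 3                    # illustrative values
p1 <- 0.3; p2 <- 0.6
x <- (p2 - p1) / (1 - p1)          # solves p1 + x - p1 * x = p2
stopifnot(abs(p1 + x - p1 * x - p2) < 1e-12)

ps <- seq(0, 1, by = 0.05)
fs <- sapply(ps, function(p) run_prob(n, k, p))
all(diff(fs) >= -1e-12)            # is f nondecreasing on this grid?
run_prob(n, k, p1) <= run_prob(n, k, p2)
```

A grid check like this is of course no proof; it is only a sanity check on the statement that the coupling argument establishes.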
Complete sufficient statistics are useful things. These two
concepts are traditionally defined in quite different terms, and
it seems a bit strange that they play so well together. Here is a
somewhat less traditional view of these two concepts that brings
out a similarity between them.
If a statistic is sufficient then so is any one-one function of
it. Similarly for a complete statistic. This shows that
sufficiency and completeness are actually properties of
partitions of the sample space. We shall illustrate in the case
where we have just one parameter. Let $P$ be a partition of the
sample space. It is called sufficient if, given the block of the
partition where your sample is, you know the exact conditional distribution
of the sample, whatever the value of the parameter. Clearly, if I give you any finer partition, then
also the property holds. So sufficiency is a property that is
preserved under refinement of a partition.
Now for completeness. Again we start with a partition, $P.$ I
give you two numbers for each block of this partition, $a_k$
and $b_k$ for the $k$-th block. Average all
the $a_k$'s (a weighted average, with weights proportional to the
probabilities of the blocks). You'll get a function of the
parameter. Do the same for the $b_k$'s. You get another
function. If these two functions are the same, then can you say
that the $a_k$'s are the same as the $b_k$'s? Not
necessarily. However, if the parameter space is large enough (or
the number of blocks is small enough), then the answer may be yes.
In this case we say that the partition is complete. Clearly, any
coarser partition is then also complete.
Thus, a partition being sufficient means it is fine enough, and a
partition being complete means it is coarse enough. A complete,
sufficient partition combines both aspects. This is much like
a basis being a linearly independent spanning set. If a set is
linearly independent, then so is any of its nonempty subsets. If
a set is spanning, then so is any superset. A basis strikes a
balance.
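Here is a toy case showing the role of the parameter space. It is only an illustration; the symbol $\theta$ for the parameter and the two-block partition are assumptions made for this example. Suppose the partition has just two blocks, with probabilities $\theta$ and $1-\theta.$ Equality of the two weighted averages reads
$$
a_1\theta + a_2(1-\theta) = b_1\theta + b_2(1-\theta),
\quad\mbox{i.e.,}\quad
(a_2-b_2) + \big((a_1-b_1)-(a_2-b_2)\big)\theta = 0.
$$
If this holds for all $\theta\in(0,1),$ then a polynomial in $\theta$ vanishes identically, so both its coefficients are zero. Hence $a_2=b_2$ and $a_1=b_1,$ and the partition is complete. But if the parameter space consists of the single point $\theta=\frac12,$ then $a_1=1,~a_2=0$ and $b_1=0,~b_2=1$ give the same average $\frac12,$ and completeness fails.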
You'll need to use xelatex (which is part of any LaTeX
installation like MiKTeX). To load fonts, xelatex uses a package called fontspec,
which is also part of any standard LaTeX distribution. You'll
also need an OpenType Unicode Bengali font like SolaimanLipi or
Ekushe. Some sample files (fonts, LaTeX source and the resulting PDF)
are here.
The following sequential criterion for uniform continuity is
found in many books (including mine), but its proof is almost
always left as an exercise:
Let $D\subseteq{\mathbb R}.$ A function $f:D\rightarrow{\mathbb R}$ is uniformly
continuous $\Leftrightarrow$
$$
\forall (x_n), (y_n)\subseteq D ( x_n-y_n\rightarrow0 \Rightarrow f(x_n)-f(y_n)\rightarrow0).
$$
Here is the proof:
$\Rightarrow$ part: Let $f:D\rightarrow {\mathbb R}$ be uniformly
continuous.
Take any two sequences $(x_n),$ $(y_n)\subseteq D$ such
that $x_n-y_n\rightarrow 0.$ We shall show that $f(x_n)-f(y_n)\rightarrow0,$ i.e.,
$$
\forall \epsilon>0~~\exists N\in{\mathbb N}~~\forall n\geq N~~
|f(x_n)-f(y_n)| < \epsilon.
$$
Take any $\epsilon>0.$
Since $f$ is uniformly continuous,
$$\exists \delta>0~~\forall x,y\in D~~ (|x-y|< \delta\Rightarrow
|f(x)-f(y)|<\epsilon).$$
Again, since $x_n-y_n\rightarrow0,$
$$
\exists N\in{\mathbb N}~~\forall n\geq N~~|x_n-y_n|<\delta.
$$
Choose this $N.$
Take any $n\geq N.$
Then $|x_n-y_n|<\delta,$ and
so $|f(x_n)-f(y_n)|<\epsilon,$ as required.
$\Leftarrow$ part: We need to show that $f$ is uniformly
continuous, i.e.,
$$\forall \epsilon>0~~\exists \delta>0~~\forall x,y\in D~~ (|x-y|< \delta\Rightarrow
|f(x)-f(y)|<\epsilon).$$
Let, if possible, this be false. Then its negation must be true:
$$\exists \epsilon>0~~\forall \delta>0~~\exists x,y\in D~~\big(|x-y|<
\delta \mbox{ and }
|f(x)-f(y)|\geq \epsilon\big).$$
Keep this $\epsilon $ fixed, and put $\delta = \frac1n$
for different $n\in{\mathbb N}.$ For each $n$ you'll
get $x$ and $y.$ Call them $x_n$ and $y_n.$
Then you get two sequences $(x_n),$ $(y_n)$ such
that $|x_n-y_n|<\frac1n$ and $|f(x_n)-f(y_n)|\geq
\epsilon>0.$
Thus, $x_n-y_n\rightarrow0$ but $|f(x_n)-f(y_n)|\not\rightarrow0,$
contradicting the given condition.
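As an illustration of how the criterion is used (the example is mine, chosen only as a sketch): take $D={\mathbb R}$ and $f(x)=x^2.$ Let $x_n = n+\frac1n$ and $y_n = n.$ Then
$$
x_n-y_n = \frac1n\rightarrow0,
\quad\mbox{but}\quad
f(x_n)-f(y_n) = \left(n+\frac1n\right)^2-n^2 = 2+\frac1{n^2}\rightarrow 2\neq0.
$$
So, by the criterion, $f(x)=x^2$ is not uniformly continuous on ${\mathbb R}.$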
Let's start with probit (logistic is similar).
All statistical tools are born out of the need to cope
with some kind of data analysis requirement. Let's learn the
requirement behind probit.
If you give a large enough dose of poison to a mouse, it will
die. The minimum lethal dose measures the power of the poison. A
little potassium cyanide will do what will take a lot of arsenic
to achieve.
A statistician called Finney was trying to assess this minimum
lethal dose for any given poison. His life was a bit complicated
by the fact that not all mice are born equal (or at least
they do not die equal). So the minimum lethal doses may vary a bit
among mice. This is not a major problem: just take the
average. Also the amount of variability might be taken as a
measure of how reliable the poison is.
4 categories with 5 observations in each. Let's run ANOVA:
summary(lm(y~x))
Output:
Call:
lm(formula = y ~ x)
Residuals:
Min 1Q Median 3Q Max
-1.1994 -0.6186 0.0919 0.4028 1.3872
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.2063 0.3390 0.608 0.551
x2 -0.4223 0.4794 -0.881 0.391
x3 -0.7095 0.4794 -1.480 0.158
x4 0.1100 0.4794 0.230 0.821
Residual standard error: 0.758 on 16 degrees of freedom
Multiple R-squared: 0.1906, Adjusted R-squared: 0.03878
F-statistic: 1.256 on 3 and 16 DF, p-value: 0.3229
Notice that there is no row for $x_1$ (i.e., category
1). This is because R has taken that as the baseline. Thus, for example,
the reported $x_2$ is actually the contrast $x_2-x_1.$
Now suppose that you are not happy with the default
contrasts. You have your own contrast in mind,
say $x_1-2x_2+x_4.$ You need to specify this to R as
follows. First construct a matrix with the coefficients in a
single column. (If you have multiple contrasts in mind, then
you'll need one column for each.)
my.con = matrix(c(1,-2,0,1))
Give the column a name (this is important):
colnames(my.con) = "aha"
Associate this contrast with the categorical variable $x:$
contrasts(x) = my.con
Now perform ANOVA as before:
summary(lm(y~x))
Call:
lm(formula = y ~ x)
Residuals:
Min 1Q Median 3Q Max
-1.1994 -0.6186 0.0919 0.4028 1.3872
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.04918 0.16949 -0.290 0.775
xaha 0.15911 0.13839 1.150 0.267
x -0.49444 0.33898 -1.459 0.164
x 0.19087 0.33898 0.563 0.581
Residual standard error: 0.758 on 16 degrees of freedom
Multiple R-squared: 0.1906, Adjusted R-squared: 0.03878
F-statistic: 1.256 on 3 and 16 DF, p-value: 0.3229
See the row labelled xaha? That is the row for our contrast.
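The whole workflow can be sketched end to end as follows. The data behind the output above are not available, so the response here is simulated; the seed, the use of `gl` and `rnorm`, and the names `fit0` and `fit1` are assumptions made for this sketch, and the numbers it prints will not match the output shown above.

```r
set.seed(1)                        # arbitrary seed (an assumption)
x <- gl(4, 5)                      # factor with 4 categories, 5 observations each
y <- rnorm(20)                     # simulated response, not the original data

fit0 <- lm(y ~ x)                  # default contrasts: category 1 is the baseline

my.con <- matrix(c(1, -2, 0, 1))   # coefficients of x1 - 2*x2 + x4, one column
colnames(my.con) <- "aha"          # the name shows up in the coefficient table
contrasts(x) <- my.con             # R fills in the remaining columns itself
fit1 <- lm(y ~ x)

rownames(coef(summary(fit1)))      # contains a row named "xaha"
all.equal(fitted(fit0), fitted(fit1))   # both fits describe the same model
```

Changing the contrasts only reparametrises the model. That is why the residuals, the residual standard error and the F-statistic are identical in the two outputs above, while the rows of the coefficient table differ.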