Multivariate differentiation using linear transformation (Fri Aug 30 2019)
A "simple" probability problem (Fri May 03 2019)
Completeness and sufficiency (Fri Apr 26 2019)
Proof of convergence of LR (Mon Jan 28 2019)
A short intro to tensor algebra (Sat Oct 13 2018)
How to use Bengali in LaTeX
Motivation behind completeness in statistics (Thu Oct 12 2017)
Sequential criterion of uniform continuity (Mon Oct 09 2017)
Probit and logistic regression (Thu Sep 14 2017)
Logistic regression in R (Tue Sep 12 2017)
Getting started with R (Wed Aug 16 2017)
Contrasts in R (Wed Aug 16 2017)

Multivariate differentiation using linear transformation (Fri Aug 30 2019)

Here are some Bengali notes on the topic.

A "simple" probability problem (Fri May 03 2019)

The problem:

You are given $n\in{\mathbb N}$ and $k\in\{0,\ldots,n\}.$ A coin with $P(H)=p$ is tossed $n$ times independently. Let $f(p)$ denote the probability of getting at least $k$ heads in a stretch (i.e., $k$ or more consecutive heads). Show that $f(p)$ must be a nondecreasing function of $p.$

The result is very intuitive. But a formal proof is not. Here is a solution that I got from Prof. B V Rao. Please try to solve the problem yourself first. That way you will relish Prof Rao's solution all the more.

It is enough to show that if $p_1 < p_2$ then $f(p_1)\leq f(p_2).$
Let's take $p_1 < p_2.$ Consider a different random experiment with two coins. The first coin has $P(H)=p_1$ and the second coin has $P(H)=x$ (to be determined soon). Toss them in parallel (independently) $n$ times. Create two sequences of $H$'s and $T$'s out of the outcomes. The first sequence consists of the outcomes of the first coin. The second sequence has an $H$ if at least one of the two coins shows a head.
Here are two events:

$A = $ the event that there are at least $k$ consecutive $H$'s in the first sequence.

$B = $ the event that there are at least $k$ consecutive $H$'s in the second sequence.

Clearly, $A\subseteq B.$ So $P(A)\leq P(B).$
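Now $P(A) = f(p_1)$ and $P(B) = f(p_1+x-p_1x),$ because each entry of the second sequence is an $H$ with probability $1-(1-p_1)(1-x) = p_1+x-p_1x,$ independently across tosses. Just choose $x$ so that $p_1+x-p_1x = p_2$ to complete the proof, namely $x = \frac{p_2-p_1}{1-p_1}.$ Such an $x$ is always possible because $p_1 < p_2\leq 1$ gives $0 < x\leq 1.$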
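
If you would like to check the claim numerically before trusting the proof, here is a minimal sketch in R. The helper `run_prob` is my own hypothetical name (it is not part of Prof. Rao's argument); it computes $f(p)$ exactly by tracking the length of the current run of heads. The values $n=10,$ $k=3,$ $p_1=0.3$ and $p_2=0.6$ are arbitrary choices for illustration.

```r
# Exact probability of at least k consecutive heads in n independent tosses
# with P(H) = p. Hypothetical helper, written only for this illustration.
run_prob <- function(n, k, p) {
  if (k == 0) return(1)            # a run of length 0 is always present
  # state[j + 1] = P(no run of k heads so far, current run of heads has length j)
  state <- c(1, numeric(k - 1))
  hit <- 0                         # P(a run of k heads has already occurred)
  for (i in seq_len(n)) {
    hit <- hit + p * state[k]      # a head on top of k - 1 heads completes the run
    new_state <- numeric(k)
    new_state[1] <- (1 - p) * sum(state)               # a tail resets the run
    if (k > 1) new_state[2:k] <- p * state[1:(k - 1)]  # a head extends the run
    state <- new_state
  }
  hit
}

# f should be nondecreasing along a grid of p values
p_grid <- seq(0, 1, by = 0.05)
f_vals <- sapply(p_grid, function(p) run_prob(10, 3, p))
all(diff(f_vals) >= -1e-12)

# The coupling: with this x the second sequence behaves like a coin with P(H) = p2
p1 <- 0.3; p2 <- 0.6
x <- (p2 - p1) / (1 - p1)
p1 + x - p1 * x
```

A grid check like this proves nothing, of course; it only makes the statement plausible for one choice of $n$ and $k.$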

Completeness and sufficiency (Fri Apr 26 2019)

Complete sufficient statistics are useful things. These two concepts are traditionally defined in quite different terms, and it seems a bit strange that they play so well together. Here is a somewhat less traditional view that brings out a similarity between the two concepts.

If a statistic is sufficient then so is any one-one function of it. Similarly for a complete statistic. This shows that sufficiency and completeness are actually properties of partitions of the sample space. We shall illustrate in the case where we have just one parameter. Let $P$ be a partition of the sample space. It is called sufficient if, given the block of the partition where your sample is, you know the exact conditional distribution of the sample without needing to know the parameter. Clearly, if I give you any finer partition, then also the property holds. So sufficiency is a property that is preserved under refinement of a partition.
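
Here is a minimal sketch in R of what sufficiency of a partition looks like. The setting is my own choice for illustration: three Bernoulli($p$) trials, with the sample space partitioned by the total number of successes. The function name `cond_dist` is hypothetical. If the partition is sufficient, the conditional distribution inside each block should come out the same whatever $p$ we plug in.

```r
# All 2^3 outcomes of three Bernoulli trials (illustrative setting)
outcomes <- as.matrix(expand.grid(0:1, 0:1, 0:1))
total <- rowSums(outcomes)         # block label: the number of successes

# Conditional probability of each outcome given the block it lies in
cond_dist <- function(p) {
  joint <- p^total * (1 - p)^(3 - total)    # probability of each outcome
  joint / ave(joint, total, FUN = sum)      # divide by the probability of its block
}

cond_dist(0.2)
cond_dist(0.7)   # compare with the line above
```

Inside each block the outcomes turn out to be equally likely, so knowing the block tells you the distribution of the sample without any reference to $p.$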

Now for completeness. Again we start with a partition, $P.$ I give you two numbers for each block of this partition, $a_k$ and $b_k$ for the $k$-th block. Average all the $a_k$'s (weighted average, with weights proportional to the probabilities of the blocks). You'll get a function of the parameter. Do the same for the $b_k$'s. You get another function. If these two functions are the same, then can you say that the $a_k$'s are the same as the $b_k$'s? Not necessarily. However, if the parameter space is large enough (or the number of blocks is small enough), then it might be possible. In this case we say that the partition is complete. Clearly, a coarser partition is also complete.

Thus, a partition being sufficient means it is fine enough, and a partition being complete means it is coarse enough. A complete, sufficient partition combines both the aspects. This is much like a basis being a linearly independent spanning set. If a set is linearly independent, then so is any of its nonempty subsets. If a set is spanning, then so is any superset. A basis strikes a balance.

Proof of convergence of LR (Mon Jan 28 2019)

See the original paper by Wilks.

A short intro to tensor algebra (Sat Oct 13 2018)

See here.

How to use Bengali in LaTeX

You'll need to use xelatex (which is part of any LaTeX installation like MiKTeX). xelatex works with a package called fontspec, which is also part of any standard LaTeX distribution. You'll also need an OpenType Unicode Bengali font like SolaimanLipi or Ekushe. Some sample files (fonts, latex and resulting pdf) are here.

Motivation behind completeness in statistics (Thu Oct 12 2017)

Here are links to some discussion of this on the web: Stack Exchange (1), Stack Exchange (2) and Notes of Prof Cremling

Sequential criterion of uniform continuity (Mon Oct 09 2017)

The following sequential criterion for uniform continuity is found in many books (including mine), but its proof is almost always left as an exercise:

Let $D\subseteq{\mathbb R}.$ A function $f:D\rightarrow{\mathbb R}$ is uniformly continuous $\Leftrightarrow$ $$ \forall (x_n), (y_n)\subseteq D ( x_n-y_n\rightarrow0 \Rightarrow f(x_n)-f(y_n)\rightarrow0). $$

Here is the proof:

$\Rightarrow$ part: Let $f:D\rightarrow {\mathbb R}$ be uniformly continuous. Take any two sequences $(x_n),$ $(y_n)\subseteq D$ such that $x_n-y_n\rightarrow 0.$ We shall show $f(x_n)-f(y_n)\rightarrow0,$ i.e., $$ \forall \epsilon>0~~\exists N\in{\mathbb N}~~\forall n\geq N~~ |f(x_n)-f(y_n)| < \epsilon. $$ Take any $\epsilon>0.$

Since $f$ is uniformly continuous, $$\exists \delta>0~~\forall x,y\in D~~ (|x-y|< \delta\Rightarrow |f(x)-f(y)|<\epsilon).$$ Again, since $x_n-y_n\rightarrow0,$ $$ \exists N\in{\mathbb N}~~\forall n\geq N~~|x_n-y_n|<\delta. $$ Choose this $N.$

Take any $n\geq N.$

Then $|x_n-y_n|<\delta,$ and so $|f(x_n)-f(y_n)|<\epsilon,$ as required.

$\Leftarrow$ part: To show that $f$ is uniformly continuous, i.e., $$\forall \epsilon>0~~\exists \delta>0~~\forall x,y\in D~~ (|x-y|< \delta\Rightarrow |f(x)-f(y)|<\epsilon).$$ Let, if possible, this be false. Then its negation must be true: $$\exists \epsilon>0~~\forall \delta>0~~\exists x,y\in D ~~|x-y|< \delta \mbox{ and } |f(x)-f(y)|\geq \epsilon.$$ Keep this $\epsilon $ fixed, and put $\delta = \frac1n$ for different $n\in{\mathbb N}.$ For each $n$ you'll get $x$ and $y.$ Call them $x_n$ and $y_n.$

Then you get two sequences $(x_n),$ $(y_n)$ such that $|x_n-y_n|<\frac1n$ and $|f(x_n)-f(y_n)|\geq \epsilon>0.$

Thus, $x_n-y_n\rightarrow0$ but $|f(x_n)-f(y_n)|\not\rightarrow0,$ contradicting the given condition.
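
The criterion is handy for showing that a function is not uniformly continuous: you just need to exhibit one bad pair of sequences. Here is a small numerical sketch in R for $f(x)=x^2$ on ${\mathbb R},$ with $x_n = n+\frac1n$ and $y_n = n.$ The function and the sequences are my own choice of example, not part of the proof above.

```r
f <- function(x) x^2       # illustrative function, not uniformly continuous on R

n <- 10^(1:5)
x <- n + 1 / n             # x_n
y <- n                     # y_n

gap_in  <- x - y           # x_n - y_n = 1/n, which tends to 0
gap_out <- f(x) - f(y)     # f(x_n) - f(y_n) = 2 + 1/n^2, which does not tend to 0

cbind(n, gap_in, gap_out)
```

The second column shrinks to zero while the third stays near 2, so the sequential criterion fails for this $f.$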

Probit and logistic regression (Thu Sep 14 2017)

Let's start with probit (logistic is similar).

All statistical tools are born out of the need to cope with some kind of data analysis requirement. Let's learn the requirement behind probit.

If you give a large enough dose of poison to a mouse, it will die. The minimum lethal dose measures the power of the poison. A little potassium cyanide will do what will take a lot of arsenic to achieve.

A statistician called Finney was trying to assess this minimum lethal dose for any given poison. His life was a bit complicated by the fact that not all mice are born equal (or at least they do not die equal). So the minimum lethal doses may vary a bit among mice. This is not a major problem: just take the average. Also the amount of variability might be taken as a measure of how reliable the poison is.

Logistic regression in R (Tue Sep 12 2017)

Here is a good starting point: https://stats.idre.ucla.edu/r/dae/logit-regression/

Getting started with R (Wed Aug 16 2017)

You may try my tutorial.

Contrasts in R (Wed Aug 16 2017)

Let's create a data set:

y = rnorm(20)              # 20 standard normal responses
x = factor(rep(1:4,5))     # factor with levels 1 to 4, each appearing 5 times

4 categories with 5 observations in each. Let's run ANOVA:

summary(lm(y~x))

Output:

Call:
lm(formula = y ~ x)

Residuals:
    Min      1Q  Median      3Q     Max 
-1.1994 -0.6186  0.0919  0.4028  1.3872 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   0.2063     0.3390   0.608    0.551
x2           -0.4223     0.4794  -0.881    0.391
x3           -0.7095     0.4794  -1.480    0.158
x4            0.1100     0.4794   0.230    0.821

Residual standard error: 0.758 on 16 degrees of freedom
Multiple R-squared:  0.1906,	Adjusted R-squared:  0.03878 
F-statistic: 1.256 on 3 and 16 DF,  p-value: 0.3229

Notice that there is no row for $x_1$ (i.e., category 1). This is because R has taken that as the baseline. Thus, for example, the reported $x_2$ is actually the contrast $x_2-x_1.$
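
If you want to convince yourself of this, here is a minimal sketch that compares the default coefficients with the category means. The seed and the name `group_means` are my own additions, so the numbers will not match the output shown above.

```r
set.seed(1)                        # arbitrary seed, only for reproducibility
y <- rnorm(20)
x <- factor(rep(1:4, 5))

fit <- lm(y ~ x)                   # default (treatment) contrasts
group_means <- tapply(y, x, mean)  # mean of y within each category

coef(fit)["(Intercept)"]           # should equal the mean of category 1
group_means["1"]

coef(fit)["x2"]                    # should equal mean of category 2 minus mean of category 1
group_means["2"] - group_means["1"]
```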

Now suppose that you are not happy with the default contrasts. You have your own contrast in mind, say $x_1-2x_2+x_4.$ You need to specify this to R as follows. First construct a matrix with the coefficients in a single column. (If you have multiple contrasts in mind, then you'll need one column for each.)

my.con = matrix(c(1,-2,0,1))

Give it a name (it is important):

colnames(my.con) = "aha"

Associate this contrast with the categorical variable $x$:

contrasts(x) = my.con

Now perform ANOVA as before:

summary(lm(y~x))

Output:

Call:
lm(formula = y ~ x)

Residuals:
    Min      1Q  Median      3Q     Max 
-1.1994 -0.6186  0.0919  0.4028  1.3872 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.04918    0.16949  -0.290    0.775
xaha         0.15911    0.13839   1.150    0.267
x           -0.49444    0.33898  -1.459    0.164
x            0.19087    0.33898   0.563    0.581

Residual standard error: 0.758 on 16 degrees of freedom
Multiple R-squared:  0.1906,	Adjusted R-squared:  0.03878 
F-statistic: 1.256 on 3 and 16 DF,  p-value: 0.3229

See the row labelled xaha in the coefficient table?
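
You may wonder why the last two coefficient rows in the output are labelled just x. A quick way to investigate is to look at the contrast matrix that R stores for the factor after the assignment. The sketch below repeats the steps above and then prints that matrix. As far as I understand, R fills in the remaining columns on its own to get a full set of three and leaves them without names.

```r
x <- factor(rep(1:4, 5))

my.con <- matrix(c(1, -2, 0, 1))   # coefficients of the contrast, one column
colnames(my.con) <- "aha"
contrasts(x) <- my.con

# A 4 x 3 matrix: the first column is ours, the other two are supplied by R
contrasts(x)
colnames(contrasts(x))
```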
