EXAMPLE: Consider the 1-way ANOVA model: $y_{ij} = \mu + \alpha_i + \epsilon_{ij}.$
Suppose that I give you some least squares solution $\h \mu$ and $\h \alpha_i$'s. Now the intuitive thinking goes like this: Since $y_{ij}\approx \mu + \alpha_i,$ we may think of the $\mu + \alpha_i$'s as being "watched" by the $y_{ij}$'s. If any of the $\mu + \alpha_i$'s changes, an alarm bell rings. But it is quite possible to tweak $\mu $ and the $\alpha_i$'s so that the $\mu + \alpha_i$'s never change, and that would give us a new least squares solution. For instance, add 5 to $\mu,$ and adjust by subtracting 5 from all the $\alpha_i$'s. Since two different parameter vectors give the same fitted values, $X$ is not full column rank.
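The tweak can be checked numerically. Here is a minimal sketch in Python; the parameter values and the helper name `fitted` are made up for illustration and do not come from any data. It adds 5 to $\mu,$ subtracts 5 from every $\alpha_i,$ and compares the watched quantities $\mu+\alpha_i$ before and after.

```python
# Hypothetical parameter values, chosen only for illustration.
mu = 10.0
alpha = [1.0, -2.0, 4.0]

def fitted(mu, alpha):
    """Return the watched quantities mu + alpha_i, one per group."""
    return [mu + a for a in alpha]

# The tweak: add 5 to mu and subtract 5 from every alpha_i.
mu_new = mu + 5
alpha_new = [a - 5 for a in alpha]

before = fitted(mu, alpha)
after = fitted(mu_new, alpha_new)

print(before)
print(after)
print("alarm rings:", before != after)
```

The two printed lists agree, so the $y_{ij}$'s cannot tell the two parameter vectors apart.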
Here is another example.
EXAMPLE:
Again we consider a 1-way ANOVA model: $y_{ij} = \mu_i +
\epsilon_{ij}.$ Here the $\mu_i$'s are "watched". So we
can't do any tweaking without getting detected. Hence the design
matrix is full column rank here.

EXAMPLE: Again consider the model: $y_{ij} = \mu + \alpha_i + \epsilon_{ij}$ for $i=1,...,p,$ say. The range of $j$ does not really matter for finding $r(X)$. (Why?)
There are $p+1$ columns in $X.$ We have already seen that $X$ is not full column rank. Hence $r(X) < p+1.$ To guess $r(X)$ we shall again play the "tweak parameters without setting off the alarm" game. But this time we shall impose an extra constraint: pick any parameter (just any!), say $\mu,$ and never tweak it. Now you'll see that no tweaking is possible. Since you can tweak neither $\mu+\alpha_i$ nor $\mu,$ you cannot tweak $\alpha_i$ either. Thus, just one constraint is enough to prevent tweaking. The conclusion is: $r(X)$ is exactly one less than the number of columns.
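The guess can be checked on a small case. The sketch below is only an illustration: $p=4$ and 3 observations per group are arbitrary choices, and the helper names `matrix_rank`, `one_way_design` and `cell_means_design` are hypothetical. It builds the design matrix of the model $y_{ij}=\mu+\alpha_i+\epsilon_{ij}$ and, for comparison, that of the model $y_{ij}=\mu_i+\epsilon_{ij},$ and computes both ranks by exact Gaussian elimination.

```python
from fractions import Fraction

def matrix_rank(rows):
    """Rank of a matrix (list of rows) by Gaussian elimination in exact arithmetic."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Look for a row at or below position `rank` with a nonzero entry in this column.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Clear this column in every row below the pivot row.
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [x - factor * y for x, y in zip(m[r], m[rank])]
        rank += 1
    return rank

def one_way_design(p, n_per_group):
    """Design matrix for y_ij = mu + alpha_i + e_ij: a column of 1s, then p group indicators."""
    return [[1] + [int(k == i) for k in range(p)]
            for i in range(p) for _ in range(n_per_group)]

def cell_means_design(p, n_per_group):
    """Design matrix for y_ij = mu_i + e_ij: p group indicators only."""
    return [[int(k == i) for k in range(p)]
            for i in range(p) for _ in range(n_per_group)]

p = 4
X = one_way_design(p, n_per_group=3)
Z = cell_means_design(p, n_per_group=3)

print("mu + alpha_i model: columns =", len(X[0]), ", rank =", matrix_rank(X))
print("mu_i model:         columns =", len(Z[0]), ", rank =", matrix_rank(Z))
```

For this case the first model has 5 columns and rank 4, while the second has 4 columns and rank 4. Changing `n_per_group` does not change either rank.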
Here is a more complicated example.
EXAMPLE: The 2-way ANOVA model without interaction: $y_{ij} = \mu+\alpha_i+\beta_j+\epsilon_{ij}.$
Here the "watched" quantities are $\mu+\alpha_i+\beta_j.$ Clearly, we can add something to $\mu$ and adjust by subtracting that amount from all the $\alpha_i$'s (or all the $\beta_j$'s). So $X$ is not full column rank. To guess the exact rank, let's impose an additional constraint: "Thou shalt not tweak $\mu$." Still we can manage to tweak the $\alpha_i$'s and $\beta_j$'s without setting off the alarm bell. For instance, add 5 to all the $\alpha_i$'s and subtract the same amount from all the $\beta_j$'s. OK, pick any other parameter that is not already fixed by earlier constraints (say $\alpha_1$) and impose a new constraint: "Thou shalt not tweak $\alpha_1$ either." Now, $\mu $ and $\alpha_1$ both being fixed, and the $\mu+\alpha_1+\beta_j$'s being watched, we cannot tweak any of the $\beta_j$'s. With $\mu$ and all the $\beta_j$'s fixed, none of the other $\alpha_i$'s can be tweaked either. Hence no tweaking at all! And we needed just two constraints. Conclusion: $r(X)$ is two less than the number of columns.
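The counting of constraints can also be tried out numerically. In the sketch below, a constraint of the form "do not tweak this parameter" is encoded as an extra row with a single 1, stacked under $X$. This encoding is my choice for the illustration, as are the sizes (3 levels of $\alpha,$ 4 levels of $\beta,$ one observation per cell) and the hypothetical helper names. Once the stacked matrix reaches full column rank, no tweak is left.

```python
from fractions import Fraction

def matrix_rank(rows):
    """Rank of a matrix (list of rows) by Gaussian elimination in exact arithmetic."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Look for a row at or below position `rank` with a nonzero entry in this column.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Clear this column in every row below the pivot row.
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [x - factor * y for x, y in zip(m[r], m[rank])]
        rank += 1
    return rank

def two_way_design(a, b):
    """One row per cell (i, j): a 1 for mu, indicators for alpha_i, indicators for beta_j."""
    return [[1] + [int(k == i) for k in range(a)] + [int(k == j) for k in range(b)]
            for i in range(a) for j in range(b)]

a, b = 3, 4
X = two_way_design(a, b)
n_cols = 1 + a + b

# "Do not tweak parameter k" is encoded as a row with a 1 in column k.
fix_mu = [1] + [0] * (a + b)
fix_alpha1 = [0, 1] + [0] * (a + b - 1)

r0 = matrix_rank(X)                         # no constraint
r1 = matrix_rank(X + [fix_mu])              # mu fixed
r2 = matrix_rank(X + [fix_mu, fix_alpha1])  # mu and alpha_1 fixed

print("columns:", n_cols)
print("rank with 0, 1, 2 constraints:", r0, r1, r2)
```

Here there are 8 columns and the ranks come out as 6, 7 and 8: each constraint removes one direction of tweaking, and two constraints are needed before none remains.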
EXAMPLE: Consider the 1-way ANOVA model once again: $y_{ij} = \mu+\alpha_i+\epsilon_{ij}.$
Here is one possible scenario where it could be used. We have three different fertilisers: None, Compost and NPK. We want to see their effect on the yield of paddy. Here the constraint $\alpha_1 = 0$ is a suitable one, since None is like a reference case. With this constraint the remaining parameters have the following interpretation: $\mu$ is the mean yield with no fertiliser, while $\alpha_2$ and $\alpha_3$ are the changes in mean yield, relative to None, due to Compost and NPK, respectively.
Each such constraint is effectively choosing a basis
of $\col(X)$ leading to a unique least squares
solution. Each software package has its favourite constraint, which
may not be the natural one for a given context. But it is easy to
convert one least squares solution to another that satisfies a
natural set of constraints. The next example illustrates this.
EXAMPLE: Consider the 1-way ANOVA model $y_{ij} = \mu + \alpha_i + \epsilon_{ij}$ for $i=1,2,3$ and $j=1,...,10.$
R uses the constraint $\alpha_1 = 0.$ However, we want the constraint $\sum \alpha_i = 0.$ If the estimates produced by R are $$ \h \mu = 23.4, \quad \h \alpha_1 = 0,\quad \h \alpha_2 = 45.9,\quad \h \alpha_3 = -3.4, $$ then find the estimates that satisfy our constraint.
SOLUTION: Average the $\h \alpha_i$'s and subtract this average from each $\h \alpha_i.$ Adjust by adding the same quantity to $\h \mu,$ so that every $\h \mu + \h \alpha_i$ stays unchanged. Notice that you really do not need to know what constraint(s) R uses internally in order to impose your set of constraints.
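The arithmetic of this conversion can be carried out as follows. This is just a sketch of the averaging recipe applied to the numbers given in the example; the variable names are mine.

```python
# Estimates reported under the constraint alpha_1 = 0 (values from the example).
mu_hat = 23.4
alpha_hat = [0.0, 45.9, -3.4]

# Average of the alpha estimates.
shift = sum(alpha_hat) / len(alpha_hat)

# Subtract the average from every alpha_i and add it to mu,
# so that each mu + alpha_i is left unchanged.
alpha_new = [a - shift for a in alpha_hat]
mu_new = mu_hat + shift

print("mu:", round(mu_new, 4))
print("alpha:", [round(a, 4) for a in alpha_new])
print("sum of alpha:", round(sum(alpha_new), 10))
```

The average of the $\h \alpha_i$'s is $42.5/3 \approx 14.17,$ so the new estimates are approximately $\h \mu = 37.57,$ $\h \alpha_1 = -14.17,$ $\h \alpha_2 = 31.73$ and $\h \alpha_3 = -17.57,$ which add up to zero over the $\h \alpha_i$'s as required.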