Projects
Midsem project: Making a vector map stitching satellite screenshots
Some R tips
Test data sets
Sem project: Shape analysis

Projects

This course has two projects, each carrying 10 marks (plus some bonus). The first project is considered part of the midsem exam, the second project is considered part of the sem exam. The bonus marks get added to the overall aggregate for this course.

Midsem project: Making a vector map stitching satellite screenshots

Consider a region small enough that the Earth's curvature may be ignored, yet large enough that Google Maps cannot fit it in a single screen at high resolution.

We cover the region with a number of overlapping screenshots all of the same resolution.

Each rectangle is a screenshot (hence is of the size of your monitor). All the screenshots are numbered. To understand the subsequent steps we focus on two overlapping screenshots (say screenshots 1 and 2):

The red dots are known locations that you can identify at the current resolution of the screenshot. Note that the central location is part of both screenshots. For each screenshot, find the approximate pixel coordinates of each location (e.g., by clicking on the centre of each red disk and reading the mouse coordinates). For example, we measure $(r_{1j}, c_{1j})$ from screenshot 1:

Here $r$ stands for "row" and $c$ for "column". The first subscript is the screenshot number, the second is the location number (we assume that the central red dot is the $j$-th in the list of known locations). Similarly, from screenshot 2 we measure $(r_{2j}, c_{2j})$:

Notice that we are using the same $j,$ as it is the same location.

Thus our data set consists of a subset of $(r_{ij}, c_{ij})$'s for $i=1,...,$number of screenshots, and $j=1,...,$number of locations. Of course, not all $(i,j)$ pairs occur in the data, since the $j$-th location may not show up in the $i$-th screenshot.

Let $(\mu_j, \lambda_j)$ be the true position of the $j$-th location (w.r.t. some global coordinate system).

We can set up a linear model to estimate the $(\mu_j, \lambda_j)$'s from the data.
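To make the idea concrete, here is a minimal sketch of one possible formulation for the row coordinates. It is an assumption, not the worked-out theory the project asks for. Writing $\mu_j$ for the global row coordinate of location $j$, it assumes that screenshot $i$ has an unknown offset $a_i$ in the global coordinate system, so that $r_{ij} = \mu_j - a_i + \text{error}$. The symbol $a_i$, the toy numbers, and all variable names are made up for illustration.

```r
# Toy data (made up): which location was clicked in which screenshot,
# and the measured row coordinate r.
obs <- data.frame(
  shot = c(1, 1, 1, 2, 2, 2, 3, 3),
  loc  = c(1, 2, 3, 2, 3, 4, 3, 4),
  r    = c(10, 50, 85, 5, 40, 88, 12, 60)
)
n_shot <- max(obs$shot)
n_loc  <- max(obs$loc)

# Assumed model: r_ij = mu_j - a_i + error.
# Columns 1..n_shot are the offsets a_i, the rest are the positions mu_j.
X <- matrix(0, nrow(obs), n_shot + n_loc)
rows <- seq_len(nrow(obs))
X[cbind(rows, obs$shot)] <- -1
X[cbind(rows, n_shot + obs$loc)] <- 1

# The design matrix is rank deficient: adding a constant to every a_i
# and every mu_j leaves all the r_ij unchanged.
rank_full <- qr(X)$rank

# One way out (an assumption): fix a_1 = 0, i.e. use screenshot 1 as the
# origin of the global coordinate system, and drop its column.
X0 <- X[, -1]
rank_fixed <- qr(X0)$rank

fit <- lm.fit(X0, obs$r)
mu_hat <- fit$coefficients[(n_shot - 1) + seq_len(n_loc)]
```

The column coordinates $c_{ij}$ can be treated in the same way with $\lambda_j$ in place of $\mu_j$. Comparing `rank_full` with `ncol(X)` is a quick numerical check on whatever rank you derive on paper.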

The project consists of the following parts:

Working out the theory: this involves setting up the linear model, and working out the rank of the design matrix. [2 marks]
Implementing the entire thing in R: the final software should take a list of screenshots and show them one by one in R, allowing the user to click on the known locations. The system should save the click locations as well as the location identifiers. Then the system should run the linear model to estimate the true positions. [8 marks]
Bonus: allow screenshots of different (known) resolutions. [5 marks]

Some R tips

You may read a jpeg image in R and display it as:

library(jpeg)
x=readJPEG('screenshot1.jpg') #replace screenshot1.jpg with your image name.

plot(1:2, type='n') #set up the screen as [1,2]x[1,2] or whatever you like.

rasterImage(as.raster(x),1,1,2,2) #draw the image on screen.

You may need to install the jpeg package first (e.g., with install.packages('jpeg')).

Similarly, there is a package called png for reading images in the png format.

To allow the user to click at points on an image and to get the coordinates of the clicked points:

p = locator(1) #for one click
p = locator(2) #for two clicks
p = locator() #for any number of clicks (end with a right-click)

In each of these cases p stores the coordinates of the clicked points (w.r.t. the coordinate system you set up using the plot command earlier).
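Since locator() reports plot coordinates rather than pixels, you will probably want to convert them. Here is a minimal sketch, assuming the image was drawn with rasterImage on the rectangle [1,2]x[1,2] as above, and that rows are counted from the top of the image. The function names click_to_pixel and record_clicks, and the click_fun argument, are hypothetical helpers introduced only for this illustration.

```r
# Convert locator() output (a list with components x and y) to pixel
# (row, column) coordinates. Assumes the image fills the rectangle
# [xleft, xright] x [ybottom, ytop] given to rasterImage().
click_to_pixel <- function(p, n_row, n_col,
                           xleft = 1, ybottom = 1, xright = 2, ytop = 2) {
  col <- (p$x - xleft) / (xright - xleft) * n_col
  row <- (ytop - p$y) / (ytop - ybottom) * n_row  # rows counted from the top
  data.frame(r = row, c = col)
}

# Record one click per location identifier for a given screenshot.
# click_fun defaults to locator; passing another function makes it
# possible to try the code without clicking.
record_clicks <- function(shot, loc_ids, n_row, n_col, click_fun = locator) {
  p <- click_fun(length(loc_ids))
  data.frame(shot = shot, loc = loc_ids, click_to_pixel(p, n_row, n_col))
}
```

For an image x read by readJPEG, n_row and n_col would be dim(x)[1] and dim(x)[2]. After drawing screenshot 1, a call like record_clicks(1, c(3, 5), dim(x)[1], dim(x)[2]) would wait for two clicks, taken to be locations 3 and 5 in that order.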

Test data sets

I have created a fake map and some screenshots covering it:

Here are the screenshots for the same-resolution case, and here for the different-resolutions case. Each screenshot contains a horizontal line segment whose true length is the same in every screenshot, so you may use it as a scale if you want. If you also want to play with rotated images, try this bunch.

In your final output the triangles must be equilateral. Also, try using only s1, s2, s3 and s8 to see if your program correctly generates an error message.
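The page does not say why that subset of screenshots should trigger an error. One plausible reason, assumed here, is that the chosen screenshots do not link up through shared locations, so the relative positions of the pieces cannot be estimated. A minimal sketch of such a check follows, using the same hypothetical offset model $r_{ij} = \mu_j - a_i + \text{error}$ and a hypothetical function name check_identifiable.

```r
# obs: data frame with columns shot (screenshot number) and
# loc (location number), one row per recorded click.
check_identifiable <- function(obs) {
  shots <- sort(unique(obs$shot))
  locs  <- sort(unique(obs$loc))
  X <- matrix(0, nrow(obs), length(shots) + length(locs))
  rows <- seq_len(nrow(obs))
  X[cbind(rows, match(obs$shot, shots))] <- -1
  X[cbind(rows, length(shots) + match(obs$loc, locs))] <- 1

  # A deficiency of 1 is expected (the global origin is arbitrary).
  # Anything more means the screenshots fall into separate groups.
  deficiency <- ncol(X) - qr(X)$rank
  if (deficiency > 1) {
    stop("the screenshots do not link up through shared locations")
  }
  invisible(TRUE)
}
```

Running this check before fitting lets the program stop with a readable message instead of silently returning arbitrary estimates.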

Sem project: Shape analysis

This project is about shape recognition. After we see many mango leaves, we get an idea about how a typical mango leaf looks. Given a new leaf, we may say with confidence whether it is a mango leaf or not. Roughly speaking, we can consider each leaf as a point in a "leaf space". The mango leaves that we were shown all reside in one region in that space. We have to measure the distance of the new leaf (which is again a point in the "leaf space") from this region, and say "yes, it is a mango leaf" if this distance is below a certain threshold.

This sounds nice, in principle. But how to define the "leaf space"? How to define the "region of mango leaves"? How to measure distances in this space? How do we accommodate the random variations among leaves?

It turns out that linear mixed effects models have an answer to these. This project will explore that. It is based on the book Mixed Models by Eugene Demidenko.

This project will involve some amount of reading from this book. So let's start by making a list of the reading material:

Section 3.8.1: Membership test: Is a new point like some other given points? In other words, if the given points form a club, then is the new point a member of the club? Read the section to learn how linear models may be used to answer such questions.
Sections 11.1, 11.2: A brief introduction to statistical shape analysis.
Section 11.7.1: Analysis of a star shape: This is a long and somewhat complicated looking section. I suggest that you read the introductory part, and then skip the "Semiparametric model", and jump to the "Example: Leaf analysis" on page 598. Our aim is to reproduce this example (or argue that there is some flaw in it) with some data set of our own.

This project is extremely open-ended. You are encouraged to explore the idea on your own, and possibly adapt it for other shapes of your choice (geometric, letters, noses, whatever...you get the idea).

If you want, you may take a look at the data sets and R code used in chapter 11 of the book. You are welcome to use these in your project. However, in view of the short amount of time available, I think it advisable to stick to simpler shapes like triangles or quadrilaterals. The aim of the project is to come up with a working example to demonstrate the use of linear mixed models for shape analysis.
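To get a feel for the "space, region, distance, threshold" idea with triangles, here is a naive baseline sketch. It is not the mixed-model method from the book: it simply describes each triangle by two perimeter-normalised side lengths and uses the Mahalanobis distance with a chi-squared cutoff. The simulated data, the noise level, the 95% level and the function names are all assumptions made for illustration.

```r
set.seed(1)

# Shape descriptor for a triangle given as a 3x2 matrix of labelled
# vertices: two side lengths divided by the perimeter. This does not
# change under shifts, rotations or rescaling of the triangle.
tri_shape <- function(P) {
  s <- c(sqrt(sum((P[2, ] - P[3, ])^2)),
         sqrt(sum((P[1, ] - P[3, ])^2)),
         sqrt(sum((P[1, ] - P[2, ])^2)))
  (s / sum(s))[1:2]
}

equilateral <- rbind(c(0, 0), c(1, 0), c(0.5, sqrt(3) / 2))

# The "club": 50 noisy copies of an equilateral triangle (simulated).
club <- t(replicate(50, tri_shape(equilateral + rnorm(6, sd = 0.05))))

centre <- colMeans(club)
S <- cov(club)

# Membership test: squared Mahalanobis distance from the club centre,
# compared with a chi-squared quantile (2 degrees of freedom).
is_member <- function(P, level = 0.95) {
  d2 <- mahalanobis(tri_shape(P), centre, S)
  d2 < qchisq(level, df = 2)
}

is_member(equilateral)                           # a typical club member
is_member(rbind(c(0, 0), c(4, 0), c(0, 1)))      # a long thin triangle
```

A baseline like this treats all variation as one lump of noise. The point of the project is to see what the linear mixed model approach of the book adds on top of it.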
