CSCE 235

Handout 1: Mathematics and CS Applications

January 12, 2004 

 

1. Image Processing

 

One of the key tasks in Image Processing is edge detection.  That is, given an image of objects, find the edges of the objects.  The usual approach is to look at the intensities of the objects and the background.  For example, if the background is dark, and the objects are bright, then we identify an edge as a transition from dark to bright, or from bright to dark.  Now, when we deal with an image, we have some complications.  First, images are noisy.  The edge is sometimes not clear-cut.  Second, digital images are discrete.  A pixel on an image has a discrete value, and its neighboring pixels also have discrete values.  So, how do we define and subsequently find an edge?

 

There are many edge detection techniques, ranging from simple to highly complex.  Here we look at only one: the Laplacian (Marr and Hildreth 1980).

 

The Laplacian of a 2-D function $f(x, y)$ is a 2nd-order derivative defined as

$$\nabla^2 f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2}.$$

 

Now, how are we going to implement this to deal with discrete pixels?

 

One way is to do this:  Assume a mask of 3 x 3 pixels:

     0   -1    0
    -1    4   -1
     0   -1    0

 

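Given a pixel at $(x, y)$, surrounded by the following neighboring pixels:

                   f(x, y-1)
    f(x-1, y)      f(x, y)       f(x+1, y)
                   f(x, y+1)

Then we have the following approximation:

$$\nabla^2 f \approx 4 f(x, y) - [f(x-1, y) + f(x+1, y) + f(x, y-1) + f(x, y+1)].$$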

 

The basic requirement for the approximation is that the coefficient associated with the center pixel be positive and the coefficients associated with the outer pixels be negative.  Because the Laplacian is a derivative, the sum of the coefficients has to be zero.  Hence, the response is zero whenever the point in question and its neighbors have the same value.
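The mask and its zero-sum property can be checked with a minimal Python sketch.  The names `MASK` and `laplacian_at` are made up for this illustration, and the image is assumed to be a list of rows of integer intensities; the pixel must not lie on the image border.

```python
# The 3 x 3 Laplacian mask from the text, stored row by row.
MASK = [
    [0, -1, 0],
    [-1, 4, -1],
    [0, -1, 0],
]


def laplacian_at(image, row, col):
    """Apply MASK centered on image[row][col] (not a border pixel)."""
    total = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            total += MASK[dr + 1][dc + 1] * image[row + dr][col + dc]
    return total


# A neighborhood where every pixel has the same value gives a zero response.
flat = [
    [7, 7, 7],
    [7, 7, 7],
    [7, 7, 7],
]
print(laplacian_at(flat, 1, 1))  # 0
```

Because the coefficients sum to zero, any constant region produces 0 regardless of how bright it is.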

 

How do we use this for edge detection?

 

Suppose we have a pixel with its neighboring pixels like this (intensities):

    5    5    0
    5    5    0
    0    0    0

 

Then, the Laplacian value for the pixel in the middle is $\nabla^2 f = 4(5) - (5 + 5 + 0 + 0) = 10$.  Thus, we can say we have detected a high-Laplacian pixel, an edge pixel!

 

To find all edge pixels in an image, we simply move the Laplacian mask over all pixels in the image, and select high-Laplacian pixels as edge pixels.
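The scan described above might be sketched in Python as follows.  This is only an illustration under stated assumptions: the function names and the threshold value are hypothetical, border pixels are skipped because they lack a full set of neighbors, and "high-Laplacian" is taken to mean "at least the threshold".

```python
def laplacian_at(image, row, col):
    """4 * center minus the four horizontal and vertical neighbors."""
    return (4 * image[row][col]
            - image[row - 1][col] - image[row + 1][col]
            - image[row][col - 1] - image[row][col + 1])


def find_edge_pixels(image, threshold):
    """Return (row, col) of interior pixels whose Laplacian is >= threshold."""
    edges = []
    for row in range(1, len(image) - 1):          # skip top and bottom rows
        for col in range(1, len(image[0]) - 1):   # skip left and right columns
            if laplacian_at(image, row, col) >= threshold:
                edges.append((row, col))
    return edges


# A bright 3 x 3 object (intensity 5) in the top-left corner of a dark image.
image = [
    [5, 5, 5, 0, 0],
    [5, 5, 5, 0, 0],
    [5, 5, 5, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
print(find_edge_pixels(image, threshold=5))  # [(1, 2), (2, 1), (2, 2)]
```

The pixel at (1, 1) lies inside the bright object and gets a response of 0, while the bright pixels along the object's right and bottom sides are reported.  In a noisy image the threshold would have to be chosen with care.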

 

2. Information Retrieval

 

In information retrieval (IR), we have a group of algorithms for relevance feedback.  That is, the IR system retrieves a set of documents based on a set of user-entered search keywords, and then refines that set of documents repeatedly.  The refinement may increase the number of documents retrieved, and the retrieval completes once a certain number of documents has been retrieved.  The idea here is to retrieve relevant documents and to avoid retrieving nonrelevant ones.  Retrieval uses the keywords registered in the database of documents, and these keywords are usually arranged into a weight vector.  Suppose I have a list of keywords for my database:  computer, nebraska, CS, college, and Lincoln.  Then, the user enters two keywords:  CS and college.  So the query consists of “CS” and “college”.  The weight vector of that query is simply <0 0 1 1 0>, with one entry per database keyword, in list order.
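As a minimal sketch, the query vector in this example could be built as follows in Python.  The function name `query_vector` is hypothetical, and matching keywords without regard to case is an assumption of this sketch.

```python
# The database keyword list from the example, in the order given in the text.
KEYWORDS = ["computer", "nebraska", "CS", "college", "Lincoln"]


def query_vector(query_terms, keywords=KEYWORDS):
    """Return 1 for each database keyword that appears in the query, else 0."""
    wanted = {term.lower() for term in query_terms}
    return [1 if keyword.lower() in wanted else 0 for keyword in keywords]


print(query_vector(["CS", "college"]))  # [0, 0, 1, 1, 0]
```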

 

Similarly, each document has one such weight vector, where the weights are the frequencies of occurrence of the keywords.  So, for document 1, we may have <0.2 0 0.3 0 0.5>.
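One way such a document vector might be computed is sketched below in Python.  The handout does not say how the frequencies are normalized; dividing by the total number of keyword occurrences (so that the weights sum to 1, as in the example vector) is an assumption of this sketch, as are the function name and the sample sentence.

```python
KEYWORDS = ["computer", "nebraska", "CS", "college", "Lincoln"]


def document_vector(text, keywords):
    """Relative frequency of each keyword among all keyword occurrences."""
    # Split on whitespace, strip surrounding punctuation, ignore case.
    words = [w.strip(".,;:!?()\"'").lower() for w in text.split()]
    counts = [words.count(k.lower()) for k in keywords]
    total = sum(counts)
    if total == 0:
        return [0.0] * len(keywords)
    return [c / total for c in counts]


text = ("Lincoln is home to a college; the CS department in Lincoln "
        "teaches computer science.")
print(document_vector(text, KEYWORDS))
```

Here "Lincoln" occurs twice and "computer", "CS", and "college" once each, so "Lincoln" receives the largest weight and "nebraska" receives zero.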

 

Rocchio (1971) published an algorithm for refining the weight vector of the query based on the weight vectors of the retrieved documents:

$$Q^* = Q_0 + \frac{1}{n_1} \sum_{i=1}^{n_1} R_i - \frac{1}{n_2} \sum_{i=1}^{n_2} S_i$$

where

$Q_0$ = the weight vector for the initial query

$Q^*$ = the refined weight vector for the query

$R_i$ = the weight vector for relevant document i

$S_i$ = the weight vector for nonrelevant document i

$n_1$ = the number of relevant documents

$n_2$ = the number of nonrelevant documents

 

What does this mean?  It means that we refine the query by adding weight to the keywords found in the relevant documents and subtracting weight from the keywords found in nonrelevant documents.  If a keyword is found equally often in both sets of documents, then the two contributions cancel and that keyword’s weight stays about where it was.  If a keyword is found only in relevant documents, then that keyword’s weight goes up.  If a keyword is found only in nonrelevant documents, then it goes down.  So hopefully, after several iterations, the weight vector of the query will adapt to the documents found in the database.
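One refinement step of this kind can be sketched in Python.  This is a minimal illustration, not Rocchio's published procedure in full: it assumes the update adds the average relevant vector and subtracts the average nonrelevant vector with no extra tuning constants, and the function name and sample vectors are made up for the example.

```python
def rocchio_update(query, relevant, nonrelevant):
    """Add the mean relevant vector to the query, subtract the mean nonrelevant one."""
    def mean(vectors):
        # An empty set of documents contributes nothing.
        if not vectors:
            return [0.0] * len(query)
        return [sum(column) / len(vectors) for column in zip(*vectors)]

    rel_mean = mean(relevant)
    non_mean = mean(nonrelevant)
    return [q + r - s for q, r, s in zip(query, rel_mean, non_mean)]


# Keyword order: computer, nebraska, CS, college, Lincoln.
query = [0, 0, 1, 1, 0]
relevant = [
    [0.0, 0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 0.5, 0.5, 0.0],
]
nonrelevant = [
    [0.5, 0.5, 0.0, 0.0, 0.0],
]
print(rocchio_update(query, relevant, nonrelevant))
# [-0.5, -0.5, 1.5, 1.25, 0.25]
```

The weights for "CS", "college", and "Lincoln" go up because those keywords occur only in the relevant documents, while "computer" and "nebraska" go down because they occur only in the nonrelevant one.  This sketch leaves negative weights as they are.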

 

3. References

 

Marr, D. and E. Hildreth (1980).  Theory of Edge Detection, Proc. Royal Society of London, B207:187-217.

Rocchio, J. J. (1971).  Relevance Feedback in Information Retrieval, in Salton, G. (Ed.), The SMART Retrieval System, Englewood Cliffs, NJ: Prentice Hall, 313-323.