CSCE 410/810
Homework
Assignment 1
September 2, 2003
Problem
Use your favorite search engines to download
electronic copies of journal/conference/book articles in the area of
information retrieval. That is, you are
required to not only retrieve a list of such articles but also the actual files
of the articles in PostScript (.ps) or PDF format.
Some useful search engines include www.profusion.com
(general), www.google.com (general
scientific), dblp.uni-trier.de (database), citeseer.nj.nec.com/cs (computer
science), and hpsearch.uni-trier.de (homepage search of researchers). You are free to use other search engines.
The articles of interest should be related to
information retrieval: indexing, retrieval, routing, query manipulation,
lexical processing, and other areas.
Some useful keywords include “information
retrieval,” “document indexing,” “retrieval evaluation,” “inverted files,” and
so on. Other useful keywords include
the terms and definitions that we have covered in the class.
The goal of this exercise is to familiarize yourself
with searching for general papers for downloads (for a background search
exercise) and searching for the complete reference of a specific paper (for a
target search exercise) on the Web and the related issues, preferences, and
problems.
This assignment counts for 5% of your grade.
Requirements
Exercise 1 Background Search
You are required to download 10 PostScript or PDF
journal/conference/book articles in the area of information retrieval. You are required to describe how you obtained
each article. Depending on your search
effort, you may come upon a useful repository of articles and be able to obtain
all 10 articles from that single site; sometimes you may have to search different
sites for the 10 articles. Document
your experience. For example: Which
search engine did you start with? Which keywords did you use? Which site did you start with? Which links did you use to go from the first
site to the second, the second to the third, and so on, to the final
destination where you obtained the desired articles? How many broken links did you encounter? How many of the articles you found did not come
with electronic copies? How many of the
articles you found had only electronic abstracts? Which sites were useful?
How many dead ends did you reach?
Describe your search strategy.
For example, did you start with general search terms such as
“information retrieval” and focus your search with more specific terms such as
“inverted files”? Did you try to find
the citation of the articles first using search engine A and then locate the
electronic copies of the articles using site B?
Also, you are required to report the time you spent
on getting the 10 articles: how many minutes did you actually spend
searching before you had all 10 articles in your account? Moreover, report the number of articles that
you downloaded in the first 15 minutes, the first 30 minutes, the first 45
minutes, and so on (in 15-minute increments).
Note that how much time you spent will not affect the grade of this
homework, so you are encouraged to be truthful in your reporting.
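One way to produce the 15-minute tally is to note the elapsed time at which each article was saved and count cumulatively per interval. The following is only a minimal sketch: the function name `cumulative_by_interval` and the sample log are hypothetical, and it assumes a download at exactly minute 15 belongs to the first interval.

```python
from collections import Counter


def cumulative_by_interval(download_minutes, interval=15):
    """Return (end_minute, cumulative_count) pairs, one per interval.

    download_minutes: minutes elapsed since the search began when each
    article was saved. Assumption: minute 15 counts toward "the first
    15 minutes", minute 30 toward "the first 30 minutes", and so on.
    """
    if not download_minutes:
        return []
    # Bucket 0 covers minutes up to 15, bucket 1 covers (15, 30], etc.
    buckets = Counter(max(0, -(-m // interval) - 1) for m in download_minutes)
    report, total = [], 0
    for i in range(max(buckets) + 1):
        total += buckets.get(i, 0)
        report.append(((i + 1) * interval, total))
    return report


# Hypothetical log: minutes elapsed when each of the 10 articles was saved.
log = [4, 9, 14, 22, 31, 38, 44, 52, 58, 67]
for end, count in cumulative_by_interval(log):
    print(f"first {end} minutes: {count} articles")
```

Intervals in which nothing was downloaded still appear in the output, so the report shows where the search stalled.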
Exercise 2 Target Search
You are required to obtain the complete reference of
the following journal paper:
“Information
Retrieval and Artificial Intelligence”
published in 1999.
That means you need to find out the list of authors, the name of the
journal, the volume and issue of the journal, and the page numbers of the
journal where the paper appeared.
Once again, you are required to describe how you
obtained the complete reference for the above paper. Document your experience.
For example: Which search engine did you start with? Which keywords did
you use? Which site did you start
with? Which links did you use to go
from the first site to the second, the second to the third, and so on, to the
final destination where you obtained the desired information? How many broken links did you
encounter? Which sites were
useful? How many dead ends did you
reach? Describe your search strategy. For example, did you start with general
search terms such as “information retrieval” and focus your search with more
specific terms such as “artificial intelligence”?
Also, you are required to report the time you spent
on getting the complete reference. Note
that how much time you spent will not affect the grade of this homework, so you
are encouraged to be truthful in your reporting.
NOTE: You may not be able to obtain the complete
reference of the above paper. However,
it is important to document your search process (strategies and tactics).
Hand In
(1) A report that documents your experience,
including the details described above and any other details that you think
are useful. This report should include
(a) A list of articles that you have downloaded:
   a. For journal papers: Author(s), Year, Title, Publication, Volume, and Page Numbers.
   b. For conference proceedings papers: Author(s), Year, Title, Conference Title, Dates, Place, and Page Numbers.
   c. For book chapters: Author(s), Year, Title, Book Title, Author(s) or Editor(s) of the Book, Page Numbers, and Publisher.
(b) A conclusion that describes your search strategy and your thoughts on the
importance of keywords, roles of the search engines, usefulness of the sites,
and other insights that you might have learned, for both exercises.
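The fields required for each kind of article in the list can be checked mechanically before the report is handed in. This is a hedged sketch only: the dictionary keys, the `missing_fields` helper, and the placeholder entry are assumptions made for illustration and are not part of the assignment.

```python
# Required fields per article type, taken from the list in the handout.
REQUIRED = {
    "journal": ("authors", "year", "title", "publication", "volume", "pages"),
    "conference": ("authors", "year", "title", "conference_title",
                   "dates", "place", "pages"),
    "book_chapter": ("authors", "year", "title", "book_title",
                     "book_authors_or_editors", "pages", "publisher"),
}


def missing_fields(kind, entry):
    """Return the required fields that are absent or empty in entry."""
    return [field for field in REQUIRED[kind] if not entry.get(field)]


# Placeholder entry (not a real reference) with volume and pages missing.
entry = {
    "authors": "A. Author",
    "year": 2000,
    "title": "Placeholder Title",
    "publication": "Placeholder Journal",
}
print(missing_fields("journal", entry))
```

An empty list means the entry carries every field that the list asks for.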
IMPORTANT: Hand in
the PostScript or PDF files of the above articles to the CSE class handin
account.
The assignment is due at 9:30 a.m. on September 9, 2003, at
the beginning of class. The
following table specifies the penalties for late homework.
| Time Turned In | Penalty |
| 9:30 a.m. – 9:35 a.m. (9/9/2003) | None |
| 9:35 a.m. – 10:45 a.m. (9/9/2003) | Lose 10% |
| 10:45 a.m. – 5:00 p.m. (9/9/2003) | Lose 20% |
| Later than 5:00 p.m. (9/9/2003) | Not accepted |
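The penalty table can be read as a simple lookup on the hand-in time. The sketch below is an interpretation, not an official rule: the table's ranges share their endpoints (9:35 a.m., 10:45 a.m.), so the code assumes each boundary time falls in the earlier, more lenient row, and that work handed in before 9:30 a.m. carries no penalty. The name `late_penalty` is hypothetical.

```python
from datetime import datetime

DUE_DAY = (2003, 9, 9)  # September 9, 2003


def late_penalty(turned_in):
    """Return the fraction of credit lost, or None if not accepted."""
    def at(hour, minute):
        return datetime(*DUE_DAY, hour, minute)

    # Assumption: boundary times belong to the more lenient row.
    if turned_in <= at(9, 35):
        return 0.0
    if turned_in <= at(10, 45):
        return 0.10
    if turned_in <= at(17, 0):
        return 0.20
    return None  # later than 5:00 p.m.: not accepted


print(late_penalty(datetime(2003, 9, 9, 10, 0)))
```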
Grading
(1) 40% on your report
(2) 15% on the completeness of the list of articles (including the information for each article listed above) (Exercise 1)
(3) 5% on the relevance of your articles (Exercise 1)
(4) 5% on the availability of your account directory where the files are stored (Exercise 1)
(5) 10% on the completeness of the reference for the specific journal paper (Exercise 2)
(6) 25% on your conclusion
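As a quick check, the grading weights above add up to 100%. The sketch below shows the arithmetic and how a total could be computed from per-component marks. The component labels and the `total_score` helper are illustrative assumptions, and the sample marks are made up.

```python
# Weights in percent, copied from the grading list.
WEIGHTS = {
    "report": 40,
    "list completeness (Exercise 1)": 15,
    "article relevance (Exercise 1)": 5,
    "account directory availability (Exercise 1)": 5,
    "reference completeness (Exercise 2)": 10,
    "conclusion": 25,
}
assert sum(WEIGHTS.values()) == 100


def total_score(fractions):
    """Combine per-component marks (each 0.0 to 1.0) into a percentage."""
    return sum(WEIGHTS[name] * fractions.get(name, 0.0) for name in WEIGHTS)


# Made-up example: full marks everywhere except half credit on the conclusion.
marks = {name: 1.0 for name in WEIGHTS}
marks["conclusion"] = 0.5
print(total_score(marks))
```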