CSCE 410/810
Final Project Assignment
October 30, 2003
Problem
Define a programming project that implements a system related to the advanced topics of information retrieval, such as query modification, thesaurus construction, ranking algorithms, and clustering. You may also look into routing, question answering, confusion, interactive search, and other issues in IR.
The goal of this project is to motivate you to build
an IR-related tool, test it, and analyze it.
This assignment counts for 25% of your grade.
Requirements
(1) Proposal summary. Write a 2-page summary of your proposed final project and turn it in to me before November 11, 2003. Turn it in as early as possible so that I can approve your proposed project and you can start early on your final project. Your proposal must also state what data you want to use in your final project.
(2) You are required to build a program that addresses some IR-related issues. There are three general approaches to this: (a) build a simple program but perform a rigorous experiment with many document collections, (b) build a complex program and test it on a simple document collection, or (c) build a moderately complex program and test it carefully on some document collections.
(3) Document Collections. I have many document or data collections: topics from TREC conferences, confusion data, interactive data, routing data, question-and-answer data, Reuters news, etc. Please discuss with me what kind of data you want for your Final Project. The collections are large and are not available online. Some data are available at our class handin account:
/home/grad/Classes/cse410/DATA/
The documents there are similar to the document collections used in your homework assignment #4. You may also download your own document collections from the web. If you want some other data, please come see me and I will provide them.
Hand In
(1) A comprehensive report that includes: (a) Introduction: the description of the problem you are addressing, why you think it is important, and so on; (b) Design: the description of your implementation approach, solution strategy, styles/design, and so on; (c) Results: the experiments, datasets, discussion of results, comparisons with other literature, and so on; (d) Possible extensions and future work; (e) Conclusions. In your appendix: (a) the instructions on how to run your programs, (b) results/output/graphs, (c) the printout of your programs.
(2) You MUST make sure that your programs run on CSE platforms, and your instructions on how to run your programs must be clear. We have only a couple of days to grade your final project and turn in the final grades. If we cannot run your programs, we will not have time to contact you to get them to work. So, please keep this in mind. To be ABSOLUTELY sure, you may want to turn in your programs earlier.
(3) Turn in your homework electronically using the handin account and turn in a hardcopy of your report at my office.
The assignment is due at 8:00 a.m. on December 18, 2003.
The following table specifies the penalties for late homework.
Time Turned In | Penalty
8:00 a.m. (12/18/2003) | None
Later than 8:00 a.m. (12/18/2003) | Not accepted
Grading
The Final Project will be graded in two parts: programming (50%) and report (50%). The programming part will be graded as follows:
(1) 50% Program Correctness (including the accessibility of the programs)
(2) 10% Software Design
(3) 10% Programming Style
(4) 20% Testing
(5) 10% Documentation (in-program documentation)
The report will be graded as follows:
(1) 15% Introduction
(2) 20% Design
(3) 30% Results
(4) 10% Possible Extensions and Future Work
(5) 15% Conclusions
(6) 10% Appendices
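As an illustration only of how the percentages above combine, here is a minimal sketch in Python. The weights are copied from this section. Everything else is an assumption made for the example: the function names part_score and project_score are hypothetical, each criterion is assumed to be scored as a fraction (0.0 to 1.0) of its weight, and the two parts are assumed to contribute equally, per the 50%/50% split. It is not a statement of the actual grading procedure.

```python
# Weights (in percent) copied from the Grading section.
PROGRAMMING_WEIGHTS = {
    "Program Correctness": 50,
    "Software Design": 10,
    "Programming Style": 10,
    "Testing": 20,
    "Documentation": 10,
}
REPORT_WEIGHTS = {
    "Introduction": 15,
    "Design": 20,
    "Results": 30,
    "Possible Extensions and Future Work": 10,
    "Conclusions": 15,
    "Appendices": 10,
}


def part_score(weights, fractions):
    """Score for one part, out of 100.

    `fractions` maps each criterion name to the fraction of its weight
    earned (0.0 to 1.0). This scoring model is an assumption.
    """
    return sum(weights[name] * fractions[name] for name in weights)


def project_score(programming_fractions, report_fractions):
    """Overall score out of 100: programming 50%, report 50%."""
    programming = part_score(PROGRAMMING_WEIGHTS, programming_fractions)
    report = part_score(REPORT_WEIGHTS, report_fractions)
    return 0.5 * programming + 0.5 * report
```

Under these assumptions, a submission with a perfect report whose programs earn nothing for Program Correctness but full marks on every other programming criterion would score 50 on the programming part and 75 overall, which shows how heavily the ability to run the programs weighs.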