Course Web Pages - Fall 2012 - LIBR 202-13 Greensheet - Assignments

LIBR 202
Information Retrieval
Assignments

Dr. Geoffrey Z. Liu

Group Exercise

The group exercise uses a text-based database management system named InMagic DB/Textworks. The software is available for students to use on any Windows machine, and you may download it from the School's webserver. Unfortunately, this is Windows/PC software that cannot run directly on a Mac. If you have a Mac, you will need to install middleware that emulates a PC environment in order to use it. See this page for instructions on how to install and use InMagic DB/Textworks on Mac OS X.

This group exercise includes two parts. In Part A, each group will develop an attribute scheme and associated rules and design a "pseudo-relational" database using the InMagic software. In Part B, each group will evaluate an anonymous peer group's work from Part A. Assignment of peer groups and exchange/distribution of related materials will be handled by the instructor.

Each group should have one member elected as the group leader for coordinating group activities, assembling submission packets, and handling distribution of materials.

Part A: Database Design (10%)

In Part A, your group will design a database for searching a collection of entities of your own choice. You are required to select a collection of non-traditional items, i.e., nothing with a title or an author in the traditional sense. For example, a Barbie doll collection would be a good choice for this exercise. The collection can be hypothetical; you do not need to actually have such a collection.

In addition to designing the database, you will write up the rules or standards of description. (Rules are the explicitly stated procedures for implementing and applying the standards.) These rules should consist of a set of standards or specifications for making the records, the fields, and the data in those fields uniform for retrieval. It is frequently desirable to have such standardization because it allows for: (1) some stability for the information system, so that it can be expanded and modified in a sensible, predictable way, and (2) less ambiguous communication of the information in the database.

This group exercise is essentially a thinking exercise. The goal is to learn about attribute identification/extraction and entity-based IR. Although you need to create a number of records (say 20) according to your design rules and enter them into your InMagic database, the number of records is not the point.

Procedure

  1. Before you begin thinking about the rules and the structure of the database, come to some conclusions about the information needs of the individuals (user group) who will be using your database to find materials. Your design and standards will be affected by the conclusions you reach concerning the group of users and their information needs. This analysis will lead you to consider the purpose(s) of the database and, in particular, what kinds of queries the database can serve. Actually write down the kinds of questions that the users will want to ask. Then determine which attributes of the entities in your collection are needed to (a) serve the user need and (b) differentiate one entity from another.
  2. Establish the necessary standards in terms of the content and structure of the records. Standards can apply to: the information unit each record describes; the key field(s) or unique identifier for the record (that is, what distinguishes one record in the database from another unambiguously); the fields needed for the attributes you have identified; which fields are mandatory (must contain data) and which are not; which fields are repeatable; and so on. Express each standard for the record as a rule. That is, write down what someone would need to know to implement the standard when creating records for your database.
  3. For each field in the record, establish the necessary standards in terms of the structure and content of the field. Standards for fields can apply to: names for the fields; fixed or variable length fields (formatted or unformatted); text or numeric data; the form the data will take when entered; allowable values for a field; how to resolve uncertain instances; and so on. Express each standard for the data elements as a rule, and express them as specifically as you can. Ask yourself whether someone unfamiliar with the process you went through could generate the same quality records. Then give them the instructions they need to meet your standards.
  4. Use the rules to guide you in creating your own InMagic database for the collection. Be sure to make a printout of your records after they are entered.
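
To make steps 2 and 3 more concrete, here is a minimal sketch in Python of how record-level and field-level rules could be written down precisely enough to be checked mechanically. It is an illustration only and not part of the assignment: the field names (`ItemID`, `HairColor`, `Accessory`, `ReleaseYear`), the allowed values, and the helper names `FieldRule`, `check_record`, and `check_collection` are all hypothetical, and the sketch does not use or imitate InMagic DB/Textworks, which has its own way of defining fields.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FieldRule:
    """One field-level standard, expressed as a checkable rule (hypothetical)."""
    name: str
    mandatory: bool = False               # must the field contain data?
    repeatable: bool = False              # may the field hold several values?
    numeric: bool = False                 # digits only?
    allowed: Optional[frozenset] = None   # None means free text


# Assumed rules for a hypothetical doll collection.
KEY_FIELD = "ItemID"
RULES = [
    FieldRule("ItemID", mandatory=True),
    FieldRule("HairColor", mandatory=True,
              allowed=frozenset({"blonde", "brunette", "red", "black", "unknown"})),
    FieldRule("Accessory", repeatable=True),
    FieldRule("ReleaseYear", numeric=True),
]


def check_record(record, rules=RULES):
    """Return the list of rule violations for one record (empty if it conforms).

    A record is a dict mapping a field name to a list of string values.
    """
    problems = []
    known = {rule.name for rule in rules}
    for name in record:
        if name not in known:
            problems.append(f"{name}: field is not defined in the rules")
    for rule in rules:
        values = record.get(rule.name, [])
        if rule.mandatory and not values:
            problems.append(f"{rule.name}: mandatory field is empty")
        if len(values) > 1 and not rule.repeatable:
            problems.append(f"{rule.name}: field is not repeatable")
        for value in values:
            if rule.numeric and not value.isdigit():
                problems.append(f"{rule.name}: {value!r} is not numeric")
            if rule.allowed is not None and value not in rule.allowed:
                problems.append(f"{rule.name}: {value!r} is not an allowed value")
    return problems


def check_collection(records, rules=RULES, key_field=KEY_FIELD):
    """Check every record and also require the key field to be unique.

    Returns a dict mapping record position to its list of violations.
    """
    report = {}
    seen_keys = set()
    for position, record in enumerate(records):
        problems = check_record(record, rules)
        for key in record.get(key_field, []):
            if key in seen_keys:
                problems.append(f"{key_field}: duplicate key {key!r}")
            seen_keys.add(key)
        if problems:
            report[position] = problems
    return report


if __name__ == "__main__":
    sample = [
        {"ItemID": ["D001"], "HairColor": ["blonde"],
         "Accessory": ["hat", "purse"], "ReleaseYear": ["1985"]},
        {"ItemID": ["D001"], "HairColor": ["green", "red"],
         "ReleaseYear": ["c. 1990"]},
    ]
    for position, problems in check_collection(sample).items():
        print(position, problems)
```

A rule set written at this level of detail is one way to test the question posed in step 3: if someone unfamiliar with your process can apply the rules and produce records that pass the same checks, the rules are probably specific enough.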

Final Product

At the conclusion of Part A, your group will submit TWO packets (compressed folders) via the Angel digital drop box.

Packet #1 to include the following items:

  1. A statement of purpose of the database. 
  2. A description of your intended user group and discussion of their information needs; include typical questions they may ask of your database system for answers.
  3. An analysis of your unit of description and your record structure, and how you arrived at these in light of your user group and their information needs.
  4. The set of rules you developed for purposes of describing the collection.
  5. A printout of your database design with sample records.

Packet #2 to include all related files for your InMagic Textbase (database) implementation, i.e., all InMagic files starting with the same "filename" that you gave to your textbase. See this tutorial page on how to bundle and upload InMagic database files for submission and/or sharing.

Part B: Evaluation (10%)

In Part B, the major task is to review a peer group's database design and rules and to propose necessary revisions to your own design based on the review comments you receive. Upon completion of Part A, you will be assigned an anonymous peer group. You and your peer group will review each other's database design and entity-cataloging rules. You will write up a review summary to provide your peer group with feedback and suggestions for improvement. After receiving the review summary from your peer group, you will write a response report to address the issues raised and discuss how to revise your database design for improvement. The exchange of Part A packets and review summaries will be handled by the instructor.

Procedure

Part B1

  1. Critically examine the peer group's work and offer feedback and suggestions. Database design and entity-cataloging rules should be reviewed in light of the following questions:
    • Is the database design appropriate for the collection?
    • How well do the rules provide for consistent description of items?
    • Is the design sufficient to meet the users' needs?
    • How well do the rules accommodate exceptions?
  2. Prepare and submit a brief review summary to report your findings as a whole group.

Part B2

  1. Study the review comments from your peer group about your own work, and prepare a response report to address the issues raised and to discuss how you may want to revise the database design for improvement. You only need to discuss possible revisions and improvements; you do not need to implement them.
  2. Assemble and submit a packet of materials for Part B.

Final Product

At the conclusion of Part B, your group as a whole will turn in a packet of the following items:

  1. A response report.
  2. A brief discussion of what you have learned from this group exercise.
  3. Your peer group's review summary as an appendix.

Individual Project of System Evaluation

For this project, you are to conduct an evaluative review of an information retrieval system of your choice and submit a written report.

Choice of System

The system can be a web search engine (for locating resources on the web), a library's OPAC web portal, or a full-text database, but it cannot be a "people search", real estate search, or e-commerce site such as Amazon.com or eBay.com. The International Search Engines website and Wikipedia's list of search engines offer many search systems to choose from for this project.

Check Points

Specifically, you are to investigate the following major aspects when evaluating the system of your choice:

Final Report

The final report of this individual project may include the following components:

  1. Title page
  2. Abstract
  3. System overview and background
  4. Collection scale
  5. Analysis of potential indexing mechanisms
  6. Description and analysis of search features/functions
  7. Assessment of search effectiveness
  8. Quality of interface design
  9. Conclusion
  10. References
  11. Appendices

The report should be no more than 10 pages, excluding appendices and references. The body of the text may include tables and diagrams, but definitely not screen shots. (However, a small number of screen shots may be included in an appendix if necessary.) Needless to say, your writing must follow the same standards and editorial guidelines as the term paper.

To give you an idea of what a good (Grade "A") individual project report looks like, a sample of outstanding work from a past semester is provided here in PDF format for your reference.

Term Paper

Click on this link for instructions on the term paper.
