IS2140 Hanying Huang: February 2014

Friday, February 28, 2014

Unit 8 Reading Notes

MIR Ch10

· This reading discusses user interfaces for communication between human information seekers and information retrieval systems. A well-designed interface can significantly improve both the user experience and search performance.

· Principles for design of user interfaces: provide informative feedback, permit easy reversal of actions, support an internal locus of control, reduce working memory load, and provide alternative interfaces for novice and expert users.

· Information visualization provides visual depictions of very large information spaces. Main techniques: icons, color highlighting, brushing and linking, panning and zooming, focus-plus-context, magic lenses, and animation.

· The reading introduces four kinds of starting points that search interfaces should provide: lists, overviews, examples, and automated source selection.

· There are five primary human-computer interaction styles: command language, form fill in, menu selection, direct manipulation and natural language. Each technique has been used in query specification interfaces and each has advantages and disadvantages.

· It also explains how to show the relationship of the document set to query terms, collection overviews, descriptive metadata, hyperlink structure, document structure, and to other documents within the set (via context), in order to make the document set more understandable.

· Relevance feedback is an effective technique used for query reformulation. A standard interface for relevance feedback consists of a list of titles with checkboxes beside the titles that allow the user to mark relevant documents.

Reading: Search User Interfaces

· Web search interfaces tend to stay simple and unchanging for the following reasons:

▪ Search is a means towards some other end, rather than a goal in itself.

▪ Search is a mentally intensive task.

▪ The interface design must be understandable and appealing to a wide variety of users of all ages, cultures and backgrounds, applied to an enormous variety of information needs.

· An important quality of a user interface (UI) is its usability, which includes five basic components: learnability, efficiency, memorability, errors, and satisfaction.

· Eight design desiderata for search user interfaces generally:

· Offer informative feedback.

· Support user control.

· Reduce short-term memory load.

· Provide shortcuts for skilled users.

· Reduce errors; Offer simple error handling.

· Strive for consistency.

· Permit easy reversal of actions.

· Design for closure.

Thursday, February 27, 2014

Unit 7 Muddiest Point

Slide 40 discusses utility: the quality, novelty, importance, credibility, and other features of a document relative to the user’s need. However, how does an IR system evaluate documents based on utility? If utility can only be evaluated manually, is the improvement in search results worth the time spent on the evaluation?

Friday, February 21, 2014

Unit 7 Reading Notes

IIR Chapter 9

· Relevance feedback is the idea of involving the user’s feedback in the retrieval process to refine the search results.

· Algorithms for implementing relevance feedback.

▪ Rocchio algorithm: incorporates relevance feedback information into the vector space model.

▪ Naive Bayes probabilistic model.
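The Rocchio update can be sketched in a few lines. This is only a minimal illustration, assuming documents and queries are plain lists of term weights of equal length; the function names `centroid` and `rocchio` are made up for this note, and the default weights (alpha = 1, beta = 0.75, gamma = 0.15) are the values the chapter mentions as reasonable, not tuned settings.

```python
def centroid(vectors, dim):
    """Mean of a list of equal-length vectors; zero vector if the list is empty."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def rocchio(query, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Move the query toward the relevant centroid and away from the nonrelevant one."""
    dim = len(query)
    rel_c = centroid(relevant, dim)
    nonrel_c = centroid(nonrelevant, dim)
    # Negative term weights are clipped to zero.
    return [
        max(0.0, alpha * query[i] + beta * rel_c[i] - gamma * nonrel_c[i])
        for i in range(dim)
    ]


# Toy example with a three-term vocabulary (values are illustrative only).
new_query = rocchio(
    query=[1.0, 0.0, 0.0],
    relevant=[[1.0, 1.0, 0.0], [1.0, 0.0, 0.0]],
    nonrelevant=[[0.0, 0.0, 1.0]],
)
```

The second term gains weight because it appears in a document marked relevant, while the third term, which appears only in the nonrelevant document, stays at zero.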

· Relevance feedback can improve both recall and precision; in practice it is most useful for increasing recall.

· Requirements for effective relevance feedback.

▪ The user has to have sufficient knowledge to be able to make an initial query.

▪ Relevant documents have to be similar to each other.

· Evaluating the effectiveness of relevance feedback

▪ Start with an initial query q0 and compute a precision-recall graph.

▪ Use documents in the residual collection for the second round of evaluation.
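A small sketch of the residual-collection idea: documents the user already judged are removed from the second-round ranking before measuring it, so the revised query gets no credit for re-ranking documents it was told about. The helper names and the document IDs below are hypothetical, and precision at k stands in for the full precision-recall graph.

```python
def residual_ranking(ranking, judged):
    """Drop documents the user already judged in the first round."""
    return [doc for doc in ranking if doc not in judged]


def precision_at_k(ranking, relevant, k):
    """Fraction of the top k results that are relevant."""
    if k <= 0:
        return 0.0
    return sum(1 for doc in ranking[:k] if doc in relevant) / k


# Hypothetical second-round ranking produced by the revised query.
second_round = ["d1", "d2", "d3", "d4", "d5"]
judged = {"d1", "d2"}           # shown to the user and marked in round one
relevant = {"d1", "d3", "d5"}   # full set of relevant documents

residual = residual_ranking(second_round, judged)
p_at_2 = precision_at_k(residual, relevant, 2)
```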

· Pseudo relevance feedback automates the manual part of relevance feedback, so that the user gets improved retrieval performance without an extended interaction.

· Indirect relevance feedback uses indirect sources of evidence rather than explicit relevance judgments.

· Implicit feedback is less reliable than explicit feedback, but is more useful than pseudo relevance feedback.

· Three global methods for expanding a query: by simply aiding the user in doing so, by using a manual thesaurus, and through building a thesaurus automatically.
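The manual-thesaurus method of query expansion can be illustrated roughly as follows. The thesaurus here is a hypothetical hand-made dictionary, and `expand_query` is a name invented for this sketch; a real system would also weight the added terms lower than the original ones.

```python
def expand_query(terms, thesaurus):
    """Append thesaurus entries for each query term, skipping duplicates."""
    expanded = list(terms)
    seen = set(terms)
    for term in terms:
        for synonym in thesaurus.get(term, []):
            if synonym not in seen:
                seen.add(synonym)
                expanded.append(synonym)
    return expanded


# Hypothetical thesaurus entries, for illustration only.
thesaurus = {"car": ["automobile", "vehicle"], "cheap": ["inexpensive"]}
expanded = expand_query(["cheap", "car"], thesaurus)
```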

Reading: Improving the Effectiveness of Information Retrieval with Local Context Analysis

· This paper proposes a new technique for automatic query expansion, called local context analysis, which selects expansion terms based on co-occurrence with the query terms within the top-ranked documents.

· Existing techniques for automatic query expansion can be categorized as either global or local.

· Local context analysis is a local technique, but it employs co-occurrence analysis, a primary tool for global techniques, for query expansion.

· The metrics used by local context analysis for concept selection: a co-occurrence metric, combining the degrees of co-occurrence with all query terms, and differentiating rare and common query terms.

· Experimental results on a number of collections show that local context analysis is more effective than existing techniques.
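The three ingredients of concept selection listed above can be shown in a simplified sketch. This is not the paper's exact formula: it assumes passages are lists of tokens, counts co-occurrence as the product of term frequencies within each top-ranked passage, multiplies the per-query-term degrees together (with a small `delta` so one missing co-occurrence does not zero the score), and uses a per-term `idf` weight as the exponent so that rare query terms count more. The names `lca_scores`, `idf`, and `delta` are assumptions made for this note.

```python
import math
from collections import Counter


def lca_scores(query_terms, passages, idf, delta=0.1):
    """Score candidate terms by co-occurrence with every query term."""
    counts = [Counter(passage) for passage in passages]
    candidates = {t for c in counts for t in c} - set(query_terms)
    scores = {}
    for cand in candidates:
        score = 1.0
        for w in query_terms:
            # Co-occurrence of the candidate with query term w over all passages.
            co = sum(c[cand] * c[w] for c in counts)
            # The idf exponent emphasizes rare query terms.
            score *= (delta + math.log(1 + co)) ** idf.get(w, 1.0)
        scores[cand] = score
    return scores


# Toy top-ranked passages (illustrative data only).
passages = [
    ["solar", "energy", "panel"],
    ["solar", "panel", "cost"],
    ["energy", "wind"],
]
scores = lca_scores(["solar", "energy"], passages, idf={"solar": 1.0, "energy": 1.0})
best = max(scores, key=scores.get)
```

In this toy data "panel" co-occurs with both query terms, so it outranks "cost" and "wind", which each co-occur with only one.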

Reading: A Study of Methods for Negative Relevance Feedback

· This paper focuses on the analysis of negative relevance feedback. Experimental results on several TREC collections show that language-model-based negative feedback methods are generally more effective than those based on vector-space models, and that using multiple negative models is an effective heuristic for negative feedback.

· General strategies with some variations for negative feedback: (1) SingleQuery: query modification strategy; (2) SingleNeg: score combination with a single negative query model; (3) MultiNeg: score combination with multiple negative query models.

· Two heuristics to increase the robustness of using negative feedback information: Local Neighborhood and Global Neighborhood.
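The score-combination idea behind SingleNeg and MultiNeg can be sketched as subtracting a penalty from the original query score. This is a rough illustration only: `score` is a toy dot product over term-weight dictionaries standing in for the actual retrieval function, `beta` is an assumed penalty weight, and taking the maximum over the negative models is just one possible way to aggregate them.

```python
def score(model, doc):
    """Toy matching score: dot product of two term-to-weight dicts (an assumption)."""
    return sum(weight * doc.get(term, 0.0) for term, weight in model.items())


def single_neg(query, neg_model, doc, beta=0.5):
    """Original score minus a penalty from one negative query model."""
    return score(query, doc) - beta * score(neg_model, doc)


def multi_neg(query, neg_models, doc, beta=0.5, aggregate=max):
    """Original score minus an aggregated penalty from several negative models."""
    if not neg_models:
        return score(query, doc)
    penalty = aggregate(score(m, doc) for m in neg_models)
    return score(query, doc) - beta * penalty


# Illustrative data: the user wants the animal, not the car.
query = {"jaguar": 1.0}
doc_car = {"jaguar": 1.0, "car": 1.0}
doc_cat = {"jaguar": 1.0, "cat": 1.0}
neg_models = [{"car": 1.0}, {"dealer": 1.0}]

car_score = multi_neg(query, neg_models, doc_car)
cat_score = multi_neg(query, neg_models, doc_cat)
```

Both documents match the query equally well, but the document resembling a negative model is pushed down the ranking.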

Thursday, February 20, 2014

Unit 6 Muddiest Point

A topic includes a title, a description, and a narrative.

How can we parse a topic into queries? Is removing the stop words enough, or is further processing needed?