Term Weighting and Ranking Algorithms
Review
Documents in 3D Space
Vector Space Model
Documents in Vector Space
Vector Space Documents and Queries
Similarity Measures
Text Clustering
Agglomerative Clustering
Agglomerative Clustering
Automatic Class Assignment
PPT Slide
Today
Finding Out About
Ranking Algorithms
Structure of an IR System
Vector Representation (revisited; see Salton article in Science)
Assigning Weights to Terms
Binary Weights
Raw Term Weights
Assigning Weights
tf x idf
Inverse Document Frequency
tf x idf normalization
Vector space similarity (use the weights to compare the documents)
Vector Space Similarity Measure: combine tf x idf into a similarity measure
To Think About
Computing Similarity Scores
Computing a similarity score
Other Major Ranking Schemes
Probabilistic Models
Probabilistic Models: Some Notation
Logistic Regression
Probabilistic Models: Logistic Regression
Probabilistic Models: Logistic Regression attributes
Simplified Logistic Regression
Vector and Probabilistic Models
Email: ray@sherlock.berkeley.edu
Home Page: http://sims.berkeley.edu/~ray