
Three social distance measures for film rankings
Author(s) - Leazer Gregory H., Furner Jonathan, Napper Rachel
Publication year - 2003
Publication title - Proceedings of the American Society for Information Science and Technology
Language(s) - English
Resource type - Journals
eISSN - 1550-8390
pISSN - 0044-7870
DOI - 10.1002/meet.1450400103
Subject(s) - distance matrix , similarity (geometry) , product (mathematics) , ranking (information retrieval) , matrix (chemical analysis) , shortest path problem , distance matrices in phylogeny , measure (data warehouse) , mathematics , path (computing) , combinatorics , statistics , computer science , geometry , materials science , artificial intelligence , data mining , composite material , graph , image (mathematics) , programming language
We describe three alternative methods for ranking films for information retrieval (IR). A large film‐person incidence matrix is generated from the principal cast, directors, producers and screenwriters of each film. The first method uses these attributes to measure film‐film distances by creating a distance matrix: two films are considered adjacent if there is any overlap in the people associated with them, and the distance between any two films is the length of the shortest path connecting them through adjacent films. The second and third methods involve the creation of a similarity matrix that uses Dice's coefficient to express the amount of overlap in the people associated with any two films. A “product distance” matrix is then derived that expresses the distance between any two films as the product of the similarity weights on a path that connects them; when alternative paths connect the two films, the highest value is chosen. We also describe an “accumulative difference distance” matrix that likewise expresses the distances among pairs of films. The distance, product distance and accumulative difference distance matrices are used to generate rankings for a random sample of films.
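The first two measures can be illustrated with a minimal Python sketch. This is not the authors' implementation: the toy `films` data, all function names, the breadth-first search used for shortest paths, and the Dijkstra-style search used to find the highest product are assumptions made for illustration. The accumulative difference distance is left out because the abstract does not define it.

```python
from collections import deque
import heapq

# Hypothetical toy data (not from the paper): film -> associated people.
films = {
    "A": {"p1", "p2", "p3"},
    "B": {"p3", "p4"},
    "C": {"p4", "p5"},
    "D": {"p6"},
}

def dice(x, y):
    """Dice's coefficient: 2|X & Y| / (|X| + |Y|)."""
    if not x and not y:
        return 0.0
    return 2 * len(x & y) / (len(x) + len(y))

def similarity_matrix(films):
    """Pairwise Dice similarity between films, as a nested dict."""
    names = sorted(films)
    return {a: {b: dice(films[a], films[b]) for b in names if b != a}
            for a in names}

def path_distances(films, source):
    """Shortest-path hop counts from source; two films are adjacent
    if they share at least one person. Unreachable films are omitted."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        cur = queue.popleft()
        for other in films:
            if other not in dist and films[cur] & films[other]:
                dist[other] = dist[cur] + 1
                queue.append(other)
    return dist

def product_scores(sim, source):
    """Highest product of similarity weights over any path from source.
    Weights lie in (0, 1], so a Dijkstra-style search that always
    expands the current best score finds the maximum."""
    best = {source: 1.0}
    heap = [(-1.0, source)]
    while heap:
        neg_score, cur = heapq.heappop(heap)
        score = -neg_score
        if score < best.get(cur, 0.0):
            continue  # stale heap entry
        for other, weight in sim[cur].items():
            cand = score * weight
            if weight > 0 and cand > best.get(other, 0.0):
                best[other] = cand
                heapq.heappush(heap, (-cand, other))
    return best

sim = similarity_matrix(films)
hops = path_distances(films, "A")
products = product_scores(sim, "A")

# Rank the other films for query film "A": fewer hops first,
# or higher product score first.
rank_by_hops = sorted((f for f in hops if f != "A"), key=hops.get)
rank_by_product = sorted((f for f in products if f != "A"),
                         key=products.get, reverse=True)
print(rank_by_hops, rank_by_product)
```

In this toy data, film D shares no people with any other film, so it is unreachable from A and appears in neither ranking. How such isolated films were treated in the study is not stated in the abstract.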