Explore similarity approaches, including Geometric Models, MDS, and Contrast Models in geographic applications, studying how human assessments differ from geometric models.
Semantic Similarity Measurement and Geographic Applications: Similarity approaches. Dr. Martin Raubal, Department of Geography, UCSB, raubal@geog.ucsb.edu. geog 288MR, spring 08
Approaches • Geometric Model / MDS (Gärdenfors: Conceptual Spaces) • Feature-based Model (Tversky: Contrast Model; Rodríguez: MDSM) • Alignment-based Model (Goldstone: SIAM) • Transformational Model (Hahn; example: ABBA vs. AABB)
Geometric models and MDS • Multidimensional scaling (MDS) represents similarity between entities in a geometric model: entities are points in a multidimensional metric space. • Similarity is inversely related to the distance (dissimilarity) between two entities => linearly decaying function of the semantic distance d.
Geometric models and MDS cont. • Minkowski metric: $d_{ij} = \left( \sum_{k=1}^{n} |x_{ik} - x_{jk}|^{r} \right)^{1/r}$ • n … number of dimensions • $x_{ik}$ and $x_{jk}$ … values for dimension k of the entities i and j • r = 1 => city-block metric, r = 2 => Euclidean metric, etc.
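The Minkowski metric described on this slide can be sketched in a few lines of Python. The function names, the example points, and in particular the `linear_similarity` helper (one possible reading of "linearly decaying", with a hypothetical `max_distance` cutoff) are assumptions for illustration, not part of the original slides.

```python
def minkowski_distance(x, y, r):
    """Minkowski distance between two equal-length points, exponent r >= 1."""
    if len(x) != len(y):
        raise ValueError("points must have the same number of dimensions")
    return sum(abs(a - b) ** r for a, b in zip(x, y)) ** (1.0 / r)


def linear_similarity(distance, max_distance):
    """Assumed linear decay: 1 at distance 0, 0 at max_distance or beyond."""
    return max(0.0, 1.0 - distance / max_distance)


# Two hypothetical entities described on n = 2 dimensions.
i = (0.0, 0.0)
j = (3.0, 4.0)

city_block = minkowski_distance(i, j, 1)  # r = 1: |3| + |4|
euclidean = minkowski_distance(i, j, 2)   # r = 2: sqrt(3^2 + 4^2)
```

For the same pair of points the city-block distance is never smaller than the Euclidean distance, which is one reason the choice of r matters when fitting human judgments.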
MDS in cognitive science • Applied to discover mental representations of stimuli and to explain similarity judgments. • MDS as mathematical model of categorization, identification, recognition, memory, generalization (Nosofsky 1992, Shepard 1987). • Degree of relation between stimuli ~ spatial distance
Representational model
Geometric models and MDS cont. • Choice of the metric that best fits human similarity assessments => depends on entities (stimuli) and subjects’ strategies. • Euclidean metric provides better fit to empirical data when stimuli are composed of integral, perceptually fused dimensions (e.g., brightness and saturation of color). • City-block metric is appropriate for psychologically separable dimensions (e.g., color and shape).
[Figure: Euclidean metric vs. city-block metric]
[Figure: dimensions color and shape]
MDS vs. Geometric models • MDS determines the number of dimensions from subjects' pairwise judgments. • Goal: maximum correlation between judgments and distances in n-dimensional space with the minimum number of dimensions. • Geometric models start by defining the dimensions.
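The goal stated on this slide can be illustrated without running a full MDS algorithm. The sketch below takes hypothetical dissimilarity judgments for four stimuli and two hand-made candidate configurations (a line and a unit square), then computes the Pearson correlation between the judgments and the Euclidean distances in each configuration. All data, names, and the choice of Pearson correlation as the fit criterion are assumptions for illustration; an actual MDS procedure would search for the configuration itself.

```python
from math import dist, sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def fit(judgments, configuration):
    """Correlate judged dissimilarities with distances in a configuration."""
    pairs = sorted(judgments)
    judged = [judgments[p] for p in pairs]
    modeled = [dist(configuration[a], configuration[b]) for a, b in pairs]
    return pearson(judged, modeled)


# Hypothetical dissimilarity judgments (larger = less alike).
judgments = {
    ("A", "B"): 1.0, ("B", "C"): 1.0, ("C", "D"): 1.0,
    ("A", "D"): 1.0, ("A", "C"): 1.4, ("B", "D"): 1.4,
}

# Hand-made candidate configurations: points on a line vs. a unit square.
one_dim = {"A": (0.0,), "B": (1.0,), "C": (2.0,), "D": (3.0,)}
two_dim = {"A": (0.0, 0.0), "B": (1.0, 0.0), "C": (1.0, 1.0), "D": (0.0, 1.0)}

fit_1d = fit(judgments, one_dim)
fit_2d = fit(judgments, two_dim)
```

With these made-up numbers the two-dimensional configuration correlates far better with the judgments than the one-dimensional one, so two would be the minimum number of dimensions worth keeping.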
Axioms of geometric model • Minimality: $d(a,b) \ge d(a,a) = 0$ • Symmetry: $d(a,b) = d(b,a)$ • Triangle inequality: $d(a,b) + d(b,c) \ge d(a,c)$ • These axioms may not hold for human similarity assessments!
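Whether a set of dissimilarity judgments satisfies the three metric axioms can be checked mechanically. The following is a minimal sketch; the function name, the nested-dictionary layout, and the judgment values are hypothetical and chosen only to show a triangle-inequality violation.

```python
from itertools import permutations


def check_metric_axioms(d, tol=1e-9):
    """Report which metric axioms hold for a nested dict d[a][b] of dissimilarities."""
    items = list(d)
    minimality = all(
        abs(d[a][a]) <= tol and d[a][b] >= d[a][a] - tol
        for a in items for b in items
    )
    symmetry = all(abs(d[a][b] - d[b][a]) <= tol for a in items for b in items)
    triangle = all(
        d[a][b] + d[b][c] >= d[a][c] - tol
        for a, b, c in permutations(items, 3)
    )
    return {"minimality": minimality, "symmetry": symmetry, "triangle": triangle}


# Hypothetical judged dissimilarities: lamp and ball are each close to moon,
# but far from each other.
judged = {
    "lamp": {"lamp": 0.0, "moon": 1.0, "ball": 5.0},
    "moon": {"lamp": 1.0, "moon": 0.0, "ball": 1.0},
    "ball": {"lamp": 5.0, "moon": 1.0, "ball": 0.0},
}

result = check_metric_axioms(judged)
```

Here minimality and symmetry hold, but the triangle inequality fails because 1.0 + 1.0 is smaller than 5.0.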
Problems with geometrical model • Human similarity judgments between compared entities are often asymmetric, whereas distance is symmetric (Tversky 1977). Example: North Korea is judged to be more similar to Red China than vice versa. • Category members are judged more similar to category prototypes than prototypes are to category members.
Problems with geometrical model • Triangle inequality may fail: a lamp is similar to the moon (light); the moon is similar to a soccer ball (shape); a lamp is NOT similar to a soccer ball (?) (James 1892). • Adding common features to entities does not increase their similarity (distance grows).
Requirements and assumptions • Independence of properties. • Property set must reflect human conceptualization to provide good similarity results – how to achieve this? • Comparability of different dimensions – same relative unit.
Feature-based models: Common elements approach • Two entities (stimuli) are similar if they have common features (elements). • The more elements they share, the more similar the stimuli are. • Problem: it is always possible to find an endless number of common elements, depending on the point of view.
Representational model • Set-theoretic: concepts represented as unstructured sets of features. • Characterizing concepts through properties is common in the analysis of cognitive processes. • Application areas: speech perception, pattern recognition, perceptual learning.
[Schwering 2008]
Feature-matching model • Proposed by Amos Tversky: A. Tversky (1977) Features of Similarity. Psychological Review 84(4): 327-352. • Supports asymmetric similarity measurement. • Elementary set operations can be applied to estimate similarities and differences.
Requirements and assumptions • Independence of features. • Feature set must be sufficiently rich to account for human categorization. • Invariance of representational elements (no transformations as in geometric models).
Feature-based models cont.: Contrast model • Similarity is defined not only by the entities’ common features, but also by their distinctive features (Tversky 1977). • In contrast to the common elements approach, a flexible weighting is used.
Contrast model • $S(A,B) = \theta f(A \cap B) - \alpha f(A - B) - \beta f(B - A)$ • θ, α, β … weights for common / distinctive features • (A ∩ B) … features that A and B have in common • (A − B) … features possessed by A but not B • (B − A) … features possessed by B but not A • Asymmetric because α is not constrained to be equal to β, nor f(A − B) to f(B − A).
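Tversky's (1977) contrast model, S(A,B) = θ f(A ∩ B) − α f(A − B) − β f(B − A), maps directly onto set operations. The sketch below assumes that f simply counts features and uses hypothetical feature sets and weights (written θ, α, β in Tversky 1977) to show how unequal weights for the two sets of distinctive features produce asymmetric similarity.

```python
def contrast_similarity(a, b, theta=1.0, alpha=1.0, beta=1.0, f=len):
    """Contrast model: weighted common features minus weighted distinctive features."""
    a, b = set(a), set(b)
    return theta * f(a & b) - alpha * f(a - b) - beta * f(b - a)


# Hypothetical feature sets: a sparse variant and a feature-rich prototype.
variant = {"f1", "f2"}
prototype = {"f1", "f2", "f3", "f4"}

# Assumed weights: the features of the first argument (the subject of the
# comparison) count more than those of the second (the referent).
variant_to_prototype = contrast_similarity(variant, prototype, alpha=0.8, beta=0.2)
prototype_to_variant = contrast_similarity(prototype, variant, alpha=0.8, beta=0.2)
```

With these values the variant is judged more similar to the prototype (about 1.6) than the prototype is to the variant (about 0.4), because the prototype's two extra features are weighted by β in one direction and by α in the other.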
Ratio model • $S(A,B) = \frac{f(A \cap B)}{f(A \cap B) + \alpha f(A - B) + \beta f(B - A)}$, with α, β ≥ 0 • Similarity is normalized => S between 0 and 1.
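The ratio model (Tversky 1977) divides the measure of the common features by the sum of that measure and the weighted measures of the distinctive features, which bounds the result between 0 and 1. A minimal sketch, assuming f counts features; the feature sets are hypothetical, and returning 0.0 for two empty sets is a convention chosen here, not something the model prescribes.

```python
def ratio_similarity(a, b, alpha=0.5, beta=0.5, f=len):
    """Ratio model: common / (common + alpha * only-in-a + beta * only-in-b)."""
    a, b = set(a), set(b)
    common = f(a & b)
    denominator = common + alpha * f(a - b) + beta * f(b - a)
    # Assumed convention: two featureless entities get similarity 0.0.
    return common / denominator if denominator else 0.0


a = {"f1", "f2", "f3"}
b = {"f2", "f3", "f4"}

jaccard_like = ratio_similarity(a, b, alpha=1.0, beta=1.0)  # 2 / (2 + 1 + 1)
dice_like = ratio_similarity(a, b, alpha=0.5, beta=0.5)     # 2 / (2 + 0.5 + 0.5)
```

Setting α = β = 1 reproduces the Jaccard coefficient and α = β = 0.5 the Dice coefficient; choosing α ≠ β makes the measure asymmetric.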
Assertions • Similarity measurement is directional and asymmetric. • Model used to test Rosch's (1978) hypothesis that perceived distance from prototype to variant is larger than perceived distance from variant to prototype.
Matching-Distance Similarity Measure • Matching-Distance Similarity Measure (MDSM): context-sensitive, asymmetric semantic similarity measurement approach for geographic entity classes (Rodríguez and Egenhofer 2004). • Based on Tversky's contrast model. • Different kinds of features: features are classified by types (parts, functions, attributes).
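One way to picture a measure in this style is a weighted sum of per-type feature comparisons. The sketch below is a simplification under stated assumptions: it uses a ratio-style comparison per feature type with weights α and 1 − α for the two sets of distinctive features, and combines the types with fixed weights. The entity classes, their features, the type weights, and the α value are all hypothetical; how the published measure derives α and the context-dependent weights is not covered on this slide and is not modeled here.

```python
def ratio(a, b, alpha):
    """Ratio-style comparison of two feature sets for one feature type."""
    common = len(a & b)
    denominator = common + alpha * len(a - b) + (1 - alpha) * len(b - a)
    return common / denominator if denominator else 0.0


def mdsm_like_similarity(class_a, class_b, weights, alpha=0.5):
    """Weighted sum over feature types (assumed combination rule)."""
    return sum(
        weight * ratio(set(class_a.get(kind, ())), set(class_b.get(kind, ())), alpha)
        for kind, weight in weights.items()
    )


# Hypothetical entity classes, each described by parts, functions, attributes.
stadium = {
    "parts": {"field", "seats"},
    "functions": {"play", "watch"},
    "attributes": {"open-air"},
}
arena = {
    "parts": {"field", "seats", "roof"},
    "functions": {"play", "watch"},
    "attributes": {"covered"},
}

# Hypothetical context weights for the three feature types (sum to 1).
weights = {"parts": 0.5, "functions": 0.25, "attributes": 0.25}

symmetric = mdsm_like_similarity(stadium, arena, weights, alpha=0.5)
stadium_to_arena = mdsm_like_similarity(stadium, arena, weights, alpha=0.2)
arena_to_stadium = mdsm_like_similarity(arena, stadium, weights, alpha=0.2)
```

With α = 0.5 the result is the same in both directions; any other α makes the two directions differ, which is where the asymmetry of the measure comes from in this sketch.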
Discussion • Information retrieval: descriptions of query and data source concepts may differ greatly in their granularity: query concepts often focus on the most characteristic properties, whereas data source concepts are described broadly to be context-independent. • Query ‘flooding area’ (shape, relation to waterbodies) vs. data source ‘floodplain’ (additional hydrologic & ecologic properties) => distinct properties reduce similarity!
Problems with feature-based models • Features and dimensions are treated as unrelated, but in reality entities are not simply unstructured bags of features. • Also true for relations between entities!
Network Model – Semantic Network • Quillian’s Semantic Network: • organized as hierarchy • concepts inherit features • experiments with response time • graph distance as similarity measure (Collins and Quillian 1969)
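Graph distance in a hierarchical semantic network can be computed with a breadth-first search over the is-a links. The small hierarchy below and the function names are made up for illustration; treating the links as undirected and counting edges is an assumption about how the distance is measured.

```python
from collections import deque

# Hypothetical is-a hierarchy: child -> parent.
IS_A = {
    "canary": "bird",
    "ostrich": "bird",
    "shark": "fish",
    "salmon": "fish",
    "bird": "animal",
    "fish": "animal",
}


def build_graph(is_a):
    """Turn child -> parent links into an undirected adjacency mapping."""
    graph = {}
    for child, parent in is_a.items():
        graph.setdefault(child, set()).add(parent)
        graph.setdefault(parent, set()).add(child)
    return graph


def graph_distance(graph, start, goal):
    """Number of edges on the shortest path, or None if no path exists."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, steps = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor == goal:
                return steps + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, steps + 1))
    return None


graph = build_graph(IS_A)
canary_ostrich = graph_distance(graph, "canary", "ostrich")  # via bird
canary_shark = graph_distance(graph, "canary", "shark")      # via animal
```

A shorter path means the concepts are considered more similar; here the canary is closer to the ostrich than to the shark.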
Alignment-based models • Use commonalities and differences as the notion of similarity, but also include the relational structure of properties. • Motivation: similarity is like analogy. • Similarity involves structural alignment and mapping.
Two spatial scenes are each described by a set of features. The similarity between these scenes depends on the correct alignment of these features [Gentner et al. 1995, p. 114]
Transformational model • Transformations required to make one concept equal to another are defined. • Similarity depends on number of transformations needed to make concepts transformationally equal. • Example: Operations modifying the geometric arrangement are rotation, reflection, translation and dilation.
Transformational model • Similarity is assumed to decrease monotonically as the number of transformations increases. • Transformational model is asymmetric, but the metric axioms of minimality and triangle inequality hold.
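The idea of counting transformations can be sketched for short symbol sequences such as ABBA and AABB. The four operations below (rotate left, rotate right, reverse, swap the two symbols) are a hypothetical transformation set chosen for illustration, not the set used in the literature, and the breadth-first search simply finds the smallest number of these operations that turns one sequence into the other.

```python
from collections import deque


def rotate_left(s):
    return s[1:] + s[:1]


def rotate_right(s):
    return s[-1:] + s[:-1]


def reverse(s):
    return s[::-1]


def swap_symbols(s):
    return s.translate(str.maketrans("AB", "BA"))


# Hypothetical transformation set; each application counts as one step.
TRANSFORMATIONS = (rotate_left, rotate_right, reverse, swap_symbols)


def transformation_distance(source, target, max_steps=6):
    """Fewest transformations turning source into target, or None if not found."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        current, steps = queue.popleft()
        if steps == max_steps:
            continue
        for transform in TRANSFORMATIONS:
            candidate = transform(current)
            if candidate == target:
                return steps + 1
            if candidate not in seen:
                seen.add(candidate)
                queue.append((candidate, steps + 1))
    return None


abba_to_aabb = transformation_distance("ABBA", "AABB")  # one rotation suffices
abba_to_abab = transformation_distance("ABBA", "ABAB")  # unreachable with these operations
```

Under this operation set ABBA and AABB are one step apart, while ABAB cannot be reached from ABBA at all; a similarity score would then be any function that decreases monotonically with the step count.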