This is an example of a bad talk (Disclaimer: The paper that should have been presented in this talk is a classic in the field, a great paper: this talk, not the paper, is rotten).
On the Foundations of Relaxation Labeling Processes By An Anonymous Student
Overview • Motivation • I. Introduction to Labeling Problems • II. Continuous Relaxation Labeling Processes • III. Consistency • IV. Overview of Results • V. Average Local Consistency • VI. Geometric Structure of Assignment Space • VII. Maximizing Average Local Consistency • VIII. The Relaxation Labeling Algorithm • IX. A Local Convergence Result • X. Generalizations to Higher Order Compatibilities • XI. Comparisons with Standard Relaxation Labeling Updating Schemes • XII. Summary and Conclusions • Appendix A
Motivation • Two concerns: • The decomposition of a complex computation into a network of simple “myopic”, or local, computations • The requisite use of context in resolving ambiguities
Motivation • Relaxation operations: used to solve systems of linear equations, etc. • Relaxation labeling: • An extension of relaxation operations • Solutions involve symbols rather than functions • Weights are attached to labels • Main difference: labels do not necessarily have a natural ordering
Motivation • Algorithm: • Parallel • Each process makes use of context to assist in a labeling decision • Goal: • Provide a formal foundation • Characterize what the algorithm is doing, so that failures can be attributed to an inadequate theory rather than to the mechanics of the algorithm
Motivation • Treatment • Abstract • To relate discrete relaxation to a description of the usual relaxation labeling schemes • To develop a theory of consistency • To formalize its relationship to optimization • Several mathematical results
I. Introduction to Labeling Problems • In a labeling problem, one is given: • A set of objects • A set of labels for each object • A neighbor relation over the objects • A constraint relation over labels at pairs (or n-tuples) of neighboring objects • Solution: An assignment of labels to each object in a manner which is consistent with respect to the constraint relation
I. Introduction to Labeling Problems • λ: a variable used either to denote a label or to serve as an index through a set of labels • Λi: the set of labels attached to node i • Λij: the constraint relation listing all pairs (λ, λ′) such that λ at i is consistent with λ′ at j • m: the number of labels in Λi • n: the number of nodes in G • Si(λ): the support for label λ on i from a discrete labeling (counts the number of neighbors of object i that have labels compatible with a given label λ at i) • A max is used because more than one label can have value 1 at j
I. Introduction to Labeling Problems • Discrete relaxation • Label-discarding rule: discard a label λ at a node i if there exists a neighbor j of i such that every label λ′ currently assigned to j is incompatible with λ at i • Equivalently, a label is retained if at every neighboring node there exists at least one compatible label (a sketch of the rule follows below)
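A minimal Python sketch of the label-discarding rule. The data layout (label sets per node, a neighbor list, a compatible predicate) is an illustrative assumption, not the paper's notation.

```python
def discrete_relaxation(labels, neighbors, compatible):
    """labels: dict node -> set of candidate labels (pruned in place).
    neighbors: dict node -> list of neighboring nodes.
    compatible(i, lam, j, lam2): True if lam at i is consistent with lam2 at j.
    """
    changed = True
    while changed:                  # iterate until no label is discarded
        changed = False
        for i, candidates in labels.items():
            for lam in list(candidates):
                # Discard lam at i if some neighbor j has no currently
                # assigned label compatible with lam (the discarding rule).
                for j in neighbors[i]:
                    if not any(compatible(i, lam, j, lam2) for lam2 in labels[j]):
                        candidates.discard(lam)
                        changed = True
                        break
    return labels
```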
II. Continuous Relaxation Labeling Processes • Limitation of the formulation in I: • Pairs of labels are either compatible or completely incompatible • One cannot express a preference or a relative dislike • Solution: • Continuous relaxation labeling • Weighted values representing relative preferences
II. Continuous Relaxation Labeling Processes • Compatibility rij(λ,λ′): the relative support for label λ at object i that arises from label λ′ at object j • Positive: a locally consistent pair • Negative: an implied inconsistency • The magnitude of rij(λ,λ′) is proportional to the strength of the constraint • If i and j are not neighbors: rij(λ,λ′) = 0
II. Continuous Relaxation Labeling Processes • Difficulty: Formulating a consistent labeling • A consistent labeling is one in which the constraints are satisfied • Logical constraints replaced by weighted assertions: A new foundation is required to describe the structural framework and the precise meaning of the goal of consistency
II. Continuous Relaxation Labeling Processes • Structural frameworks attempted: • Define consistency as the stopping points of the algorithm • Circular; offers no insight • Regard the label weights as probabilities and use Bayesian analysis, statistical quantities, etc. • Unsuccessful; various independence assumptions are required • Optimization theory: a vector composed of the current label weights, an evidence vector involving each label's neighborhood weights • The authors extend this approach • Linear programming: constraints are obtained from arithmetical equivalents; preferences can be incorporated only by adding new labels • Different and interesting, and not incompatible with the authors' development
II. Continuous Relaxation Labeling Processes • Prototype (original) algorithm: • An iterative, parallel procedure analogous to the label-discarding rule used in discrete relaxation • For each object and each label, one computes the support qi(λ) = Σj Σλ′ rij(λ,λ′) pj(λ′) using the current assignment values pi(λ). Then new assignment values are defined according to pi(λ) ← pi(λ)[1 + qi(λ)] / Σλ′ pi(λ′)[1 + qi(λ′)] (sketched below)
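A sketch of this nonlinear updating rule, assuming compatibilities scaled so that 1 + qi(λ) stays nonnegative; the array shapes (r of shape (n, m, n, m), p of shape (n, m)) are illustrative.

```python
import numpy as np

def prototype_update(p, r):
    """One parallel update of all assignment weights.
    p: (n, m) current weights, each row summing to 1.
    r: (n, m, n, m) compatibilities r[i, lam, j, lam2]."""
    # Support q_i(lam) = sum_j sum_lam2 r_ij(lam, lam2) * p_j(lam2)
    q = np.einsum('iljm,jm->il', r, p)
    new = p * (1.0 + q)                          # reweight by support
    return new / new.sum(axis=1, keepdims=True)  # renormalize per object
```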
III. Consistency • Require a system of inequalities • Permit the logical constraints to be ordered, or weighted • Allow an analytic, rather than logical or symbolic, study • Definition of consistency: • For unambiguous labelings • For weighted labeling assignments
III. Consistency • Unambiguous labeling assignment: a mapping from the set of objects into the set of labels; each object is associated with exactly one label • Space of unambiguous labelings: K* = { p̄ = (p̄1, …, p̄n) : pi(λ) ∈ {0, 1}, Σλ pi(λ) = 1 for each i }
III. Consistency • Weighted labeling assignments: replace pi(λ) ∈ {0, 1} by the condition pi(λ) ≥ 0 (keeping Σλ pi(λ) = 1) • K is simply the convex hull of K*
III. Consistency • Consistency depends on constraints between pairs of labels: the compatibility matrix, whose elements rij(λ,λ′) indicate both positive and negative constraints • Definition 3.1: weighted labeling spaces require a support that varies continuously with the weights, so the max in the support function of I is replaced with a sum, giving the linear form si(λ) = Σj Σλ′ rij(λ,λ′) pj(λ′)
III. Consistency • Higher order combinations of object labels: • Multidimensional matrix of compatibilities: rijk(λ,λ′,λ″) • Support at object i for label λ: si(λ) = Σj,k Σλ′,λ″ rijk(λ,λ′,λ″) pj(λ′) pk(λ″) • Definition 3.2: the unambiguous labeling assigning λi to object i is consistent providing si(λi) ≥ si(λ) for all λ ∈ Λi, i = 1, …, n • Consistency in K* corresponds to satisfying a system of inequalities: Σλ pi(λ) si(λ) ≥ Σλ vi(λ) si(λ) for all v̄ ∈ K*, i = 1, …, n
III. Consistency • At a consistent unambiguous labeling, the support, at each object, for the assigned label is the maximum support at that object • Given a set of objects, labels, and support functions, there may be many consistent labelings • Condition for consistency in K* (restated): Σλ pi(λ) si(λ) ≥ Σλ vi(λ) si(λ) for all v̄ ∈ K*, i = 1, …, n
III. Consistency • Definition 3.3: p̄ ∈ K is consistent providing Σλ pi(λ) si(λ) ≥ Σλ vi(λ) si(λ) for all v̄ ∈ K, i = 1, …, n • Definition 3.4: p̄ is strictly consistent if the inequalities hold strictly for all v̄ ∈ K, v̄ ≠ p̄ • An unambiguous assignment that is consistent in K will also be consistent in K*, since K* ⊂ K. The converse is also true (Proposition 3.5)
III. Consistency • Proposition 3.5: An unambiguous labeling which is consistent in K* is also consistent in K.
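A small check of Definition 3.2 for an unambiguous labeling, reusing the illustrative array shapes from the earlier sketches.

```python
import numpy as np

def is_consistent_unambiguous(assign, r):
    """assign: length-n integer array, assign[i] = label chosen at object i.
    r: (n, m, n, m) compatibilities."""
    n, m = r.shape[0], r.shape[1]
    p = np.zeros((n, m))
    p[np.arange(n), assign] = 1.0        # the corresponding corner of K*
    s = np.einsum('iljm,jm->il', r, p)   # linear support s_i(lam)
    # Consistent iff the assigned label attains the maximum support everywhere.
    return all(s[i, assign[i]] >= s[i].max() for i in range(n))
```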
IV. Overview of Results • Algorithm for converting a given labeling into a consistent one • Two approaches: • Optimization theory • Finite variational calculus • Both lead to the same algorithm • Achieving consistency is equivalent to solving a variational inequality (Theorem 4.1): find p̄ ∈ K such that Σi Σλ si(λ)(vi(λ) − pi(λ)) ≤ 0 for all v̄ ∈ K
IV. Overview of Results • Two paths to study consistency and derive algorithms for achieving it.
V. Average Local Consistency • Goal: update a nearly consistent labeling to a consistent one • At each object, Σλ pi(λ) si(λ) should be large ⇒ the sum over all objects should be large ⇒ • The average local consistency A(p̄) = Σi Σλ pi(λ) si(λ) should be large • Two problems: • Maximizing a sum doesn't necessarily maximize each individual term • The individual components si(λ) depend on p̄, which varies during the maximization process
V. Average Local Consistency • Maximizing the average is the same as maximizing the sum A(p̄) = Σi Σλ pi(λ) si(λ), which is not the same as maximizing the n quantities Σλ pi(λ) si(λ) individually
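For concreteness, the functional being maximized, in the same illustrative array layout as above:

```python
import numpy as np

def average_local_consistency(p, r):
    """A(p) = sum_i sum_lam p_i(lam) * s_i(lam)."""
    s = np.einsum('iljm,jm->il', r, p)   # linear support (Definition 3.1)
    return float((p * s).sum())
```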
V. Average Local Consistency • Special case: when the compatibility matrix is symmetric, maximizing A(p̄) leads to consistent labeling assignments • General case: the compatibility matrix is not symmetric; VIII develops the algorithm for this case • Locally maximizing A(p̄) gives the same result as with the symmetrized matrix, since A depends only on the symmetric part of the compatibility matrix
V. Average Local Consistency • Gradient ascent: to find local maxima of a smooth functional A(p̄) by successively moving the current p̄ a small step to a new point p̄ + αū • The amount of increase in A is related to the directional derivative of A in the direction of the step • The gradient: grad A, with components ∂A/∂pi(λ)
V. Average Local Consistency • When the compatibilities are symmetric: • ∂A/∂pi(λ) = 2 Σj Σλ′ rij(λ,λ′) pj(λ′) = 2 si(λ) (cf. Definition 3.1) • q̄, with components qi(λ) = si(λ): the intermediate updating “direction”
VI. Geometric Structure of Assignment Space • Goal: to discuss gradient ascent on K, and to visualize the more general updating algorithms • A simple example: n = 2 objects, each with m = 3 possible labels (each object's weights lie in a 2-simplex)
VI. Geometric Structure of Assignment Space • The vector p̄: two points, each lying in a copy of the space shown in Fig. 2 • K: the set of all pairs of points in two copies of the triangular space in Fig. 2 • K with n objects, each with m labels: • Space: n copies of an (m−1)-simplex • K: the set of all n-tuples of points, each point lying in a copy of the (m−1)-dimensional surface • A weighted labeling assignment is a point in the assignment space K • An unambiguous labeling: one of the “corners” • Each simplex has m corners
VI. Geometric Structure of Assignment Space • Tangent space: the set of all directions in which one can move from a given point while remaining in K; a surface that lies “tangent” to K when placed at the given point • Because K is flat (convex), the tangent surface based at a point lies along K itself • At an interior point: the tangent set is an entire vector space • At a boundary point: the tangent set is a convex subset of a vector space
VI. Geometric Structure of Assignment Space • p̄: a labeling assignment in K • v̄: any other assignment in K • Difference vector (direction): ū = v̄ − p̄
VI. Geometric Structure of Assignment Space • Set of all tangent vectors at p̄ (as v̄ roams around K): Tp̄ = { ū = α(v̄ − p̄) : v̄ ∈ K, α ≥ 0 } • At an interior point p̄, the set of tangent vectors consists of an entire subspace: { ū : Σλ ui(λ) = 0, i = 1, …, n } (projection sketched below)
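At interior points, projecting any direction q̄ onto this subspace just removes each object's per-label mean; a one-line sketch:

```python
import numpy as np

def project_interior(q):
    """Project q (shape (n, m)) onto {u : sum_lam u_i(lam) = 0 for all i}."""
    return q - q.mean(axis=1, keepdims=True)
```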
VI. Geometric Structure of Assignment Space • When p̄ lies on a boundary of K, the tangent set is a proper subset of the space above: { ū : Σλ ui(λ) = 0 for all i, and ui(λ) ≥ 0 whenever pi(λ) = 0 }
VII. Maximizing Average Local Consistency • To find a consistent labeling: • Constraints symmetric: gradient ascent • Constraints not symmetric: the same algorithm (VIII) • The increase in A due to a small step of length α in the direction ū, ||ū|| = 1, is approximately the directional derivative α (grad A · ū); the greatest increase in A can be expected if the step is taken in the tangent direction ū that maximizes this derivative
VII. Maximizing Average Local Consistency • To find the direction of steepest ascent (Problem 7.1): maximize (grad A) · ū over tangent directions ū ∈ Tp̄ with ||ū|| ≤ 1 (a solution always exists)
VII. Maximizing Average Local Consistency • Lemma 7.3: if p̄ lies in the interior of K, then the following solves Problem 7.1: set ui(λ) = qi(λ) − (1/m) Σλ′ qi(λ′) and normalize to ||ū|| = 1 (or take ū = 0 if the projection vanishes) • May fail when p̄ is a boundary point of K (solved using the algorithm in Appendix A)
Appendix A. Updating Direction Algorithm • Gives an algorithm to replace the updating formulas in common use in relaxation labeling processes • Gives a projection operator (a finite iterative algorithm) based on the consistency theory and permitting proofs of convergence results • Solution to the projection problem: the returned vector ū • Normalization: ||ū|| = 1 (or ū = 0) • Step length: αi
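One finite active-set scheme with the same effect, applied per object: minimize ||u − q||² subject to Σλ u(λ) = 0 and u(λ) ≥ 0 wherever p(λ) = 0. This is a hedged sketch of such a projection, not the paper's exact Appendix A procedure.

```python
import numpy as np

def project_tangent_cone(q, p, tol=1e-12):
    """Project q (length m, one object) onto the tangent cone at p."""
    free = np.ones(len(q), dtype=bool)        # coordinates not clamped to 0
    while True:
        c = q[free].sum() / free.sum()        # enforces sum(u) = 0 on the free set
        u = np.where(free, q - c, 0.0)
        bad = free & (p <= tol) & (u < -tol)  # zero-weight coords pushed negative
        if not bad.any():
            return u
        free &= ~bad                          # clamp them and re-solve
```

Each pass clamps at least one coordinate, so the loop terminates after at most m iterations; at an interior point it reduces to the mean-subtraction sketched earlier.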
VII. Maximizing Average Local Consistency • Algorithm 7.4: finds consistent labelings when the matrix of compatibilities is symmetric • Successive iterates are obtained by moving a small step in the direction of the projection of the gradient • The algorithm stops when the projection is 0 (a sketch of the loop follows below)
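The loop, composed from the sketches above (with an asymmetric r the same iteration is Algorithm 8.2 of section VIII); the step size, iteration cap, and stopping tolerance are illustrative choices, not the paper's.

```python
import numpy as np

def relaxation_labeling(p, r, alpha=0.05, iters=500, tol=1e-8):
    """Projected-ascent iteration over the assignment space K."""
    p = np.array(p, dtype=float)
    for _ in range(iters):
        q = np.einsum('iljm,jm->il', r, p)   # updating direction q_i(lam)
        u = np.stack([project_tangent_cone(q[i], p[i]) for i in range(len(p))])
        if np.linalg.norm(u) < tol:          # projection vanished: stop
            return p
        p += alpha * u                       # small step along the projection
        p = np.clip(p, 0.0, None)            # numerical guard only
        p /= p.sum(axis=1, keepdims=True)
    return p
```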
VII. Maximizing Average Local Consistency • Proposition 7.5: suppose p̄ is a stopping point of Algorithm 7.4; then, if the matrix of compatibilities is symmetric, p̄ is consistent
VIII. The Relaxation Labeling Algorithm • The entire preceding analysis of average local consistency relies on the assumption of symmetric compatibilities • Compatibilities need not be symmetric in practice; example: constraints between adjacent letters in English • Theorem 4.1 (the variational inequality) remains fully general
VIII. The Relaxation Labeling Algorithm • Observation 8.1: with q̄ defined as above, the variational inequality is equivalent to the statement that a labeling p̄ is consistent iff q̄ · ū ≤ 0 for every ū ∈ Tp̄, i.e., q̄ points away from all tangent directions • Algorithm 8.2 (The Relaxation Labeling Algorithm): the projected-step iteration of VII, with q̄ now computed from arbitrary (possibly asymmetric) compatibilities
VIII. The Relaxation Labeling Algorithm • Proposition 8.3: suppose p̄ is a stopping point of Algorithm 8.2; then p̄ is consistent • Questions: • Are there any consistent labelings for the relaxation labeling algorithm to find? (answered by 8.4) • Assuming that such points exist, will the algorithm find them? (answered in IX) • Even if a relaxation labeling process converges to a consistent labeling, is the final labeling better than the initial assignment? (not well defined as posed)
VIII. The Relaxation Labeling Algorithm • The English-letters example again • Proposition 8.4: the variational inequality of Theorem 4.1 always has at least one solution; thus consistent labelings always exist, for arbitrary compatibility matrices • Usually, more than one solution will exist
IX. A Local Convergence Result • As the step size of relaxation labeling Algorithm 7.4 or 8.2 becomes infinitesimal, these discrete algorithms approximate a continuous dynamical system • Hypothesis of Theorem 9.1: the initial labeling at every object is close to a strictly consistent assignment p̄
IX. A Local Convergence Result • Assume that p̄ is strictly consistent in order to prove that it is a local attractor of the relaxation labeling dynamical system • If p̄ is consistent, but not strictly consistent, it may be: • A local attractor of the dynamical system • A saddle point • An unstable stopping point
X. Generalizations to Higher Order Compatibilities • Consistency can be defined using support functions that depend on compatibilities of arbitrary order: • First-order compatibilities: si(λ) = ri(λ), a fixed bias independent of the other labels • Third-order: si(λ) = Σj,k Σλ′,λ″ rijk(λ,λ′,λ″) pj(λ′) pk(λ″) • Symmetry condition (second order): rij(λ,λ′) = rji(λ′,λ)
X. Generalizations to Higher Order Compatibilities • k-th order compatibilities: si(λ) = Σ over j1,…,jk−1 and λ1,…,λk−1 of rij1…jk−1(λ,λ1,…,λk−1) pj1(λ1)⋯pjk−1(λk−1) • Symmetry condition: r is invariant under permutations of the (object, label) pairs (jt, λt)
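A sketch of the third-order support as a tensor contraction, with an illustrative array r3[i, lam, j, lam2, k, lam3]; each higher order adds one more (object, label) pair to the contraction:

```python
import numpy as np

def support_third_order(p, r3):
    """s_i(lam) = sum_{j,k} sum_{lam2,lam3} r3 * p_j(lam2) * p_k(lam3)."""
    return np.einsum('iljakb,ja,kb->il', r3, p, p)
```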