SVD - Detailed outline
• Motivation
• Definition - properties
• Interpretation
• Complexity
• Case studies
• Additional properties
SVD - Motivation
• problem #1: text - LSI: find ‘concepts’
• problem #2: compression / dimensionality reduction
Problem - specs
• ~10**6 rows; ~10**3 columns; no updates
• random access to any cell(s); small error: OK
SVD - Definition
A[n x m] = U[n x r] L[r x r] (V[m x r])T
• A: n x m matrix (e.g., n documents, m terms)
• U: n x r matrix (n documents, r concepts)
• L: r x r diagonal matrix (strength of each ‘concept’); r: rank of the matrix
• V: m x r matrix (m terms, r concepts)
SVD - Properties
THEOREM [Press+92]: it is always possible to decompose a matrix A into A = U L VT, where
• U, L, V: unique (*)
• U, V: column-orthonormal (i.e., columns are unit vectors, orthogonal to each other): UTU = I; VTV = I (I: identity matrix)
• L: diagonal, holding the singular values, non-negative and sorted in decreasing order
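These properties are easy to check numerically. Below is a minimal sketch with numpy (an assumption: the slides only say "any linear algebra package"); the toy document-term matrix is invented for illustration:

```python
import numpy as np

# toy document-term matrix: 7 documents x 5 terms (invented counts)
A = np.array([[1, 1, 1, 0, 0],
              [2, 2, 2, 0, 0],
              [1, 1, 1, 0, 0],
              [5, 5, 5, 0, 0],
              [0, 0, 0, 2, 2],
              [0, 0, 0, 3, 3],
              [0, 0, 0, 1, 1]], dtype=float)

# "economy" SVD: numpy keeps min(n, m) singular values;
# the slide's r counts only the non-zero ones
U, s, Vt = np.linalg.svd(A, full_matrices=False)

assert np.allclose(U @ np.diag(s) @ Vt, A)          # A = U L VT
assert np.allclose(U.T @ U, np.eye(U.shape[1]))     # UTU = I
assert np.allclose(Vt @ Vt.T, np.eye(Vt.shape[0]))  # VTV = I
assert np.all(s >= 0) and np.all(s[:-1] >= s[1:])   # non-negative, sorted
```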
SVD - Example
• A = U L VT - example: a document-term matrix (terms: data, inf., retrieval, brain, lung; documents: a CS group and an MD group)
• U is the doc-to-concept similarity matrix (CS-concept, MD-concept)
• the diagonal of L gives the ‘strength’ of each concept (e.g., of the CS-concept)
• VT is the term-to-concept similarity matrix
[figure: the numerical matrices A = U L VT of the running example, annotated as above]
SVD - Detailed outline
• Motivation
• Definition - properties
• Interpretation
• Complexity
• Case studies
• Additional properties
SVD - Interpretation #1
‘documents’, ‘terms’ and ‘concepts’:
• U: document-to-concept similarity matrix
• V: term-to-concept similarity matrix
• L: its diagonal elements give the ‘strength’ of each concept
SVD - Interpretation #2
• SVD gives the best axis to project on (‘best’ = minimum sum of squares of projection errors, i.e., minimum RMS error)
[figure: 2-d scatter plot of the data points, with the first singular vector v1 drawn as the best projection axis]
SVD - Interpretation #2
• A = U L VT - example: the variance (‘spread’) of the data is largest along the v1 axis
• U L gives the coordinates of the points on the projection axes
[figure: the numerical matrices of the running example, with v1 highlighted]
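A short numerical sketch of this point (toy matrix invented for illustration): the coordinates U L can be read off the factors, or obtained by projecting the rows of A onto the axes, since A V = U L VT V = U L:

```python
import numpy as np

# rows of A are points in 5-d term space (invented data)
A = np.array([[1, 1, 1, 0, 0],
              [2, 2, 2, 0, 0],
              [5, 5, 5, 0, 0],
              [0, 0, 0, 2, 2],
              [0, 0, 0, 3, 3]], dtype=float)

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# U L: coordinates of each point on the projection axes
coords = U * s                 # same as U @ np.diag(s)

# equivalently: project the rows of A onto the axes (the rows of VT)
assert np.allclose(coords, A @ Vt.T)
```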
SVD - Interpretation #2
• More details - Q: how exactly is dim. reduction done?
• A: set the smallest singular values to zero
[figure: the running example, with the smallest singular value set to zero]
SVD - Interpretation #2
[figure: after zeroing the smallest singular value, the corresponding column of U and row of VT drop out, and the product only approximates A (~ instead of =)]
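In code, "zeroing the smallest singular values" amounts to keeping only the leading columns/rows of the factors. A hedged sketch (the helper name low_rank is my own, not from the slides):

```python
import numpy as np

def low_rank(A, k):
    """Best rank-k approximation: keep only the k largest singular values."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 20))
A2 = low_rank(A, 2)
print(np.linalg.matrix_rank(A2))        # 2
print(np.linalg.norm(A - A2, 'fro'))    # approximation error
```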
SVD - Interpretation #2
Equivalent: ‘spectral decomposition’ of the matrix into a sum of r rank-1 terms (each ui is an n x 1 column, each vTi a 1 x m row):
A = l1 u1 vT1 + l2 u2 vT2 + ...
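A quick numerical check of the spectral form, rebuilding a matrix as the sum of its rank-1 terms li ui vTi (random data, for illustration only):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 4))

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# A as a sum of rank-1 terms: l1 u1 vT1 + l2 u2 vT2 + ...
B = sum(s[i] * np.outer(U[:, i], Vt[i]) for i in range(len(s)))
assert np.allclose(A, B)
```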
SVD - Interpretation #2
approximation / dim. reduction: keep only the first few terms (Q: how many?)
A ~ l1 u1 vT1 + l2 u2 vT2 + ... (assume: l1 >= l2 >= ...)
To do the mapping you use VT: x’ = VT x
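A sketch of that mapping step; the term ordering and the query vector below are invented placeholders:

```python
import numpy as np

# toy term space: [data, inf., retrieval, brain, lung] (invented ordering)
A = np.array([[1, 1, 1, 0, 0],
              [2, 2, 2, 0, 0],
              [0, 0, 0, 2, 2],
              [0, 0, 0, 3, 3]], dtype=float)
U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 2                                    # keep the two strongest concepts
x = np.array([1.0, 0.0, 1.0, 0.0, 0.0])  # query mentioning 'data', 'retrieval'
x_concepts = Vt[:k] @ x                  # x' = VT x, now in k-d concept space
print(x_concepts)
```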
SVD - Interpretation #2
A (heuristic - [Fukunaga]): keep 80-90% of the ‘energy’ (= sum of squares of the li’s)
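The energy heuristic translates directly into a few lines; the helper name choose_k and the 90% threshold below are illustrative choices:

```python
import numpy as np

def choose_k(s, energy=0.9):
    """Smallest k whose singular values retain the given fraction of the
    'energy' (sum of squares of the li's)."""
    cum = np.cumsum(s**2) / np.sum(s**2)
    return int(np.searchsorted(cum, energy)) + 1

s = np.array([10.0, 5.0, 1.0, 0.1])
print(choose_k(s))   # -> 2: the first two values carry >90% of the energy
```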
SVD - Interpretation #3
• finds non-zero ‘blobs’ in a data matrix
[figure: a block-structured 0/1 matrix and its decomposition into factors]
SVD - Interpretation #3
• Drill: find the SVD, ‘by inspection’! Q: rank = ??
• A: rank = 2 (2 linearly independent rows/cols)
• the column vectors of the factors guessed by inspection are orthogonal - but not unit vectors
• normalizing them to unit length pulls the scale factors out into the singular values
[figure: the drill matrix and its factors, filled in step by step]
SVD - Interpretation #3
• A: SVD properties:
• the matrix product should give back the matrix A
• matrix U should be column-orthonormal, i.e., columns should be unit vectors, orthogonal to each other
• ditto for matrix V
• matrix L should be diagonal, with positive values
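A sketch confirming the drill numerically; the exact values in the slide's figure are not recoverable, so the two-blob 0/1 matrix below is a stand-in:

```python
import numpy as np

# a 'two-blob' matrix like the drill's: rank should be 2
A = np.array([[1, 1, 1, 0, 0],
              [1, 1, 1, 0, 0],
              [0, 0, 0, 1, 1],
              [0, 0, 0, 1, 1]], dtype=float)

s = np.linalg.svd(A, compute_uv=False)
print(np.round(s, 3))   # [2.449 2. 0. 0.] -> exactly 2 non-zero values
```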
SVD - Complexity
• O(n * m * m) or O(n * n * m) (whichever is less)
• less work, if we just want the singular values
• or if we want only the first k left singular vectors
• or if the matrix is sparse [Berry]
• Implemented: in any linear algebra package (LINPACK, matlab, Splus, mathematica, ...)
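For the sparse case, a sketch using SciPy's iterative solver, which computes only the k largest singular triplets (SciPy is an assumption; the slide names other packages):

```python
import numpy as np
from scipy.sparse import random as sparse_random
from scipy.sparse.linalg import svds

# 10**4 x 10**3 sparse matrix with ~0.1% non-zeros
A = sparse_random(10_000, 1_000, density=0.001, random_state=0, format='csr')

# top-6 singular triplets only: far cheaper than a full dense SVD
U, s, Vt = svds(A, k=6)
print(np.sort(s)[::-1])   # svds returns the values in ascending order
```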
Optimality of SVD
Def: The Frobenius norm of an n x m matrix M is ||M||F = sqrt( sum over all i,j of mij^2 )
(reminder) The rank of a matrix M is the number of independent rows (or columns) of M
Let A = U L VT and Ak = Uk Lk VkT (the SVD approximation of A keeping the k largest singular values); Ak is an n x m matrix, Uk is n x k, Lk is k x k, and Vk is m x k
Theorem [Eckart and Young]: Among all n x m matrices C of rank at most k, Ak minimizes the error: ||A - Ak||F <= ||A - C||F
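A small empirical check of the theorem (random matrices, for illustration only): the truncated SVD's error equals the energy of the discarded singular values, and a random rank-k competitor never beats it:

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, k = 50, 30, 5
A = rng.standard_normal((n, m))

U, s, Vt = np.linalg.svd(A, full_matrices=False)
Ak = U[:, :k] @ np.diag(s[:k]) @ Vt[:k]

err_svd = np.linalg.norm(A - Ak, 'fro')
# the error equals the energy in the discarded singular values
assert np.isclose(err_svd, np.sqrt(np.sum(s[k:]**2)))

# any other matrix of rank at most k (here: a random one) does no better
C = rng.standard_normal((n, k)) @ rng.standard_normal((k, m))
assert err_svd <= np.linalg.norm(A - C, 'fro')
```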
Kleinberg’s Algorithm
• Main idea: in many cases, when you search the web with some terms, the most relevant pages may not contain those terms (or contain them only a few times)
• Harvard: www.harvard.edu
• Search engines: yahoo, google, altavista
• Authorities and hubs
Kleinberg’s algorithm
• Problem dfn: given the web and a query, find the most ‘authoritative’ web pages for this query
• Step 0: find all pages containing the query terms (root set)
• Step 1: expand by one move forward and backward (base set)
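A minimal sketch of Steps 0-1, assuming the web graph is given as adjacency dicts and a text index; match_query, out_links and in_links are hypothetical placeholders, not part of the slides:

```python
def base_set(query, match_query, out_links, in_links):
    """Root-set -> base-set expansion from Kleinberg's algorithm."""
    root = set(match_query(query))            # Step 0: pages containing the query terms
    base = set(root)
    for page in root:                         # Step 1: one move forward and backward
        base.update(out_links.get(page, ()))  # pages the root pages point to
        base.update(in_links.get(page, ()))   # pages pointing to the root pages
    return base
```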