Explore the geometric perspective on optimal transportation and generative models, including GANs and Wasserstein distance. Learn about key figures like Kantorovich and Monge, and concepts such as transportation plans and Kantorovich duality. Discover practical examples, competitive pricing strategies, c-convexity, and Lipschitz functions. Unpack the dual Kantorovich problem and delve into the Kantorovich-Rubinstein formula.
GAN & Optimal Transportation Zhizhong LI Nov 23, 2017
A geometric view of Optimal transportation and generative model • Na Lei, Kehua Su, Li Cui, Shing-Tung Yau, David Xianfeng Gu • References • Santambrogio, Filippo. "Optimal transport for applied mathematicians." Birkhäuser, NY (2015). • Villani, Cédric. "Optimal transport: old and new." Vol. 338. Springer Science & Business Media, 2008. • Goodfellow, Ian, et al. "Generative adversarial nets." Advances in Neural Information Processing Systems. 2014. • Arjovsky, Martin, Soumith Chintala, and Léon Bottou. "Wasserstein generative adversarial networks." International Conference on Machine Learning. 2017.
Glossary • Concepts • Definitions
Ian Goodfellow • Research scientist at Google Brain
Monge • (9 May 1746 – 28 July 1818) • Inventor of descriptive geometry (mathematical basis of technical drawing) • Father of differential geometry • Minister of the Marine
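Monge’s Problem • Given two densities of mass $f, g \ge 0$ on $\mathbb{R}^d$, with $\int f = \int g = 1$, • Find a map $T: \mathbb{R}^d \to \mathbb{R}^d$, pushing the first to the other, i.e. $\int_A g(y)\,dy = \int_{T^{-1}(A)} f(x)\,dx$ for every Borel set $A$, • And minimizing the cost $M(T) = \int |T(x) - x|\, f(x)\,dx$ • (note: the two densities are for earth and embankment)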
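Push-forward operator • If $\mu$ is a Borel measure on $X$, $T: X \to Y$ is a Borel map. • Then the push-forward of $\mu$ is a measure $T_\#\mu$ on $Y$, defined by $(T_\#\mu)(A) = \mu(T^{-1}(A))$ for every Borel set $A \subset Y$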
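For a discrete measure the push-forward simply relocates the mass at each point $x$ to $T(x)$. A minimal Python sketch, assuming the measure is stored as a dictionary from points to masses (the function name `push_forward` and the example map are illustrative, not from the slides):

```python
from collections import defaultdict

def push_forward(mu, T):
    """Push a discrete measure mu (dict: point -> mass) forward through the map T."""
    nu = defaultdict(float)
    for x, mass in mu.items():
        nu[T(x)] += mass  # all mass sitting at x lands at T(x)
    return dict(nu)

# Illustrative example: three atoms on the line, pushed through T(x) = x^2.
mu = {-1: 0.25, 0: 0.5, 1: 0.25}
nu = push_forward(mu, lambda x: x * x)
print(nu)
```

The atoms at $-1$ and $1$ are both sent to $1$, so their masses add up there, which is exactly $(T_\#\mu)(\{1\}) = \mu(T^{-1}(\{1\}))$.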
Measure-Preserving Map • $X, Y$: metric spaces with probability measures $\mu, \nu$, respectively. • $T: X \to Y$ is measure preserving if for every measurable set $B \subset Y$, $\mu(T^{-1}(B)) = \nu(B)$ • can be written as $T_\#\mu = \nu$
Monge’s Optimal Mass Transport • Given a transportation cost function $c: X \times Y \to \mathbb{R}$, find the measure preserving map $T: X \to Y$ that minimizes the total transportation cost $\int_X c(x, T(x))\,d\mu(x)$
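When both measures are uniform over the same number of distinct points, the measure preserving maps are exactly the bijections, so Monge's problem can be solved by brute force on tiny instances. The sketch below assumes that setting; the function name, the sample points and the cost $|x-y|$ are illustrative choices, and the factorial search is only meant to make the definition concrete.

```python
from itertools import permutations

def monge_bruteforce(xs, ys, cost):
    """Try every bijection xs[i] -> ys[perm[i]] and keep the cheapest one."""
    best_cost, best_perm = float("inf"), None
    for perm in permutations(range(len(ys))):
        # each point carries mass 1/n, so the total cost is an average
        total = sum(cost(x, ys[j]) for x, j in zip(xs, perm)) / len(xs)
        if total < best_cost:
            best_cost, best_perm = total, perm
    return best_cost, best_perm

xs = [0.0, 1.0, 3.0]   # support of mu (uniform weights)
ys = [2.0, 0.5, 4.0]   # support of nu (uniform weights)
best_cost, best_perm = monge_bruteforce(xs, ys, lambda x, y: abs(x - y))
print(best_cost, best_perm)
```

Here the optimal map sends 0 to 0.5, 1 to 2 and 3 to 4, i.e. it preserves the order of the points.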
Kantorovich • (19 January 1912 – 7 April 1986) • Founder of linear programming • Stalin Prize in 1949 • Nobel Memorial Prize in Economic Sciences in 1975
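Transportation Plan • Any strategy for sending $\mu$ to $\nu$ can be represented by a joint measure $\gamma$ on $X \times Y$, with $\gamma(A \times Y) = \mu(A)$ and $\gamma(X \times B) = \nu(B)$ • $\gamma(A \times B)$ is the amount of mass the transport plan moves from $A$ to $B$. • Equivalently, written in terms of the projections $\pi_x$ and $\pi_y$, $(\pi_x)_\#\gamma = \mu$, $(\pi_y)_\#\gamma = \nu$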
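Kantorovich’s Problem • Find the transportation plan that minimizes the transportation cost $W_c(\mu,\nu) = \min_\gamma \{ \int_{X \times Y} c(x,y)\,d\gamma(x,y) : (\pi_x)_\#\gamma = \mu,\ (\pi_y)_\#\gamma = \nu \}$ • It is a relaxation of Monge’s problem in that it allows more general movement: the mass at one point may be split. • $W_c$ is not a distance in general. E.g. when $c(x,y) = |x-y|^2$, the triangle inequality can fail.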
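A small worked example (not from the slides) of why the relaxation matters: a single Dirac mass cannot be mapped onto two atoms, but a transport plan can split it.

```latex
\[
\mu = \delta_0, \qquad
\nu = \tfrac12\,\delta_{-1} + \tfrac12\,\delta_{1}, \qquad
c(x,y) = |x-y| .
\]
% Monge: T_\# \delta_0 = \delta_{T(0)} is always a single Dirac mass,
% so no map T satisfies T_\# \mu = \nu.
\[
\gamma = \tfrac12\,\delta_{(0,-1)} + \tfrac12\,\delta_{(0,1)}, \qquad
\int c \, d\gamma = \tfrac12\,|0-(-1)| + \tfrac12\,|0-1| = 1 .
\]
```

Since $\gamma$ is the only measure on the product space with these two marginals, it is also the optimal plan, with cost 1.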
Roland Dobrushin • (July 20, 1929 – November 12, 1995) • A mathematician who made important contributions to probability theory, mathematical physics, and information theory. • The name "Wasserstein distance" was coined by R. L. Dobrushin in 1970.
Leonid Vaseršteĭn • Professor of Mathematics at Penn State University • He is well known for providing a simple proof of the Quillen–Suslin theorem, a result in commutative algebra.
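Wasserstein distance • When the cost is defined in terms of a distance, $c(x,y) = d(x,y)$, the optimal cost $W_c(\mu,\nu)$ is a distance between $\mu$ and $\nu$. Called the Wasserstein distance. In general, for $p \ge 1$, $W_p(\mu,\nu) = \left( \min_\gamma \int_{X \times Y} d(x,y)^p\,d\gamma(x,y) \right)^{1/p}$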
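On the real line, with two empirical measures of equal size and uniform weights, the optimal plan for $d(x,y)^p$, $p \ge 1$, pairs the sorted samples, so $W_p$ can be computed without any optimization. A minimal sketch under exactly those assumptions (the function name is illustrative):

```python
def wasserstein_1d(xs, ys, p=1):
    """W_p between two equal-size, uniformly weighted samples on the real line."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes samples of equal size")
    # in 1D the monotone (sorted) matching is optimal for |x - y|^p with p >= 1
    pairs = zip(sorted(xs), sorted(ys))
    return (sum(abs(x - y) ** p for x, y in pairs) / len(xs)) ** (1.0 / p)

w1 = wasserstein_1d([0.0, 1.0, 3.0], [2.0, 0.5, 4.0], p=1)
w2 = wasserstein_1d([0.0, 1.0], [2.0, 3.0], p=2)   # a pure translation by 2
print(w1, w2)
```

For a pure translation every point moves by the same amount, so $W_p$ equals the length of the shift for every $p$.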
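Dual Kantorovich Problem • The Kantorovich problem is a linear optimization under linear constraints. • The dual problem is $\sup_{\varphi,\psi} \{ \int_Y \psi\,d\nu - \int_X \varphi\,d\mu : \psi(y) - \varphi(x) \le c(x,y) \}$ • Where $\varphi \in L^1(\mu)$, $\psi \in L^1(\nu)$. • $\varphi$ is called the Kantorovich potential.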
Sun Wukong • 72 Di Sha transformation • Jin Dou Cloud • Eye of Truth
Example Explained • A company manages both bakeries (located at $x \in X$) and cafés (located at $y \in Y$). • The transportation cost of sending bread from bakery $x$ to café $y$ is $c(x,y)$. • The primal Kantorovich problem is to minimize the total transportation cost.
Example Explained – Competitive pricing • Suppose Sun Wukong wants to build a business like this: • 1) buy bread from bakeries (say, bakery $x$) at price $\varphi(x)$, and • 2) sell it to cafés (say, café $y$) at price $\psi(y)$ • A competitive pricing is $\psi(y) - \varphi(x) \le c(x,y)$ for all $x, y$
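Example Explained – inequality • With the help of Wukong, the transportation cost for the company is no higher than with the previous optimal transport plan: $\int_Y \psi\,d\nu - \int_X \varphi\,d\mu \le \min_\gamma \int_{X \times Y} c(x,y)\,d\gamma(x,y)$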
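The inequality follows in one line by integrating the competitive-pricing constraint against any transport plan. Written out, with $\varphi$ the buying price, $\psi$ the selling price, and $\gamma$ any plan with marginals $\mu$ and $\nu$:

```latex
\[
\int_Y \psi \, d\nu - \int_X \varphi \, d\mu
  = \int_{X \times Y} \bigl( \psi(y) - \varphi(x) \bigr) \, d\gamma(x,y)
  \le \int_{X \times Y} c(x,y) \, d\gamma(x,y) .
\]
```

The equality uses only the marginal conditions on $\gamma$, and the inequality uses $\psi(y) - \varphi(x) \le c(x,y)$. Since the bound holds for every plan, it holds in particular for the optimal one.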
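Example Explained – tight price • To get maximum profit, Wukong needs the pair of prices $(\varphi, \psi)$ to be tight, i.e. $\psi(y) = \inf_x \left( \varphi(x) + c(x,y) \right)$ and $\varphi(x) = \sup_y \left( \psi(y) - c(x,y) \right)$ • It would be great if he could achieve the upper bound in the previous slide’s inequality. • To do that, Wukong only needs to consider tight pairs. • Tight pairs only exist if $\varphi$ is $c$-convex.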
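convex • A function $\varphi$ is convex (and lower semicontinuous) if and only if it is a supremum of a family of affine functions, $\varphi(x) = \sup_y \left( x \cdot y + \psi(y) \right)$ • Which are parametrized by gradients $y$, and determined by intercept $\psi(y)$.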
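$c$-convex • Given a function $c: X \times Y \to \mathbb{R}$ • A function $\varphi: X \to \mathbb{R} \cup \{+\infty\}$ is $c$-convex if and only if there exists $\psi: Y \to \mathbb{R} \cup \{\pm\infty\}$, s.t. $\varphi(x) = \sup_y \left( \psi(y) - c(x,y) \right)$ • The graph of a $c$-convex function can be entirely caressed from below with a tool whose shape is the negative of the cost function.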
Lipschitz • (14 May 1832 – 7 October 1903) • Known for • Lipschitz continuity • Lipschitz integral condition • Lipschitz quaternion
Example of $c$-Convexity • $c(x,y) = -x \cdot y$, convex functions • $c(x,y) = d(x,y)$, $1$-Lipschitz functions
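$c$-transform • For a $c$-convex function $\varphi$, the $c$-transform is $\varphi^c(y) = \inf_x \left( \varphi(x) + c(x,y) \right)$ • $c(x,y) = -x \cdot y$, Legendre transform (up to sign, $\varphi^c = -\varphi^*$) • $c(x,y) = d(x,y)$, identity functor ($\varphi^c = \varphi$) • $\psi = \varphi^c$ for tight pairs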
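On finite grids the $c$-transform is a plain minimum, which makes the distance-cost case easy to check numerically. The sketch below assumes the convention $\varphi^c(y) = \min_x (\varphi(x) + c(x,y))$ and the cost $c(x,y) = |x-y|$ on an integer grid; all names and test functions are illustrative.

```python
def c_transform(phi, xs, ys, cost):
    """phi^c(y) = min over x in xs of (phi(x) + cost(x, y)), for each y in ys."""
    return {y: min(phi[x] + cost(x, y) for x in xs) for y in ys}

grid = list(range(-5, 6))
dist = lambda x, y: abs(x - y)

# A 1-Lipschitz function: its c-transform for c = |x - y| is the function itself.
phi = {x: 0.5 * abs(x - 1) for x in grid}
phi_c = c_transform(phi, grid, grid, dist)

# A function that is too steep (slope 3): the transform flattens it to slope 1.
steep = {x: 3 * abs(x) for x in grid}
steep_c = c_transform(steep, grid, grid, dist)

print(phi_c == phi, steep_c)
```

The first case illustrates the "identity" behaviour for $1$-Lipschitz functions. In the second, the steep function is not $c$-convex, and its transform is the $1$-Lipschitz function $|y|$.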
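Kantorovich Duality • Under mild conditions (e.g. $X$, $Y$ Polish spaces and $c$ lower semicontinuous and bounded below), there is duality $\min_\gamma \int_{X \times Y} c\,d\gamma = \sup_{\psi - \varphi \le c} \left( \int_Y \psi\,d\nu - \int_X \varphi\,d\mu \right) = \sup_{\varphi} \left( \int_Y \varphi^c\,d\nu - \int_X \varphi\,d\mu \right)$, with $\varphi^c(y) = \inf_x \left( \varphi(x) + c(x,y) \right)$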
Kantorovich potential • $\varphi$ is called the Kantorovich potential. • Once we know the optimal $\varphi$, we know the optimal cost $\int_Y \varphi^c\,d\nu - \int_X \varphi\,d\mu$.
G. S. Rubinstein • Professor
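Kantorovich-Rubinstein formula • When $X = Y$, $c(x,y) = d(x,y)$, and $c$-convexity is $1$-Lipschitz continuity, $W_1(\mu,\nu) = \sup_{\|\varphi\|_{\mathrm{Lip}} \le 1} \left( \int_X \varphi\,d\nu - \int_X \varphi\,d\mu \right)$ • Note that $\varphi$ is the Kantorovich potential. • This equation is the loss for the discriminator of WGAN.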
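The formula can be sanity-checked on a small one-dimensional example, where the primal value is available by sorting. The sketch assumes two equal-size samples with uniform weights; the candidate potentials are hand-picked $1$-Lipschitz functions chosen for illustration, not the output of any optimization.

```python
def w1_sorted(xs, ys):
    """Primal W_1 for equal-size, uniformly weighted samples on the line."""
    return sum(abs(x - y) for x, y in zip(sorted(xs), sorted(ys))) / len(xs)

def dual_value(phi, xs, ys):
    """Dual objective: integral of phi against nu minus integral against mu."""
    return sum(map(phi, ys)) / len(ys) - sum(map(phi, xs)) / len(xs)

xs = [0.0, 1.0, 3.0]   # samples of mu
ys = [2.0, 0.5, 4.0]   # samples of nu
primal = w1_sorted(xs, ys)

# A few 1-Lipschitz candidates; none may exceed the primal value.
candidates = {
    "identity": lambda t: t,
    "negated": lambda t: -t,
    "tent": lambda t: abs(t - 2.0),
    "half slope": lambda t: 0.5 * t,
}
values = {name: dual_value(phi, xs, ys) for name, phi in candidates.items()}
print(primal, values)
```

In this example all mass moves to the right, so $\varphi(t) = t$ attains the supremum, and the other candidates give strictly smaller values.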
Martin Arjovsky • PhD student at the Courant Institute of Mathematical Sciences.
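WGAN Discriminator Loss • $\max_w \ \mathbb{E}_{x \sim \mathbb{P}_r}[f_w(x)] - \mathbb{E}_{x \sim \mathbb{P}_g}[f_w(x)]$ • $\mathbb{P}_r$: data distribution • $\mathbb{P}_g$: generated distribution • $f_w$: discriminator network with weight clipping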
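To make the loss and the clipping step concrete, here is a toy sketch with a linear critic $f_w(x) = w \cdot x$ in plain Python. The linear critic, the plain gradient step, the learning rate and the batches are simplifying assumptions made for illustration; in practice the critic is a neural network trained with a stochastic optimizer.

```python
def critic(w, x):
    """Toy linear critic f_w(x) = <w, x> (a stand-in for a neural network)."""
    return sum(wi * xi for wi, xi in zip(w, x))

def critic_loss(w, real_batch, fake_batch):
    """Negative of E_real[f_w] - E_fake[f_w]; minimising it maximises the gap."""
    real = sum(critic(w, x) for x in real_batch) / len(real_batch)
    fake = sum(critic(w, x) for x in fake_batch) / len(fake_batch)
    return -(real - fake)

def clip_weights(w, c=0.01):
    """Weight clipping keeps w in a box, a crude way to bound the Lipschitz constant."""
    return [max(-c, min(c, wi)) for wi in w]

def critic_step(w, real_batch, fake_batch, lr=0.05, c=0.01):
    """One gradient-descent step on the loss, followed by clipping."""
    dim = len(w)
    mean_real = [sum(x[i] for x in real_batch) / len(real_batch) for i in range(dim)]
    mean_fake = [sum(x[i] for x in fake_batch) / len(fake_batch) for i in range(dim)]
    grad = [-(r - f) for r, f in zip(mean_real, mean_fake)]  # exact for a linear critic
    return clip_weights([wi - lr * gi for wi, gi in zip(w, grad)], c)

real_batch = [[1.0, 0.0], [3.0, 0.0]]
fake_batch = [[0.0, 0.0], [0.0, 2.0]]
w = critic_step([0.0, 0.0], real_batch, fake_batch)
print(w, critic_loss(w, real_batch, fake_batch))
```

After one step both weights hit the clipping bounds, and the loss drops below its initial value of 0, meaning the critic already separates the two batches.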
Brenier • (born January 1st 1957) • Centre de mathématiques Laurent Schwartz, • Ecole Polytechnique, FR-91128 Palaiseau, France
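Brenier potential • If $\mu$ is absolutely continuous on $\mathbb{R}^d$, and $c(x,y) = h(x-y)$, where $h$ is strictly convex, then the optimal transport plan • sends each $x$ to a unique destination $T(x)$, • for the quadratic cost $h(z) = \frac{1}{2}|z|^2$, $T = \nabla u$ is the gradient of a convex function $u$. This is called the Brenier potential. • Note $c(x,y) = |x-y|$ is not ok, since it is not strictly convex.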
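Relation between potentials $\varphi$ and $u$ • Note for the Kantorovich potential $\varphi$, $\varphi^c(y) = \inf_x \left( \varphi(x) + c(x,y) \right)$ • When $\varphi$ is optimal in the dual Kantorovich problem, the infimum is achieved at the points $x$ with $T(x) = y$. • For $c(x,y) = \frac{1}{2}|x-y|^2$, the Brenier potential is $u(x) = \frac{1}{2}|x|^2 + \varphi(x)$, with $T = \nabla u$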
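The link between the two potentials for the quadratic cost can be derived in a few lines. The derivation below assumes the sign convention $\varphi^c(y) = \inf_x (\varphi(x) + c(x,y))$ with $\psi = \varphi^c$, and that $\varphi$ is differentiable at $x$; under the opposite sign convention for $\varphi$ the same steps give $u = \frac12|x|^2 - \varphi$.

```latex
% Quadratic cost and optimal (tight) pair: equality holds along the transport.
\[
c(x,y) = \tfrac12 |x-y|^2, \qquad
\psi(T(x)) = \varphi(x) + \tfrac12 |x - T(x)|^2 .
\]
% So x minimises x' -> phi(x') + (1/2)|x' - y|^2 for y = T(x); first-order condition:
\[
\nabla \varphi(x) + x - T(x) = 0
\quad\Longrightarrow\quad
T(x) = \nabla \Bigl( \tfrac12 |x|^2 + \varphi(x) \Bigr) = \nabla u(x) .
\]
% Convexity of u: c-convexity of phi makes u a supremum of affine functions of x.
\[
u(x) = \tfrac12 |x|^2 + \sup_y \bigl( \psi(y) - \tfrac12 |x-y|^2 \bigr)
     = \sup_y \bigl( x \cdot y + \psi(y) - \tfrac12 |y|^2 \bigr) .
\]
```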
Minkowski • (22 June 1864 – 12 January 1909) • Created and developed the geometry of numbers. • He showed that the special theory of relativity of his former student Albert Einstein could be understood geometrically as a theory of four-dimensional space–time, since known as "Minkowski spacetime".
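Minkowski problem • Suppose $n_1, \dots, n_k$ are unit vectors which span $\mathbb{R}^n$, • $\nu_1, \dots, \nu_k > 0$ s.t. $\sum_{i=1}^k \nu_i n_i = 0$. Then • There exists a convex polytope $P \subset \mathbb{R}^n$ with $k$ codim-1 faces $F_1, \dots, F_k$, s.t., • $n_i$ are their outward normals, and • $\nu_i$ are their volumes.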
Alexandrov • (August 4, 1912 – July 27, 1999) • Stalin Prize (1942) • Lobachevsky International Prize (1951) • Euler Gold Medal of the Russian Academy of Sciences (1992)
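Alexandrov problem • $\Omega \subset \mathbb{R}^n$ compact convex polytope • $n_1, \dots, n_k \in \mathbb{R}^{n+1}$ distinct unit vectors, whose $(n+1)$-th coordinates are negative • $\nu_1, \dots, \nu_k > 0$, s.t. $\sum_{i=1}^k \nu_i = \mathrm{vol}(\Omega)$, then • There exists a convex polytope $P \subset \mathbb{R}^{n+1}$ with $k$ codim-1 faces $F_1, \dots, F_k$, s.t., • $n_i$ are the normals of the faces, • $\nu_i$ are the volumes of the intersection of $\Omega$ and the projected faces.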
Xianfeng Gu • Associate Professor • Department of Computer Science • Department of Applied Mathematics • State University of New York at Stony Brook
Feng Luo • Professor of Mathematics, Rutgers University
Jian Sun • Associate Professor at Tsinghua University
Shing-Tung Yau • (born April 4, 1949) • Director, The Institute of Mathematical Sciences, The Chinese University of Hong Kong • Department of Mathematics, Harvard University • Fields Medal in 1982 • Wolf Prize in Mathematics in 2010
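Convex Polytope by Linear function • Given $y_1, \dots, y_k \in \mathbb{R}^n$, and $h = (h_1, \dots, h_k) \in \mathbb{R}^k$, the graph of $u_h(x) = \max_i \{ \langle x, y_i \rangle + h_i \}$ • is a convex polytope in $\mathbb{R}^{n+1}$. • The projection induces a cell decomposition of $\mathbb{R}^n$, with cells $W_i(h) = \{ x : \nabla u_h(x) = y_i \}$. • Given a probability measure $\mu$ on $\Omega$, the volume of $W_i(h)$ is defined as $w_i(h) = \mu(W_i(h) \cap \Omega)$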
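The cell decomposition is easy to compute pointwise: a point belongs to the cell of whichever affine piece attains the maximum there. The sketch below estimates the cell volumes for the uniform measure on the unit square by counting midpoints of a regular grid; the grid quadrature, the choice of domain and all names are assumptions made for illustration.

```python
def cell_index(x, ys, h):
    """Index i of the affine piece <x, y_i> + h_i that attains u_h(x)."""
    values = [sum(a * b for a, b in zip(x, y)) + h_i for y, h_i in zip(ys, h)]
    return max(range(len(ys)), key=values.__getitem__)

def cell_volumes(ys, h, samples):
    """Estimate w_i(h) = mu(W_i(h) ∩ Ω) from equally weighted samples of mu."""
    counts = [0] * len(ys)
    for x in samples:
        counts[cell_index(x, ys, h)] += 1
    return [count / len(samples) for count in counts]

# Uniform measure on the unit square, approximated by a 20 x 20 grid of midpoints.
n = 20
samples = [((i + 0.5) / n, (j + 0.5) / n) for i in range(n) for j in range(n)]

ys = [(0.0, 0.0), (1.0, 0.0)]
h = [0.0, -0.5]            # u_h(x) = max(0, x_1 - 0.5)
volumes = cell_volumes(ys, h, samples)
print(volumes)
```

With these heights the two pieces meet along the line $x_1 = 0.5$, so each cell covers half of the square. Raising $h_2$ would move the boundary to the left and enlarge the second cell.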
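Gu-Luo-Sun-Yau • Let $\Omega$ be a compact convex domain in $\mathbb{R}^n$, $\mu$ a probability measure on $\Omega$, $y_1, \dots, y_k \in \mathbb{R}^n$ be distinct points, then • For any $\nu_1, \dots, \nu_k > 0$, $\sum_{i=1}^k \nu_i = 1$, there exists $h \in \mathbb{R}^k$, unique up to adding a constant, s.t. $w_i(h) = \nu_i$ for all $i$. • $\nabla u_h$ minimizes the quadratic cost $\frac{1}{2} \int_\Omega |x - T(x)|^2\,d\mu(x)$ • among transport maps $T_\#\mu = \nu$, where the Dirac measure $\nu = \sum_{i=1}^k \nu_i \delta_{y_i}$.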
Semi-discrete optimal mass transport • Transportation: given target locations $y_i$ with probability weights $\nu_i$, find the transportation map $T$. The transportation map is determined by the Brenier potential $u_h$, $T = \nabla u_h$. Each cell $W_i(h)$ is mapped to $y_i$. • Geometry: given normals $n_i$, each associated with a volume $\nu_i$, find the convex polytope satisfying the normal condition and the volume condition. The convex polytope is determined by the graph of $u_h$.
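Constructive result • Alexandrov’s proof is non-variational and non-constructive. • Gu-Luo-Sun-Yau’s theorem has a variational proof and produces an algorithm for finding $h$, and thus the Brenier potential $u_h$. • In fact, $h$ is the maximum point of the concave function $E(h) = \sum_{i=1}^k \nu_i h_i - \int_0^h \sum_{i=1}^k w_i(\eta)\,d\eta_i$ • On an open convex set $H = \{ h \in \mathbb{R}^k : w_i(h) > 0 \ \text{for all } i \}$.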
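Since the gradient of $E$ is $\partial E / \partial h_i = \nu_i - w_i(h)$, the heights can be found by any ascent method. The sketch below uses plain gradient ascent in the simplest possible setting: $\mu$ uniform on $[0,1]$ and targets on the real line, where the cells are intervals and $w_i(h)$ has a closed form. The one-dimensional setting, the fixed step size and the iteration count are assumptions chosen for illustration, not the algorithm of the cited paper.

```python
def cell_weights(ys, h):
    """Exact w_i(h) for mu uniform on [0, 1] and u_h(x) = max_i (x * y_i + h_i)."""
    weights = []
    for y_i, h_i in zip(ys, h):
        lower, upper = 0.0, 1.0
        for y_j, h_j in zip(ys, h):
            if y_j < y_i:      # piece i dominates piece j to the right of this point
                lower = max(lower, (h_j - h_i) / (y_i - y_j))
            elif y_j > y_i:    # piece i dominates piece j to the left of this point
                upper = min(upper, (h_i - h_j) / (y_j - y_i))
        weights.append(max(0.0, upper - lower))  # empty cells get weight 0
    return weights

def solve_heights(ys, nu, step=0.1, iters=500):
    """Gradient ascent on E(h), using dE/dh_i = nu_i - w_i(h)."""
    h = [0.0] * len(ys)
    for _ in range(iters):
        w = cell_weights(ys, h)
        h = [h_i + step * (nu_i - w_i) for h_i, nu_i, w_i in zip(h, nu, w)]
    return h

ys = [0.0, 0.5, 1.0]     # distinct target points
nu = [0.2, 0.3, 0.5]     # target weights, summing to 1
h = solve_heights(ys, nu)
w = cell_weights(ys, h)
print(h, w)
```

At convergence the cells are $[0, 0.2]$, $[0.2, 0.5]$ and $[0.5, 1]$, and the map $T = \nabla u_h$ sends each cell to its target point $y_i$. The heights are determined only up to a common additive constant, which this iteration fixes by keeping their sum at zero.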
Claims • What Does It Want To Say