Independent Component Analysis For Track Classification

• Seeding for Kalman Filter
• High Level Trigger
• Tracklets After Hough Transformation
A K Mohanty

Outline of the presentation
• What is ICA?
• Results (TPC as a test case)
• Why has ICA worked?
  a. Unsupervised linear learning
  b. Similarity with neural nets (both supervised and unsupervised)

Let me define the problem
• $m$: number of measurements
• $N$: number of tracks
• We have to pick the $N$ good tracks out of $N^m$ combinations.
$S = WX$
If the $s_i$ are independent, true tracks have certain characteristics that are not found in ghost tracks.
Find $W$, a matrix of $m$ rows and $m$ columns.

Definition of Independence
Consider any two random variables $y_1$ and $y_2$. They are independent if their joint density factorizes: $p(y_1, y_2) = p_1(y_1)\,p_2(y_2)$. The same holds for any number $n$ of variables.
This implies that independent variables satisfy $E\{f_1(y_1) f_2(y_2)\} = E\{f_1(y_1)\}\,E\{f_2(y_2)\}$ for any functions $f_1$ and $f_2$.
Uncorrelatedness is a weaker condition than independence. Two variables are uncorrelated if their covariance is zero: $E\{y_1 y_2\} - E\{y_1\}E\{y_2\} = 0$.
A fundamental restriction is that the independent components must be non-Gaussian for ICA to be possible.
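To make the gap between the two definitions concrete, here is a minimal numerical sketch. The pair $y_2 = y_1^2$ with $y_1$ uniform on $[-1, 1]$ is a textbook toy case chosen for illustration, not data from the talk; the test functions $f_1(y) = y^2$, $f_2(y) = y$ are likewise an assumed choice.

```python
import numpy as np

rng = np.random.default_rng(0)
y1 = rng.uniform(-1.0, 1.0, size=200_000)
y2 = y1 ** 2          # fully determined by y1, so clearly dependent

# Covariance E{y1 y2} - E{y1}E{y2}: close to zero, so the pair is uncorrelated.
cov = np.mean(y1 * y2) - np.mean(y1) * np.mean(y2)

# Independence check with f1(y) = y^2 and f2(y) = y: the factorization fails.
lhs = np.mean(y1 ** 2 * y2)               # E{f1(y1) f2(y2)}
rhs = np.mean(y1 ** 2) * np.mean(y2)      # E{f1(y1)} E{f2(y2)}
gap = lhs - rhs

print(f"covariance = {cov:.4f}, factorization gap = {gap:.4f}")
```

The covariance vanishes up to sampling noise while the factorization gap does not, so zero covariance alone cannot certify independence.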

How do we achieve Independence?
Define the mutual information $I$, which is related to the differential entropy $H$. Entropy is the basic concept of information theory.
A Gaussian variable has the largest entropy among all random variables of equal variance.
Look for a transformation whose output deviates from Gaussianity, measured for example by the kurtosis of a zero-mean variable $y$:
$K = E\{y^4\} - 3\,(E\{y^2\})^2$
A. Hyvärinen and E. Oja, Neural Networks, 13, 411, 2000
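The kurtosis formula can be checked numerically. The sketch below evaluates $K$ on three unit-variance samples; the choice of distributions (Gaussian, uniform, Laplace) and the helper name `kurtosis` are assumptions made for illustration only.

```python
import numpy as np

def kurtosis(y):
    """K = E{y^4} - 3 (E{y^2})^2, computed after removing the sample mean."""
    y = np.asarray(y, dtype=float)
    y = y - y.mean()
    return np.mean(y ** 4) - 3.0 * np.mean(y ** 2) ** 2

rng = np.random.default_rng(1)
n = 200_000
# All three samples are scaled to unit variance.
k_gauss = kurtosis(rng.normal(size=n))
k_uniform = kurtosis(rng.uniform(-np.sqrt(3), np.sqrt(3), size=n))
k_laplace = kurtosis(rng.laplace(scale=1 / np.sqrt(2), size=n))

print(f"Gaussian: {k_gauss:.2f}, uniform: {k_uniform:.2f}, Laplace: {k_laplace:.2f}")
```

The Gaussian sample gives a value near zero, the uniform sample a negative value and the Laplace sample a positive one, which is why the magnitude of $K$ can serve as a measure of non-Gaussianity.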

Steps Involved:
1. Centering: subtract the mean so that $X$ becomes a zero-mean variable.
2. Whitening: transform the observed vector $X$ to $Y = AX$, where $Y$ is white, i.e. its components are uncorrelated and have unit variance.
These two steps correspond to the Principal Component Transformation, where $A$ is the matrix that diagonalises the covariance matrix of $X$.
3. Choose an initial random weight vector $W$.
4. Let $W^+ = E\{Y g(W^T Y)\} - E\{g'(W^T Y)\}\,W$, where $g$ is a non-linear function and $g'$ its derivative.
5. Let $W = W^+ / \|W^+\|$.
6. If not converged, go back to step 4.
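The steps above can be sketched in NumPy as follows. This is an illustrative sketch, not the code used for the results in the talk: the non-linearity $g = \tanh$, the deflation step used to extract more than one component, the toy sources and the mixing matrix `M` are all assumptions introduced here.

```python
import numpy as np

def whiten(X):
    """Center X (rows = variables, columns = samples) and whiten it."""
    Xc = X - X.mean(axis=1, keepdims=True)            # step 1: centering
    cov = Xc @ Xc.T / Xc.shape[1]
    eigval, eigvec = np.linalg.eigh(cov)              # diagonalise the covariance
    A = np.diag(eigval ** -0.5) @ eigvec.T            # step 2: whitening matrix
    return A @ Xc, A

def fastica(Y, n_components, max_iter=200, tol=1e-8, seed=0):
    """Fixed-point iteration on whitened data Y with g = tanh (assumed choice)."""
    rng = np.random.default_rng(seed)
    W = np.zeros((n_components, Y.shape[0]))
    for k in range(n_components):
        w = rng.normal(size=Y.shape[0])               # step 3: random start
        w /= np.linalg.norm(w)
        for _ in range(max_iter):
            wy = w @ Y
            g, g_prime = np.tanh(wy), 1.0 - np.tanh(wy) ** 2
            w_new = (Y * g).mean(axis=1) - g_prime.mean() * w   # step 4
            w_new -= W[:k].T @ (W[:k] @ w_new)        # deflation: orthogonal to earlier rows
            w_new /= np.linalg.norm(w_new)            # step 5
            converged = abs(abs(w_new @ w) - 1.0) < tol
            w = w_new
            if converged:                             # step 6
                break
        W[k] = w
    return W

# Toy demonstration: two independent non-Gaussian sources, linearly mixed.
rng = np.random.default_rng(42)
n = 20_000
S_true = np.vstack([rng.uniform(-1, 1, n), rng.laplace(size=n)])
M = np.array([[1.0, 0.5], [0.3, 1.0]])                # hypothetical mixing matrix
X = M @ S_true

Y, A = whiten(X)
W = fastica(Y, n_components=2)
S_est = W @ Y

# Absolute correlation between each estimated and each true source.
corr = np.abs(np.corrcoef(np.vstack([S_est, S_true]))[:2, 2:])
print(np.round(corr, 3))
```

Each estimated component should correlate strongly with exactly one true source; the order and sign of the recovered components are arbitrary, which is a general property of ICA.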

X-Y Distribution
Projection of fast points on the X-Y plane. Only high-pT tracks are considered to start with. Only 9 rows of the outer sectors are taken.

Conformal Mapping
Circle → straight line, to reduce the combinatorics.
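The slide does not spell out the mapping. A common choice in track finding, assumed here, is $u = x/(x^2+y^2)$, $v = y/(x^2+y^2)$, which sends a circle through the origin with centre $(a, b)$ to the straight line $2au + 2bv = 1$. The sketch below checks this on a hypothetical circle; the centre and the sampled arc are arbitrary illustration values.

```python
import numpy as np

def conformal_map(x, y):
    """Map (x, y) to (u, v) = (x, y) / (x^2 + y^2)."""
    r2 = x ** 2 + y ** 2
    return x / r2, y / r2

# Points on a circle of centre (a, b) that passes through the origin.
a, b = 3.0, 4.0
radius = np.hypot(a, b)
phi = np.linspace(0.1, 1.0, 9)        # an arc that stays away from the origin
x = a + radius * np.cos(phi)
y = b + radius * np.sin(phi)

u, v = conformal_map(x, y)

# On the mapped points, 2 a u + 2 b v - 1 should vanish: they lie on a line.
residual = 2 * a * u + 2 * b * v - 1.0
print(np.max(np.abs(residual)))
```

Once the hits of a track lie on a straight line, candidate combinations can be tested with a linear fit instead of a circle fit.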

Generalized distance after PCA transformation (figure panels: Global, Tracklet I, Tracklet II, Tracklet III)

Global Tracking after PCA

In parameter space
At this stage the variables are only uncorrelated, not independent. They can be made independent by maximizing the entropy.

Figure panels: Independent, Uncorrelated
$A = w^T W$, where $W$ is a matrix and $w$ is a vector.

Figure panels: ICA transformation, PCA transformation

Figure panels: True Tracks, False Tracks

Why has ICA worked?
Linear neural net, unsupervised learning (layers: input, hidden, output)
• Principal Component Transformation (variables become uncorrelated)
• Entropy Maximization (variables become independent)

Non-Linear Neural Network (Supervised Learning)
Layers: input, hidden, output. The output is 1 if true, 0 if false.
• At each node, use a non-linear sigmoid function.
• Adjust the weight matrix so that the cost function is minimized.
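A minimal sketch of such a network is given below, assuming one hidden layer of four sigmoid units, a squared-error cost and plain gradient descent. The two-blob toy data, the layer sizes and the learning rate are hypothetical illustration values; the talk does not specify the architecture or the cost function that was used.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical toy data: class "true" (target 1) and class "false" (target 0).
rng = np.random.default_rng(3)
n = 200
X = np.vstack([rng.normal(+1.5, 1.0, (n, 2)), rng.normal(-1.5, 1.0, (n, 2))])
t = np.concatenate([np.ones(n), np.zeros(n)])

# Input layer (2) -> hidden layer (4 sigmoid nodes) -> output layer (1 sigmoid node).
W1 = rng.normal(0.0, 0.5, (2, 4))
b1 = np.zeros(4)
W2 = rng.normal(0.0, 0.5, 4)
b2 = 0.0

lr = 1.0
costs = []
for epoch in range(2000):
    h = sigmoid(X @ W1 + b1)                   # hidden activations
    out = sigmoid(h @ W2 + b2)                 # network output in (0, 1)
    err = out - t
    costs.append(0.5 * np.mean(err ** 2))      # squared-error cost

    # Backpropagate the cost gradient through both sigmoid layers.
    d_out = err * out * (1.0 - out) / len(t)
    d_h = np.outer(d_out, W2) * h * (1.0 - h)

    # Gradient-descent update of the weights and biases.
    W2 -= lr * (h.T @ d_out)
    b2 -= lr * d_out.sum()
    W1 -= lr * (X.T @ d_h)
    b1 -= lr * d_h.sum(axis=0)

accuracy = np.mean((out > 0.5) == (t == 1))
print(f"cost: {costs[0]:.3f} -> {costs[-1]:.3f}, accuracy: {accuracy:.2f}")
```

Thresholding the output at 0.5 turns the continuous network response into the true/false decision.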

Figure panels: Original Inputs, Independent Inputs
A neural net learns faster when the inputs are mutually independent. This is a basic and important requirement for any multilayer neural net.

Output of the neural net during training

Classification using the supervised neural net (figure panels: False, True)

Conclusions:
a. ICA has better discriminatory features, which can extract good tracks while either eliminating or minimizing the false combinatorics, depending on the multiplicity of the events.
b. ICA, which learns in an unsupervised way, can also be used as a preprocessor for more advanced non-linear neural nets to improve their performance.
