ALTRO data preparation and clustering algorithms

Marco Villa – CERN, 24th May 2011

Outline
• ALTRO data clean-up:
  • Data analysis paradigm & clean-up framework
  • Implementing the clean-up
  • Selecting the cuts
  • Choosing the distributions
• Clustering algorithms:
  • Purpose & boundary conditions
  • Agglomerative hierarchical approach
  • Current implementation

ALTRO data clean-up

Paradigm & framework
Data analysis paradigm and clean-up framework
• Different readout electronics used (ALTRO, APV, BNL)
• Interesting data from all electronics
• Not all chambers fully tested with all electronics
• Redundant info in ALTRO
→ A common data format will avoid analysis code replication

Implementing the clean-up
• ALTRO ntuples have 2 sets of values for each firing channel: high gain and low gain
• Each signal is fitted and fit results are stored
• Clean-up must take care of selecting the best charge and time value for each strip:
  • Use fit charge from high gain
  • Use fit charge from low gain (rescaled)
  • Use sample charge from high gain
  • Use sample charge from low gain (rescaled)
  • Item unusable (label as “-1”)
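The selection above can be sketched as a simple fallback chain. This is only an illustration: it assumes the order of the list is the order of preference, that a candidate which failed validation is passed as `None`, and that low-gain values are rescaled by a single multiplicative factor. The names `select_charge` and `gain_ratio` are hypothetical and do not come from the slides.

```python
from typing import Optional

UNUSABLE = -1.0  # the slides label unusable items as "-1"


def select_charge(
    fit_high: Optional[float],
    fit_low: Optional[float],
    sample_high: Optional[float],
    sample_low: Optional[float],
    gain_ratio: float,
) -> float:
    """Return the first valid candidate, in the order listed on the slide.

    A candidate is None when it failed validation. Low-gain values are
    multiplied by gain_ratio (a hypothetical high/low gain ratio) so that
    every returned charge is on the high-gain scale.
    """
    candidates = (
        fit_high,
        None if fit_low is None else fit_low * gain_ratio,
        sample_high,
        None if sample_low is None else sample_low * gain_ratio,
    )
    for value in candidates:
        if value is not None:
            return value
    return UNUSABLE
```

For example, a strip whose high-gain fit was rejected but whose low-gain fit survived would return the rescaled low-gain fit charge, and a strip with no valid candidate would return `-1.0`.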

Selecting the cuts
• Item validation through cuts:
  • For all high gain values:
    • Overflow cut @ qs = 1000 (ADC saturation @ 1023)
  • For all fit values:
    • Low tau cut @ τ = 2
    • High tau cut @ τ = 4
    • Fitness cut (F = fit, S = sample, P = pedestal):
      • F - S + P = 0 (my original distribution)
      • F / (S - P) = 1
      • S / (F + P) = 1 (Dimos’ distribution)
      • P / (S - F) = 1
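A minimal sketch of these cuts follows. The thresholds are the ones quoted on the slide; everything else is an assumption made for illustration: that qs is the sample charge in ADC counts, that the boundary values themselves are treated as shown in the comments, and that the fitness cut accepts items within some tolerance of the ideal value (the slide does not give a tolerance). All function names are hypothetical.

```python
import math

OVERFLOW_QS = 1000  # overflow cut from the slide (ADC saturates at 1023)
TAU_MIN = 2.0       # low tau cut from the slide
TAU_MAX = 4.0       # high tau cut from the slide


def passes_overflow(sample_charge: float) -> bool:
    """High-gain values only. Assumption: exactly 1000 is rejected."""
    return sample_charge < OVERFLOW_QS


def passes_tau(tau: float) -> bool:
    """Fit values only. Assumption: both boundaries are accepted."""
    return TAU_MIN <= tau <= TAU_MAX


def _ratio(numerator: float, denominator: float) -> float:
    """Ratio that yields NaN instead of raising on a zero denominator."""
    return numerator / denominator if denominator != 0 else math.nan


def fitness_variables(fit: float, sample: float, pedestal: float) -> dict:
    """The four candidate fitness distributions listed on the slide.

    All four sit at their ideal value (0 or 1) when F = S - P.
    """
    return {
        "F-S+P": fit - sample + pedestal,
        "F/(S-P)": _ratio(fit, sample - pedestal),
        "S/(F+P)": _ratio(sample, fit + pedestal),
        "P/(S-F)": _ratio(pedestal, sample - fit),
    }


def passes_fitness(fit: float, sample: float, pedestal: float,
                   tolerance: float) -> bool:
    """Fitness cut on F - S + P, with a hypothetical tolerance window."""
    return abs(fit - sample + pedestal) <= tolerance
```

The four variables agree only at the ideal point; away from it they weight the fit, sample and pedestal differently, which is why the choice of distribution matters for where the cut is placed.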

Choosing the distributions (1)
[Plots: high gain value and low gain value distributions; annotation: 132 %]
Runs 4857, 4858, 4861, 4867, 4868: R12, Ar:CO2 85:15, 0° angle, 570 / 870 V

Clustering algorithms

Purpose & boundary conditions
• Purpose: coding a software module that, given a data file in “standard format”, produces an output file with data and cluster information
• Boundary conditions: 0° impact angle

Agglomerative hierarchical method
• Hierarchical clustering seeks to build a hierarchy of clusters. It can be of 2 types:
  • Divisive: top-down approach, in which all observations start in one cluster, and splits are performed recursively
  • Agglomerative: bottom-up approach, in which each observation starts in its own cluster, and clusters are merged (in pairs)
• Using an agglomerative algorithm with custom step-0 clustering (primary clusters)
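To make the bottom-up idea concrete, here is a generic one-dimensional sketch, not the implementation presented in these slides. It assumes observations are positions along a single axis, uses single linkage (the distance between two clusters is the gap between their nearest members), and stops when the closest pair is farther apart than a hypothetical `max_gap` parameter.

```python
def agglomerate(points, max_gap):
    """Bottom-up clustering of 1D points with single linkage.

    Every point starts in its own cluster. At each step the closest pair
    of clusters is merged, until the smallest gap exceeds max_gap.
    """
    clusters = [[p] for p in sorted(points)]
    while len(clusters) > 1:
        # With sorted 1D data the closest pair is always two neighbours,
        # so only gaps between adjacent clusters need to be compared.
        gaps = [clusters[i + 1][0] - clusters[i][-1]
                for i in range(len(clusters) - 1)]
        i = min(range(len(gaps)), key=gaps.__getitem__)
        if gaps[i] > max_gap:
            break
        clusters[i:i + 2] = [clusters[i] + clusters[i + 1]]
    return clusters
```

Calling `agglomerate([1, 2, 3, 10, 11, 20], 1.5)` merges the first three points and the next two, and leaves the last point alone.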

Current implementation
• Step-0: primary clusters are formed from neighboring firing strips
• Iterative merging (in pairs):
  • if clusters are “close”:
    • if both clusters have unitary size: ask user
    • if only one cluster has unitary size: check amplitude
    • if no cluster has unitary size:
      • Set proper starting points and boundaries for fits
      • Fit both clusters with a Gaussian
      • if Gaussians are not resolved then merge clusters
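Two pieces of this procedure can be sketched without the fitting machinery. The slides do not define how strips are indexed, nor what "resolved" means numerically, so both are assumptions here: strips are taken to be integer indices, with neighbors differing by exactly 1, and two fitted Gaussians are called resolved when their means are separated by more than `n_sigma` times the sum of their widths. The function names and the `n_sigma` parameter are hypothetical.

```python
def primary_clusters(firing_strips):
    """Step-0: group neighboring firing strips into primary clusters.

    Assumption: strips are integer indices and two strips are neighbors
    when their indices differ by exactly 1.
    """
    clusters = []
    for strip in sorted(set(firing_strips)):
        if clusters and strip == clusters[-1][-1] + 1:
            clusters[-1].append(strip)  # extends the current cluster
        else:
            clusters.append([strip])    # starts a new cluster
    return clusters


def gaussians_resolved(mean_a, sigma_a, mean_b, sigma_b, n_sigma=1.0):
    """Hypothetical resolution criterion for two fitted Gaussians.

    Returns True when the peaks are far enough apart to be kept as
    separate clusters; False means the two clusters would be merged.
    """
    return abs(mean_a - mean_b) > n_sigma * (sigma_a + sigma_b)
```

With this criterion, two unit-width Gaussians centered one strip apart would be merged, while the same two Gaussians five strips apart would stay separate. The actual threshold is one of the parameters that would need tuning.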

Conclusions & outlook
• ALTRO data clean-up:
  • Framework ready
  • Selection works and produces clean output files in “standard format”
  • Timing correction can be implemented
• Clustering algorithms:
  • Framework ready
  • Primary clustering works fine
  • Hierarchical clustering works in most cases and only needs some parameter tuning
