Thesis proposals
Group of Applied Statistics
Models and methods for the analysis of complex and high-dimensional data, with applications to life-sciences
Laura M. Sangalli and Francesca Ieva
Functional data: where they come from
Explosive growth in the recording of complex and high-dimensional data, e.g., data having a functional nature (i.e., representable by curves, surfaces, dynamic curves and surfaces), non-Euclidean data, 2D and 3D images, and measures captured in time and space.
► images of the internal structures of a body provided by diagnostic medical scanners
• Magnetic Resonance Imaging of a brain during a reading task. Aston, Turkheimer, Brett (2006) Hum. Brain Map.
• Reconstruction of an internal carotid artery with aneurysm, from angiographic images. Sangalli, Secchi, Vantini, Veneziani (2009) J. R. Stat. Soc. Ser. C
► measurements of gene expression levels
• The ENCODE Project Consortium (2011) PLoS Biology
► large speech databases describing linguistic constructs expressed via spectrum data
• Aston, Chiou, Evans (2010) J. R. Stat. Soc. Ser. C
The analysis of complex and high-dimensional data poses new and challenging research problems. It is fueling one of the most fascinating and fastest-growing research fields of modern statistics.
► Advanced statistical and numerical methods for the analysis of high-dimensional functional data, spatial data and complex object data
• Study of hemodynamical signals over the cerebral cortex
• Study of the pathogenesis of atherosclerotic plaques: MAthematics for CARotid ENdarterectomy @ MOX
• Study of the pathogenesis of cerebral aneurysms
► Space/time regression models with partial differential equation regularization
► Estimation problem solved by resorting to Finite Elements or NURBS (iso-geometric analysis)
► Strong interplay of statistics with techniques from pure math (analysis and differential geometry), applied math (numerical analysis and scientific computing), and engineering
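The idea of regression with a differential penalty can be sketched in one dimension. The sketch below is only illustrative and is not the group's method or code: it assumes a unit-spaced 1D grid, one observation per grid point, and a discrete second-difference penalty used in place of a PDE-based penalty discretized with Finite Elements. The names `smooth`, `roughness` and `lam` are hypothetical.

```python
def second_differences(f):
    """Discrete stand-in for the second derivative on a unit-spaced grid."""
    return [f[i - 1] - 2.0 * f[i] + f[i + 1] for i in range(1, len(f) - 1)]


def roughness(f):
    """Penalty term: sum of squared second differences."""
    return sum(d * d for d in second_differences(f))


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def smooth(z, lam):
    """Minimize sum_i (z_i - f_i)^2 + lam * roughness(f) over f."""
    n = len(z)
    # Penalty matrix P = D^T D, where D maps f to its second differences.
    p = [[0.0] * n for _ in range(n)]
    for i in range(1, n - 1):
        stencil = {i - 1: 1.0, i: -2.0, i + 1: 1.0}
        for r, vr in stencil.items():
            for c, vc in stencil.items():
                p[r][c] += vr * vc
    # Setting the gradient to zero gives the linear system (I + lam * P) f = z.
    a = [[(1.0 if r == c else 0.0) + lam * p[r][c] for c in range(n)]
         for r in range(n)]
    return solve(a, list(z))
```

With `lam = 0` the estimate reproduces the data; larger values of `lam` trade fidelity for smoothness, and data lying exactly on a straight line are left unchanged because their second differences vanish.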
• Waste data over the Venice province
• Study of the pressure over a shuttle winglet [courtesy of Pierre Wilhelm, S3, Swiss Space Systems Holding SA]
• Crime locations in Portland, Oregon, in 2012
A FEW POSSIBLE TOPICS (many others can be considered, according to the candidate's interests)
► Extension to volume data, using volumetric FE (tetrahedral or hexahedral elements, or more advanced hybrid elements that might be particularly convenient for specific fields of application): possible applications e.g. to neurosciences, to the study of the neuroimaging signal in the brain volume, accurately complying with the complex shape of this organ
► Extension to complex data objects (e.g., manifold-valued data): possible applications in the neurosciences and earth sciences
► Extension to multiple realizations of the signals over the complex domain, to allow for population studies: possible applications in the neurosciences and other life sciences
► Estimation of the hyperparameters in the PDE regularizing term (connections with the field of uncertainty quantification in complex models governed by PDEs)
► Derivation of an explicit formulation for the space covariance structure implied by differential regularization
► Study of regularity conditions ensuring that the differential regularization provides a valid covariance model
► Study of asymptotic properties
[Available code: R package (with C++ code), Matlab]
► Thesis can be
• mostly or solely applied work
• mostly implementative work
• mostly methodological/theoretical work
• any desired mix of the above
► Thesis can be
• mostly on the statistical side
• mostly on the numerical analysis side (with co-supervision by a colleague in Numerical Analysis)
• any desired mix of the above
► Thesis can be combined with the course projects of
• Applied Statistics
• Advanced Programming For Scientific Computing
• Numerical Analysis for Partial Differential Equations
Data: pre-hospital phase, in-hospital phase, discharge phase
PROMETEO (PROgetto sull’area Milanese Elettrocardiogrammi Teletrasferiti dall’ExtraOspedaliero)
Multivariate functional data with dependent components
Pre-hospital phase
• Depth measures
• Robust statistics
• Nonparametric measures of dependency
• Bayesian functional clustering
Example thesis: Use of nonparametric measures of association for evaluating dependence patterns in multivariate functional data.
Progetto Scompenso (2012-2014)
Utilization of Regional Health Service Databases for evaluating epidemiology, short- and medium-term outcomes and process indexes for patients hospitalized for chronic heart failure.
Administrative databases play a central role in the epidemiological evaluation of health-care systems, due to their widespread diffusion and the low cost of information. There is increasing agreement among epidemiologists on the validity of disease and intervention registries based on administrative databases.
Aims: Using the hospital admissions data to gain insight into the economic burden of heart failure (HF), how it relates to patient characteristics and how it changes over time. Episodes of hospitalization for HF-related events can act as a surrogate for disease burden when assessing health-care provision. Joint models for mortality and hospital admissions allow for an improved understanding of the prognosis of patients and enable health-care providers to assess and manage the economic burden of diseases.
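Multi-State Models for hospitalization dynamics
Multi-state model with stages 1, 2, …, K+1: the transient states "k hosp" (k-th hospitalization, k = 1, …, K) and the absorbing state DEATH.
• from "k hosp" to "k+1 hosp": transition rate $\lambda_{k,k+1}(t, z(t))$, k = 1, …, K-1
• from "k hosp" to DEATH: transition rate $\mu_k(t, z(t))$, k = 1, …, K
Transition intensity matrix:
$$
\begin{pmatrix}
\lambda_{11} & \lambda_{12} & 0 & 0 & \cdots & 0 & \mu_1 \\
0 & \lambda_{22} & \lambda_{23} & 0 & \cdots & 0 & \mu_2 \\
0 & 0 & \lambda_{33} & \lambda_{34} & \cdots & 0 & \mu_3 \\
\vdots & & & \ddots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & 0 & \cdots & \lambda_{KK} & \mu_K \\
0 & 0 & 0 & 0 & \cdots & 0 & 0
\end{pmatrix}
$$
Focus: characteristic times of the hospitalization process.
• Time to the k-th hospitalization (k = 1, …, K)
• Transition rate to the absorbing state (death for any cause, in-hospital or external): $\mu_r(t, z(t))$
• Sojourn times in the transient states, $S_{rr}(t, z(t))$, which describe the length of stay in hospital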
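As a minimal sketch of the intensity matrix above, the code below builds it for a time-homogeneous Markov simplification. It assumes constant rates with no dependence on t or on the covariates z(t), unlike the rates on the slide, and it assumes that the diagonal entry of each transient state is the negative total exit rate (the usual generator convention). The names `generator_matrix` and `mean_sojourn_times` are hypothetical.

```python
def generator_matrix(lam, mu):
    """Build the (K+1) x (K+1) intensity matrix for K hospitalization states plus death.

    lam[k]: constant rate from hospitalization k+1 to hospitalization k+2 (length K-1)
    mu[k]:  constant rate from hospitalization k+1 to death (length K)
    """
    k = len(mu)
    if len(lam) != k - 1:
        raise ValueError("lam must have exactly one entry fewer than mu")
    q = [[0.0] * (k + 1) for _ in range(k + 1)]
    for r in range(k):
        if r < k - 1:
            q[r][r + 1] = lam[r]   # next hospitalization
        q[r][k] = mu[r]            # death (absorbing state, last column)
        q[r][r] = -sum(q[r])       # diagonal: minus the total exit rate
    return q                       # last row stays zero: death is absorbing


def mean_sojourn_times(q):
    """Expected time spent in each transient state: 1 / total exit rate."""
    return [-1.0 / q[r][r] for r in range(len(q) - 1)]


# Hypothetical rates (events per year) for K = 3 hospitalization states.
q = generator_matrix(lam=[0.5, 0.25], mu=[0.1, 0.2, 0.4])
sojourn = mean_sojourn_times(q)
```

Under these assumptions every row of `q` sums to zero, and the sojourn time in each transient state is exponentially distributed with mean `sojourn[r]`.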
Thesis proposals
• Clustering in multi-state models with nonparametric frailties: an application to the hospitalization process of patients with Heart Failure
• Analysis and modelling of care processes in patients with heart failure, based on administrative databases and clinical disease registries
• Evaluating the effect of Integrated Home Care and/or Intermediate Care Units on repeated all-cause hospitalizations in a cohort of patients hospitalized for Heart Failure in the Trieste area
• Network analysis of multi-morbidity patterns in HF patients
In collaboration with:
• Università di Trieste and Centro Cardiovascolare (Dr. Giulia Barbati)
• DEIB (Prof. Carlo Piccardi)
FARB & EDEN project
Public Management Research: Health and Education Systems Assessment
Research question
• Which characteristics of schools, districts and geographical areas drive school choice? How can we model the flow of students across different schools?
Data
• Hungarian data, students in 8th grade (HLCS first wave, NABC 2006, Census 2011, GEO), EdEN project
Method(s)
• Apply spatial network analysis, in which we model schools as geo-localized nodes of the network and flows of students as edges.
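The network representation described above can be sketched as follows. This is an illustrative toy, not the project's analysis: the school identifiers, coordinates and flow counts are made up, the function names are hypothetical, and the two summaries shown (students received per school, student-weighted mean distance) are just one possible choice.

```python
import math
from collections import defaultdict

EARTH_RADIUS_KM = 6371.0


def haversine_km(p, q):
    """Great-circle distance in km between two (latitude, longitude) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def in_strength(flows):
    """Total number of students received by each destination school."""
    totals = defaultdict(int)
    for _origin, destination, students in flows:
        totals[destination] += students
    return dict(totals)


def mean_flow_distance_km(schools, flows):
    """Average origin-destination distance, weighted by the number of students."""
    total_students = sum(n for _, _, n in flows)
    weighted = sum(n * haversine_km(schools[o], schools[d]) for o, d, n in flows)
    return weighted / total_students


# Nodes: hypothetical schools with made-up (latitude, longitude) coordinates.
schools = {"A": (47.50, 19.04), "B": (47.53, 21.63), "C": (46.25, 20.15)}
# Directed, weighted edges: (origin school, destination school, number of students).
flows = [("A", "B", 12), ("A", "C", 5), ("B", "C", 7), ("C", "A", 3)]

received = in_strength(flows)
average_km = mean_flow_distance_km(schools, flows)
```

Keeping coordinates on the nodes and counts on the edges lets spatial quantities such as distance enter the network summaries directly.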
Examples of theses
• Robust Statistical Methods in Functional Data Analysis
• Network analysis of comorbidity patterns in Heart Failure patients using administrative data
• Statistical Models for Count Variables: an application to the Hospitalization Processes of Patients with Heart Failure
• Statistical methods for dealing with missing data: an application to INVALSI data on students' achievements in maths
• Comparative Analysis of Depth Measures for Multivariate Functional Data
• Models for Predicting Readmissions in Heart Failure Patients: a Comparison between Lombardia and England
• Monitoring and Modeling hospital networks using administrative data
• The Analysis of Reading and Maths Pupils’ Achievements by means of Bivariate Multilevel Models
• Frailty Multi State Models for analysis of Heart Failure patients
• Statistical analysis of fNIRS data applied to cerebral hemodynamics
• Data Mining for High-Dimensional Categorical Data
• Hazard reconstruction and clustering for better prognosis of disease progression in heart failure
• Use of depth measures for multivariate functional data in the prediction of pathologies: an application to electrocardiographic signals
• Numerical and Statistical Methods for the Simulation and Validation of ECGs
• Multivariate Semiparametric Bayesian Models for Survival Probabilities following Acute Myocardial Infarction
Contacts
• Anna Maria Paganoni
• Piercesare Secchi
• Laura Sangalli
• Simone Vantini
• Francesca Ieva
• Alessandra Menafoglio