Delve into the world of statistics with Prof. K.K. Achary from Yenepoya Research Centre, Yenepoya University. Learn about the scope and applications of statistics, its historical significance, and key contributors in the field.
Lectures delivered to Ph.D. coursework students (July 2015 batch) by Prof. K.K. Achary, Yenepoya Research Centre, Yenepoya University Prof.K.K.Achary,YRC
Statistics – Definition & Scope • Scientific study of numerical data based on natural phenomena • Science of collecting and analysing numerical data in large quantities, especially for the purpose of drawing inferences and making decisions • Statistics is the study of the collection, organisation, analysis, interpretation and presentation of data.
Statistics is the science whereby inferences are made about specific random phenomena on the basis of relatively limited sample data. • Statistics is the science of learning from data, and of measuring, controlling and communicating uncertainty; it thereby provides the navigation essential for controlling the course of scientific and social advances (American Statistical Association)
The word ‘statistics’ is understood in two different ways • As a singular noun, it refers to the subject/discipline/branch of study • In the plural sense, it refers to collected facts or information, i.e. data or summaries based on data • When used in the singular sense, it is written as “Statistics”
What are the different views? • Mathematical Statistics: mainly deals with developing theories, models, techniques, computational algorithms, etc. • Applied Statistics: deals with the application of statistical methodology in different areas of study, mostly dealing with natural phenomena wherein numerical facts/data are observed on one or several aspects.
Examples – Applied Stat. • Anthropometry • Agricultural Statistics • Biometry/Biostatistics • Chemometrics • Econometrics • Environmetrics • Forestry Statistics/Fisheries Statistics • Geostatistics • Psychometry • Sociometrics • Technometrics • etc.
Etymology of the word • ‘Statistik’: German word meaning ‘science of state’ or ‘political arithmetic’ • ‘statisticum collegium’: Latin term meaning ‘council of states’ • ‘statista’: Italian word meaning ‘statesman’ • All these words relate to the ‘political state’ • 18th century origin • Historically, Statistics was the ‘science of statecraft’
What is Biostatistics? • Biostatistics deals with the application of statistical methods to biological/medical data to analyze, interpret and draw inferences/conclusions from the derived results. • It encompasses the design and analysis of biological experiments: randomised experiments and clinical trials in biology, medicine, pharmaceutical science, agricultural science, etc.
Early contributors who built the strong theoretical foundations of statistical theory and its applications came from different backgrounds: mostly mathematicians, engineers, geneticists, biologists, etc. • Most of them were from the UK and USA. • Indian statisticians have also made significant contributions • Sir Ronald Aylmer Fisher is called the ‘father of modern statistics’ • Prof. P.C. Mahalanobis is called the ‘father of statistics in India’
Sir Ronald Aylmer Fisher • A genius who almost single-handedly created the foundations for modern statistical science • Statistical Methods for Research Workers (1925) • Tests of significance, experimental design, etc.
Karl Pearson • Correlation coefficient • Chi-square test • Foundations of hypothesis testing • Pearson’s system of curves • Started Biometrika
Francis Galton • Regression theory • Psychometry • Inheritance of intelligence • Anthropometrics • Extinction of family names • Karl Pearson was his student
Florence Nightingale • Statistical graphics (used pie chart) • Polar area diagram • Mortality in the army due to poor sanitation • First elected female member of the Royal Statistical Society
William Sealy Gosset • Pen name “Student” • Student’s t-distribution & t-test • Design of experiments
Jerzy Neyman • Neyman-Pearson theory, which laid the foundation for testing statistical hypotheses • Stratified sampling • Confidence interval
Egon Pearson • Only son of Karl Pearson • Neyman-Pearson lemma • Likelihood ratio criterion
P.C. Mahalanobis • Father of modern statistics in India • Indian Statistical Institute (1932) • Sample surveys • Pilot survey concept • Mahalanobis distance • Founder Director of ISI
C.R. Rao • Cramér-Rao inequality • Rao-Blackwell theorem • Score test • Worked on most of the emerging areas • Eberly Professor at Pennsylvania State University • Director of ISI
Gopinath Kallianpur • Kallianpur-Kunita theorem • Kallianpur-Robbins law • Kallianpur-Striebel formula • Director of ISI • A Mangalorean
R.C. Bose • Block designs • Bose-Mesner algebra • Algebraic analysis and construction of block designs
A.N. Kolmogorov • Considered the father of modern probability theory • Axiomatic and measure-theoretic foundations of probability theory
Major contributions are in the areas of quality control, acceptance sampling and sampling theory
Gertrude Cox • Experimental designs • First female statistician elected to the International Statistical Institute
John W. Tukey • Cooley-Tukey algorithm • Exploratory data analysis • Box plot • Tukey’s test • Tukey’s lambda distribution • Coined the terms “bit” and “software”
J.B.S. Haldane • Geneticist & evolutionary biologist • Genetic linkage in mammals • Population genetics • Coined the term “clone”
Sewall Wright • Geneticist • Path analysis • Inbreeding coefficient • Distribution of gene frequencies (with R.A. Fisher & Haldane)
If you feel the subject is hard, then follow these tips: • Understand the basic concepts and relate them to your domain • Work out examples using simple data sets • You can learn statistics by working out a variety of examples from different areas of interest
The aim of statistics is twofold: • Descriptive statistics: summarizing and describing observed data such that the relevant aspects are made explicit. • Inferential statistics: studying to what extent observed trends/effects can be generalized to a general (infinite) population
Descriptive statistics • “Data reduction”: summarize data in compact form • Minimum • Maximum • Mean • Standard deviation • Range, etc. • Various types of visualisation tools: charts, graphs/plots
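The summary measures listed on this slide can be computed with a few lines of Python. The following is only a minimal sketch: the list `heights_cm` holds made-up heights chosen for illustration, not data from the lecture, and the standard deviation shown is the sample version (divisor n - 1).

```python
import statistics

# Hypothetical heights (cm) of ten individuals; illustrative values only.
heights_cm = [150, 160, 165, 170, 155, 175, 180, 160, 165, 170]

# "Data reduction": ten numbers are condensed into a handful of summaries.
summary = {
    "minimum": min(heights_cm),
    "maximum": max(heights_cm),
    "mean": statistics.mean(heights_cm),
    "std_dev": statistics.stdev(heights_cm),  # sample SD, n - 1 in the divisor
    "range": max(heights_cm) - min(heights_cm),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

The five numbers in `summary` describe only the observed sample; they make no claim about any wider population.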
Inferential statistics • Techniques make use of probability theory, probability distributions, sampling methods, etc. • Tests of hypothesis • ANOVA • Design of experiments • Model fitting and prediction, etc.
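As one possible illustration of a test of hypothesis, the sketch below runs a one-sample t-test by hand and builds the matching 95% confidence interval. Everything specific here is an assumption made for the example: the sample `heights_cm`, the hypothesised population mean of 160 cm, and the 5% significance level. The critical value 2.262 is the standard tabulated two-sided 5% point of Student's t-distribution with 9 degrees of freedom.

```python
import math
import statistics

# Hypothetical sample of ten heights (cm); illustrative values only.
heights_cm = [150, 160, 165, 170, 155, 175, 180, 160, 165, 170]
hypothesised_mean = 160  # assumed null-hypothesis value, chosen for the example

n = len(heights_cm)
sample_mean = statistics.mean(heights_cm)
std_error = statistics.stdev(heights_cm) / math.sqrt(n)

# Test statistic: how many standard errors the sample mean lies from H0.
t_stat = (sample_mean - hypothesised_mean) / std_error

# Two-sided 5% critical value of Student's t with n - 1 = 9 degrees of
# freedom, taken from standard tables.
T_CRITICAL_DF9 = 2.262
reject_h0 = abs(t_stat) > T_CRITICAL_DF9

# 95% confidence interval for the population mean.
ci_low = sample_mean - T_CRITICAL_DF9 * std_error
ci_high = sample_mean + T_CRITICAL_DF9 * std_error

print(f"t = {t_stat:.3f}, reject H0 at the 5% level: {reject_h0}")
print(f"95% CI for the mean: ({ci_low:.2f}, {ci_high:.2f})")
```

With these made-up numbers the test statistic stays below the critical value, so the sample gives no evidence at the 5% level against a population mean of 160 cm, and the confidence interval contains 160 accordingly.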
Data are distinct pieces of information, usually formatted in a special way. Data are collective information (a collection of facts or statistics) • Data is the plural of datum, a single piece of information. In practice, however, people use data as both the singular and plural form of the word. • In Statistics, data are generally numeric (quantitative) information, but nowadays they need not be!
Height of an individual: a single piece of information • Heights of a group of 50 individuals: collective information, or data/statistics of heights of individuals • You may consider data on name, gender, state of origin/native place, % of marks in the qualifying examination, marks in the entrance exam, etc. pertaining to the new batch of students admitted to YU. This is collective information from which we can extract a lot of additional information or knowledge. • What additional information or knowledge can you extract from these data?
Data are plain facts. When data are processed, organized, structured or presented in a given context so as to make them useful, they are called information. • Data by themselves are fairly useless. But when these data are processed to determine their true meaning, they become useful. This useful information can be called knowledge.
The history of temperature readings all over the world for the past 100 years is data. If these data are organized and analyzed to find that global temperature is rising, then that is information. • The number of visitors to a website by country is an example of data. Finding out that traffic from India is increasing while that from Australia is decreasing is meaningful information. • Data could be primary or secondary.
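The website example can be made concrete with a short sketch. The monthly visitor counts in `visits` are invented for illustration, and the helper `trend_slope` is a hypothetical name; it fits an ordinary least-squares line through the counts, and the sign of the slope turns the raw numbers (data) into a statement about the trend (information).

```python
# Hypothetical monthly visitor counts by country; illustrative values only.
visits = {
    "India": [120, 135, 150, 170, 190, 210],
    "Australia": [90, 85, 80, 70, 65, 60],
}


def trend_slope(counts):
    """Least-squares slope of counts against month index 0, 1, 2, ..."""
    n = len(counts)
    x_mean = (n - 1) / 2
    y_mean = sum(counts) / n
    numerator = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(counts))
    denominator = sum((x - x_mean) ** 2 for x in range(n))
    return numerator / denominator


def describe(slope):
    """Translate the sign of a slope into a plain-language trend."""
    if slope > 0:
        return "increasing"
    if slope < 0:
        return "decreasing"
    return "flat"


trend = {country: describe(trend_slope(counts)) for country, counts in visits.items()}

for country, label in trend.items():
    print(f"Traffic from {country} is {label}")
```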
Research data are data that are collected, observed, or created for purposes of analysis to produce original research results. • They could be primary or secondary. • Research data can be generated for different purposes and through different processes. • They can be divided into different types, depending on the study design.
Observational: data captured in real time, for example, sensor data, survey data, CT scan/MRI images. • Experimental: data from lab equipment, often reproducible, but can be expensive. For example, gene sequences, chromatograms. • Simulation: data generated from test models. For example, climate models, economic models. • Reference data sets: collections of smaller (peer-reviewed) datasets, most probably published and archived. For example, gene sequence databanks, chemical structures, economic databases, epidemiological databases, etc.
Measure or observe the characteristics of interest, such as: • height, weight, gender, BP, sugar level, cholesterol level • patient’s condition during admission to ICU • pain level before and after treatment • no. of days for recovery • anesthesia dose, etc. • family size, no. of siblings, family income, etc. • The characteristics may be qualitative or quantitative in nature.
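The distinction between qualitative and quantitative characteristics decides how each one is summarised. The sketch below assumes a small made-up set of patient records (`patients`) with two qualitative and two quantitative variables: qualitative variables are summarised by frequency counts, quantitative ones by mean and median.

```python
from collections import Counter
import statistics

# Hypothetical patient records; illustrative values only.
patients = [
    {"gender": "F", "pain_level": "mild", "recovery_days": 4, "weight_kg": 58.0},
    {"gender": "M", "pain_level": "severe", "recovery_days": 9, "weight_kg": 72.5},
    {"gender": "F", "pain_level": "moderate", "recovery_days": 6, "weight_kg": 61.0},
    {"gender": "M", "pain_level": "mild", "recovery_days": 5, "weight_kg": 80.0},
    {"gender": "F", "pain_level": "mild", "recovery_days": 6, "weight_kg": 65.5},
]

QUALITATIVE = ("gender", "pain_level")          # categories: count them
QUANTITATIVE = ("recovery_days", "weight_kg")   # numbers: average them

# Qualitative characteristics: frequency table per variable.
frequency_tables = {
    var: Counter(record[var] for record in patients) for var in QUALITATIVE
}

# Quantitative characteristics: mean and median per variable.
numeric_summaries = {
    var: {
        "mean": statistics.mean(record[var] for record in patients),
        "median": statistics.median(record[var] for record in patients),
    }
    for var in QUANTITATIVE
}

for var, table in frequency_tables.items():
    print(var, dict(table))
for var, stats in numeric_summaries.items():
    print(var, stats)
```

Taking the mean of a qualitative variable such as gender would be meaningless, which is why the type of each characteristic has to be settled before any analysis.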