Previously, in Stochastics 3:
• normal distribution
• standard deviation, measured data
• Gaussian square-root-of-n law
• confidence interval of measured data
• error probability, p-value, significance level
Slide 1. Prof. Dr. Dörte Haftendorn, Leuphana Universität Lüneburg, 2015 http://www.mathematik-sehen-und-verstehen.de
Stochastics 4 (English translation). Overview: how to do stochastics
• how to write measured data
• hypothesis testing with measured data
• regression, correlation
• elements of descriptive statistics
• more distributions
• empirical research
Stochastics is the umbrella term covering descriptive statistics, probability theory and inferential statistics. Slide 2
Hypothesis testing with measured values. Mathix suspects that his measuring device shows values that are too small. He plans a measurement of a physical quantity that is well known to be 20 mA, with sigma = 1.6 mA. His results are n = 4 values xi = {18, 19, 17, 18} mA. Calculate the measured value in the required form. Does his measuring device show significantly lower values? The required elements are shown on the following slide. Slide 3
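The arithmetic for Mathix's data can be sketched as follows. This assumes that the "required form" means mean plus/minus standard error, with the standard error taken from the known sigma = 1.6 mA via the square-root-of-n law; the transcript does not spell the form out, so treat the notation as an assumption.

```python
import math

# Data from the slide: known sigma and the four measured values, in mA.
sigma = 1.6
values = [18, 19, 17, 18]

n = len(values)
mean = sum(values) / n                  # sample mean
standard_error = sigma / math.sqrt(n)   # square-root-of-n law: sigma / sqrt(n)

# Assumed notation for the result: mean +/- standard error.
print(f"x = ({mean:.1f} +/- {standard_error:.1f}) mA")
```

With four values the standard error is half of sigma, which is why averaging already narrows the uncertainty noticeably.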
Hypothesis testing with measured values. The test is one-sided on the left, because he made this supposition before measuring. [Figure: distribution of single values and distribution of the means of such samples, with the measured value, the mean, the standard error, the retention region for H0 and the critical region marked.] If the mean lies in the critical region, you must reject H0 and accept H1. If the mean lies in the retention region, you cannot accept H1; nothing can be concluded. Slide 4
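A minimal sketch of the left-sided test for Mathix's data (20 mA expected, sigma = 1.6 mA, values 18, 19, 17, 18 mA). The significance level alpha = 0.05 is an assumption made for illustration; the slide does not state one in this transcript.

```python
import math
from statistics import NormalDist

mu0 = 20.0                  # known value of the quantity in mA (H0)
sigma = 1.6                 # known standard deviation of single values in mA
values = [18, 19, 17, 18]   # measured values in mA
alpha = 0.05                # assumed significance level (not from the slide)

n = len(values)
mean = sum(values) / n
standard_error = sigma / math.sqrt(n)

# Left-sided test: only small means speak against H0.
z = (mean - mu0) / standard_error
p_value = NormalDist().cdf(z)

# Left boundary of the retention region for the mean.
critical_mean = mu0 + NormalDist().inv_cdf(alpha) * standard_error
reject_h0 = mean < critical_mean

print(f"z = {z:.2f}, p-value = {p_value:.4f}, critical mean = {critical_mean:.2f} mA")
print("reject H0, accept H1" if reject_h0 else "H0 is retained, no conclusion")
```

Comparing the mean with the critical boundary and comparing the p-value with alpha are two views of the same decision; the sketch computes both so they can be checked against each other.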
Regression. Measured points are given. The goal is the "best" straight line through the point cloud. The error squares, also called residual squares, are shown in brown. Their sum is shown in blue on the left. This sum must be minimal. Slide 5
Regression. The line has two parameters, m and k, so the sum of the residual squares is a function of two variables, shown as a 3D surface. The goal is the minimum point of this surface (see Optimierung, p. 208 ff., and Stochastik, p. 259). Often the regression line can be found by visual judgement. Other regression curves are possible. In Excel and GeoGebra this is called a trend curve. Slide 6
Regression. The regression line. The parabolas are the same as those shown on the 3D surface. On the left are the x-variance, the y-variance and the mixed variance (covariance). Slide 7
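The formulas on this slide did not survive in the transcript. As a hedged sketch, the standard least-squares result for a line y = m·x + k is m = (mixed variance) / (x-variance) and k = mean(y) − m·mean(x). The data points and the function names below are hypothetical, chosen only for illustration.

```python
def regression_line(xs, ys):
    """Slope m and intercept k of the least-squares line y = m*x + k."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # x-variance and mixed variance (covariance); the factor 1/n cancels in m.
    var_x = sum((x - x_mean) ** 2 for x in xs) / n
    cov_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / n
    m = cov_xy / var_x
    k = y_mean - m * x_mean
    return m, k


def residual_square_sum(xs, ys, m, k):
    """Sum of the squared vertical distances between the points and the line."""
    return sum((y - (m * x + k)) ** 2 for x, y in zip(xs, ys))


# Hypothetical measured points.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

m, k = regression_line(xs, ys)
best = residual_square_sum(xs, ys, m, k)
print(f"y = {m:.2f} x + {k:.2f}, sum of residual squares = {best:.3f}")

# Any other choice of m or k gives a larger sum of residual squares.
print(residual_square_sum(xs, ys, m + 0.1, k) > best)
print(residual_square_sum(xs, ys, m, k + 0.1) > best)
```

Fixing one parameter and varying the other traces out a parabola in the sum of residual squares, which is the cross-section of the 3D surface mentioned on the slides.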
Regression, correlation coefficient. The slide gives the standard deviations and the correlation coefficient. [Figure: example point clouds labelled "measured points exactly on a straight line", "strong correlation", "weak correlation" and "almost no correlation"; the values shown include r = 0.974, r = 0.674 and r = −0.968.] Slide 8
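The formula for r is not preserved in the transcript. The usual definition is r = (mixed variance) / (standard deviation of x · standard deviation of y); the sketch below assumes that definition and uses hypothetical points.

```python
import math


def correlation_coefficient(xs, ys):
    """Correlation coefficient r = cov_xy / (s_x * s_y)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    var_x = sum((x - x_mean) ** 2 for x in xs) / n
    var_y = sum((y - y_mean) ** 2 for y in ys) / n
    cov_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / n
    return cov_xy / math.sqrt(var_x * var_y)


xs = [1, 2, 3, 4, 5]

# Hypothetical data: points exactly on a rising line, on a falling line,
# and a scattered cloud.
r_rising = correlation_coefficient(xs, [3, 5, 7, 9, 11])
r_falling = correlation_coefficient(xs, [10, 8, 6, 4, 2])
r_scattered = correlation_coefficient(xs, [4, 1, 5, 2, 4])

print(f"{r_rising:.3f} {r_falling:.3f} {r_scattered:.3f}")
```

Points exactly on a straight line give |r| = 1, and the sign of r follows the sign of the slope.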
Descriptive statistics: wrong presentations. Income, men and women. This one is the best, but 2006 should be placed further to the right. Figure b) is wrong because the y-axis begins at 1500 €. It therefore suggests a smaller ratio of the incomes than exists in reality (Stochastik, p. 258). Figure c) is wrong because one cannot read off the income of women directly. The figure would be suitable for showing the income of families with two earners. Slide 9
Descriptive statistics: wrong presentations. Income, men and women, the same data as before. d) Here the cube roots of the values shown in a) have been calculated. If the income is paid in euro coins, the cubes have exactly the correct volume. This figure is correct, but Excel users do not do it this way. Figure e) is wrong because the Excel user has taken the values shown in a) directly as edge lengths, so the volumes shown are incorrect. Consider: a cube with half the edge length has only one eighth of the volume. The figure with the icosahedrons is wrong for the same reason, and there the effect is even more obvious. Note: from a scientific point of view, 3D can be risky. Slide 10
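The cube argument can be checked with a few lines. The two income values below are hypothetical (the slide's numbers are not in the transcript); they are chosen so that the cube roots come out as round numbers.

```python
def correct_edge(value):
    """Edge length of a cube whose volume equals the value."""
    return value ** (1 / 3)


# Hypothetical values with a true ratio of 8 : 1.
small, large = 1000.0, 8000.0
true_ratio = large / small

# Wrong: the values themselves are used as edge lengths.
wrong_volume_ratio = large ** 3 / small ** 3

# Correct: the cube roots are used as edge lengths.
correct_volume_ratio = correct_edge(large) ** 3 / correct_edge(small) ** 3

print(f"true ratio: {true_ratio:.0f}")
print(f"volume ratio with values as edges: {wrong_volume_ratio:.0f}")
print(f"volume ratio with cube roots as edges: {correct_volume_ratio:.0f}")

# Half the edge length means one eighth of the volume.
print(0.5 ** 3)
```

Using the raw values as edge lengths cubes the ratio, so even a modest difference in income looks enormous in the 3D chart.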
Tables with four cells (not for the exam). Four-cell test according to Fisher. Strictly speaking there are more cells, but only four inner cells. There are two groups A and B with a and b elements. There are e elements with the feature E and ne elements without it (nE, not E). The table shows the distribution. If the ratio of elements with E is the same in both groups, the other corresponding ratios are equal too. Then the groups cannot be distinguished with respect to E. Null hypothesis H0: the groups cannot be distinguished with respect to E. Research hypothesis H1: the groups can be distinguished with respect to E; group B has significantly fewer elements with E than A (in the case above). Slide 11
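A minimal sketch of the one-sided Fisher test for such a table, assuming the usual approach with fixed margins and the hypergeometric distribution. The slide's table is not preserved, so the counts below are hypothetical, and the function name `fisher_one_sided` is chosen only for this illustration.

```python
from math import comb


def fisher_one_sided(e_in_a, e_in_b, a, b):
    """P(at most e_in_b elements with E fall into group B) under H0.

    a, b: sizes of groups A and B; e_in_a, e_in_b: elements with E in each.
    All margins are treated as fixed (hypergeometric distribution).
    """
    n = a + b
    e = e_in_a + e_in_b
    total = comb(n, b)
    favourable = sum(comb(e, k) * comb(n - e, b - k) for k in range(e_in_b + 1))
    return favourable / total


# Hypothetical table: A has 3 of 4 elements with E, B has 1 of 4.
p_value = fisher_one_sided(e_in_a=3, e_in_b=1, a=4, b=4)
print(f"one-sided p-value = {p_value:.4f}")
```

With such small groups the p-value stays well above common significance levels, so this hypothetical table would not allow H0 to be rejected.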
Four-cell tables (not for the exam). Example: A are the students who do the exercises. B are the students who passed the exam. Slide 12
Four-cell tables (not for the exam). I had prepared this example but then had to leave it out, so it is not relevant for the exam. It is, however, so interesting and so important for everyday life that I do not want to omit it. Situation: Mathilde goes for a screening examination for a disease K. The test result is positive, T+. That does not mean that Mathilde really has the disease. What is the probability that she is healthy despite T+? (Example from Sachs, Hedderich: Angewandte Statistik, Springer 2006, p. 135.) The specificity of the test is known: the probability that a healthy person indeed gets T−, P(T−|nK) = 94%. With this, all empty cells of the table can be filled. First the free cell on the right, 10000 − 150 = 9850, then (nK, T−) = 0.94 · 9850 = 9259. The rest follows by completing the sums. Then the sensitivity of the test can be calculated: P(T+|K) = 130 : 150 = 86.7%, the probability that a sick person gets T+. Sensitivity and specificity describe correct decisions. Mathilde hopes to be in the cell with 591, which contains the healthy people who had T+. The probability of a false-positive result is P(nK|T+) = 591 : 721 = 82%. Mathilde calmly awaits further tests. People often forget the prevalence, P(K) = 0.0150. Slide 13
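The table completion for Mathilde's screening example can be retraced in code. The inputs (10000 people, 150 sick, 130 sick with T+, specificity 94%) are the numbers quoted on the slide; the variable names are chosen here for illustration.

```python
# Numbers quoted on the slide.
total = 10_000
sick = 150
sick_positive = 130      # sick people with T+
specificity = 0.94       # P(T- | healthy)

# Fill the remaining cells of the four-cell table.
healthy = total - sick                              # 9850
healthy_negative = round(specificity * healthy)     # healthy people with T-
healthy_positive = healthy - healthy_negative       # false positives
sick_negative = sick - sick_positive                # false negatives
positive = sick_positive + healthy_positive         # everyone with T+

sensitivity = sick_positive / sick                  # P(T+ | sick)
p_healthy_given_positive = healthy_positive / positive
prevalence = sick / total

print(f"sensitivity           = {sensitivity:.1%}")
print(f"P(healthy | T+)       = {p_healthy_given_positive:.1%}")
print(f"prevalence P(K)       = {prevalence:.4f}")
```

The high share of healthy people among the positives comes from the low prevalence: the 6% of false alarms are taken from 9850 healthy people, while the true positives come from only 150 sick people.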
Data types, types of features
• nominal, qualitative data: colour of hair, religion, country of origin, family status
• ordinal data, rank data: can be ordered in a reasonable way: school grades, grades of compliance, scores in contests, difficulty grades of ski runs, credit points
• metric data, measured data
  • interval data: quantities without a natural zero, e.g. temperature; "double" is not meaningful
  • ratio data: quantities with a natural zero, so a ratio is meaningful, e.g. mass, length, time, number of successes; "double" is meaningful
Metric data are discrete or continuous. Slide 14
Benford's distribution. History: In 1881 Newcomb discovered, by looking at the books of logarithm tables in libraries, that the pages for the lower first digits were more worn than the others. Newcomb published this and deduced a logarithmic formula, but his work received no attention. In 1938 the physicist Frank Benford rediscovered the law and presented a large amount of data supporting it. He did not prove it. https://en.wikipedia.org/wiki/Benford's_law Slide 15
Benford's distribution. Not until 1995 did the mathematician Theodore Hill discover and prove further connections. Mark Nigrini created a computer program that analyses data in order to check the genuineness of data that should be "Benford-distributed". These are mainly data from exponential contexts, but aggregated data whose components are not themselves Benford-distributed also follow the Benford distribution. In this way it is possible to discover and demonstrate deception in business and banking data, scientific data and so on. Benford's law: the probability that the first digit is x is given by P(x) = lg(1 + 1/x), where lg is the logarithm with base 10. Slide 16
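Benford's law in its usual form gives the probability of the first digit x as lg(1 + 1/x). The sketch below tabulates these probabilities and, as an illustration of data from an "exponential context", compares them with the first digits of the powers of two; the choice of 2^1 to 2^1000 is an assumption made only for this example.

```python
import math
from collections import Counter


def benford_probability(digit):
    """Probability that the first digit equals `digit` (1..9)."""
    return math.log10(1 + 1 / digit)


probabilities = {d: benford_probability(d) for d in range(1, 10)}

# First digits of 2^1 ... 2^1000 as an example of exponential growth.
exponents = range(1, 1001)
first_digits = Counter(int(str(2 ** n)[0]) for n in exponents)
frequencies = {d: first_digits[d] / len(exponents) for d in range(1, 10)}

for d in range(1, 10):
    print(f"{d}: Benford {probabilities[d]:.3f}, powers of two {frequencies[d]:.3f}")
```

The nine probabilities add up to 1, and the digit 1 alone accounts for about 30% of all first digits, far more than the 11% a uniform distribution would give.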
Stochastics 4. Overview: how to do stochastics
• regression, correlation
• elements of descriptive statistics
• more distributions
• empirical research
Stochastics is the umbrella term covering descriptive statistics, probability theory and inferential statistics. Slide 17
Stochastics. I hope it was good for you! Lecture in four parts, held as part of Mathematik für alle, Leuphana Semester. Slide 18