TransCAD Tabulations and Statistics
Tabulations and Statistics
• Many applications that utilize geographic and tabular data call for computing measures of associations and relationships among entities.
• For each numeric field in the dataview, TransCAD finds:
  • The number of records with a value for the field
  • The sum of all the values
  • The highest value
  • The lowest value
  • The average value
  • The standard deviation
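The six per-field measures listed above can be sketched in a few lines of Python. This is only an illustration, not TransCAD's implementation: the function name `field_summary`, the use of `None` for a record with no value, and the choice of the sample (n - 1) standard deviation are all assumptions.

```python
import math

def field_summary(values):
    """Summarize one numeric field; None marks a record with no value (assumption)."""
    present = [v for v in values if v is not None]
    n = len(present)
    if n == 0:
        return {"count": 0, "sum": 0.0, "max": None, "min": None,
                "mean": None, "std": None}
    total = sum(present)
    mean = total / n
    # Sample standard deviation (n - 1 denominator); this choice is an
    # assumption, and the value is undefined when fewer than two records exist.
    if n > 1:
        std = math.sqrt(sum((v - mean) ** 2 for v in present) / (n - 1))
    else:
        std = None
    return {"count": n, "sum": total, "max": max(present),
            "min": min(present), "mean": mean, "std": std}

# Hypothetical field: one of the five records has no value.
incomes = [42.0, 55.0, None, 61.0, 38.0]
summary = field_summary(incomes)
print(summary)
```

The record with no value is left out of the count, so the count can be smaller than the number of records in the dataview.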
Tabulations and Cross-Tabulations
• TransCAD creates both one-way and two-way tabulations of data stored in any dataview. A tabulation counts the number of records with certain values for one or more data fields.
• Tabulations and cross-tabulations are excellent means of analyzing and comparing data. By interactively selecting the break points, you can explore the shape of the distributions of values for a single variable, or generate hypotheses concerning the relationship between any two variables.
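As a rough sketch of what one-way and two-way tabulations with user-chosen break points compute, the Python below bins each record and counts the bins. The field names, break points, and helper names (`bin_label`, `tabulate`, `cross_tabulate`) are hypothetical, and the convention that a break point belongs to the interval above it is an assumption.

```python
from bisect import bisect_right
from collections import Counter

def bin_label(value, breaks):
    """Label the interval containing value, given sorted break points."""
    i = bisect_right(breaks, value)
    if i == 0:
        return f"< {breaks[0]}"
    if i == len(breaks):
        return f">= {breaks[-1]}"
    return f"{breaks[i - 1]} to < {breaks[i]}"

def tabulate(records, field, breaks):
    """One-way tabulation: count records per interval of one field."""
    return Counter(bin_label(r[field], breaks) for r in records)

def cross_tabulate(records, row_field, row_breaks, col_field, col_breaks):
    """Two-way tabulation: count records per pair of intervals."""
    return Counter(
        (bin_label(r[row_field], row_breaks), bin_label(r[col_field], col_breaks))
        for r in records
    )

# Hypothetical household records.
records = [
    {"income": 25, "autos": 0},
    {"income": 48, "autos": 1},
    {"income": 52, "autos": 2},
    {"income": 90, "autos": 2},
    {"income": 31, "autos": 1},
]
one_way = tabulate(records, "income", [30, 60])
two_way = cross_tabulate(records, "income", [30, 60], "autos", [1, 2])
print(one_way)
print(two_way)
```

Changing the break-point lists and rerunning is the scripted counterpart of selecting break points interactively.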
Correlation Matrices
• The correlation is a measure of the amount of correspondence between the values of two fields in a map layer or dataview. A large positive correlation indicates that the values of the two fields tend to increase or decrease together. A large negative correlation indicates that the value of one field tends to increase when the other decreases, and vice versa. The correlation measure is scaled so that the values are always between –1 and 1.
Correlation Matrices
• A correlation of exactly 1 (or –1) indicates that the two fields are perfectly linearly related: each value of one field is an exact linear function of the other, with a positive (or negative) slope. A correlation near zero (exactly zero is very rare) indicates that there is no predictable linear relation between the values of the two fields. In this case the fields are said to be uncorrelated, although a nonlinear relation may still exist.
• The correlation matrix can be used to examine the inputs to a linear model, where a variable that is highly correlated with another contributes little new information to the model and interferes with the ability of the model to discern the relative importance of variables.
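A correlation matrix of this kind can be sketched with the Pearson product-moment formula, which is assumed here to be the measure meant. The field names and values are invented for illustration, and a field with constant values would cause a division by zero in this sketch.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    # Not guarded against constant fields (sxx or syy equal to zero).
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(fields):
    """Nested dict holding the correlation of every pair of fields."""
    names = list(fields)
    return {a: {b: pearson(fields[a], fields[b]) for b in names} for a in names}

# Hypothetical fields: trips is exactly 2 * hh_size + 1,
# vacancy is exactly 12 - 2 * hh_size.
fields = {
    "hh_size": [1, 2, 3, 4, 5],
    "trips": [3, 5, 7, 9, 11],
    "vacancy": [10, 8, 6, 4, 2],
}
matrix = correlation_matrix(fields)
for name, row in matrix.items():
    print(name, {k: round(v, 3) for k, v in row.items()})
```

Because `trips` and `vacancy` are exact linear functions of `hh_size`, their correlations with it sit at the two ends of the scale, up to floating-point rounding.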
Multiple Linear Regression Models
• Regression is perhaps the most widely used statistical tool for determining relationships among data fields. You use the regression model to answer questions like these:
  • How does the number of home-based work trips vary based on household size, income, auto ownership, and retail employment?
  • How does employment relate to the population in several age categories?
  • How does retail market potential relate to average household income, population by race, and occupation classification?
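To make the idea concrete, here is a minimal sketch of fitting a multiple linear regression by ordinary least squares, solving the normal equations with Gaussian elimination. It is not TransCAD's estimation routine; the helper names (`solve`, `fit_ols`) and the toy data, in which trips are constructed as exactly 1 + 2 * household size + 0.5 * autos, are assumptions.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_ols(xs, y):
    """Return [intercept, b1, b2, ...] minimizing the sum of squared errors."""
    rows = [[1.0] + list(r) for r in xs]  # leading 1.0 carries the intercept
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

# Hypothetical records: (household size, autos) and home-based work trips.
xs = [(1, 0), (2, 1), (3, 1), (4, 2), (2, 2), (5, 3)]
y = [3.0, 5.5, 7.5, 10.0, 6.0, 12.5]
coef = fit_ols(xs, y)
print([round(c, 6) for c in coef])
```

With real data the fit is not exact, and two highly correlated independent variables make the normal equations nearly singular, which is one reason to inspect a correlation matrix before estimating.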
Binary Logit Model
• The binary logit model is used to model dependent variables that take only the values 0 and 1. For this reason, it is often used to describe the relationship between a choice of two alternatives and other variables on which the choice may depend. You use the binary logit model to answer questions like the following:
  • How is the decision to use transit influenced by transit fare, the difference between transit and auto travel time, and the difference between the number of licensed drivers and the number of autos?
  • How is the choice to go to college or not influenced by family characteristics?
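A binary logit model can be sketched as a logistic function of a linear combination of the independent variables, with parameters chosen to maximize the log-likelihood. The sketch below uses plain gradient ascent, which is a simple stand-in and not necessarily the estimator TransCAD uses; the data, step size, and function names are assumptions.

```python
import math

def sigmoid(z):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logit(xs, y, rate=0.1, steps=5000):
    """Return [intercept, b1, ...] found by gradient ascent on the log-likelihood."""
    rows = [[1.0] + list(r) for r in xs]
    beta = [0.0] * len(rows[0])
    n = len(rows)
    for _ in range(steps):
        grad = [0.0] * len(beta)
        for r, t in zip(rows, y):
            p = sigmoid(sum(b * v for b, v in zip(beta, r)))
            for j, v in enumerate(r):
                grad[j] += (t - p) * v  # gradient of the log-likelihood
        beta = [b + rate * g / n for b, g in zip(beta, grad)]
    return beta

def predict(beta, x):
    """Predicted probability that the dependent variable equals 1."""
    return sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], x)))

# Hypothetical data: x is transit minus auto travel time (tens of minutes),
# y is 1 when the traveler chose transit.
xs = [(-2.0,), (-1.5,), (-1.0,), (-0.5,), (0.0,), (0.5,), (1.0,), (1.5,), (2.0,)]
y = [1, 1, 1, 0, 1, 0, 1, 0, 0]
beta = fit_logit(xs, y)
print(beta, predict(beta, (-2.0,)), predict(beta, (2.0,)))
```

In this invented data set the transit choice becomes less likely as transit's travel-time disadvantage grows, so the fitted coefficient on the time difference is negative.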
Estimating a Model
• To estimate one of these econometric models, you choose the dependent variable, one or more independent variables, and the set of records on which the model should be estimated.
• Models can be estimated on any map layer or dataview. You can choose to estimate the model parameters using all of the records in the layer or dataview, or only those records in a selection set.
• All of the model estimation procedures produce two output files: a formatted report of the estimation results including all goodness-of-fit and importance measures, and a file containing the values of the estimated model parameters.
Evaluating a Model
• The two statistical model estimation routines produce a text file, called the model file, with information that can be used to evaluate the model on a data set having similar fields. To use the evaluation procedure, you choose the subset of records to be evaluated and the field that will receive the predicted dependent variable. You can also choose the fields that contain the values of the independent variables, and make adjustments to the model parameters.
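The evaluation step can be pictured as applying stored parameters to a chosen subset of records and writing the result into an output field. In the sketch below the `model` dictionary is a purely hypothetical stand-in for the model file (its real format is not described here), and the field names and the `evaluate_linear` helper are likewise assumptions.

```python
def evaluate_linear(model, records, output_field, selection=None):
    """Write the predicted dependent variable into output_field.

    selection is an optional set of record indices; None means all records.
    """
    for i, rec in enumerate(records):
        if selection is not None and i not in selection:
            continue
        rec[output_field] = model["intercept"] + sum(
            coef * rec[name] for name, coef in model["coefficients"].items()
        )
    return records

# Hypothetical stand-in for the contents of a model file.
model = {"intercept": 1.0, "coefficients": {"hh_size": 2.0, "autos": 0.5}}
records = [
    {"hh_size": 3, "autos": 1},
    {"hh_size": 4, "autos": 2},
]
# Evaluate only the first record, as if it were the selection set.
evaluate_linear(model, records, "pred_trips", selection={0})
print(records)
```

Adjusting a model parameter before evaluation corresponds to editing a value in `model["coefficients"]` in this sketch.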
Spatial Statistics
• When the observations in the sample are areas that make up a region, there is a tendency for adjacent areas to have correlated values. This happens because the distribution of area attributes (such as income or household size) tends to vary relatively slowly across the region. This means that you are likely to find higher values in areas that are near other areas with high values, and vice versa. This effect is known as spatial autocorrelation.
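One common way to quantify spatial autocorrelation is Moran's I, used here as an illustrative choice since the text above does not name a specific statistic. The sketch assumes binary adjacency weights (1 for neighboring areas, 0 otherwise) and a hypothetical row of four areas.

```python
def morans_i(values, neighbors):
    """Moran's I with binary adjacency weights.

    neighbors maps each area index to the list of indices adjacent to it.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    # Sum of products of deviations over every ordered pair of adjacent areas.
    cross = sum(dev[i] * dev[j] for i, adj in neighbors.items() for j in adj)
    weight_total = sum(len(adj) for adj in neighbors.values())
    return (n / weight_total) * cross / sum(d * d for d in dev)

# Four hypothetical areas in a row: 0 - 1 - 2 - 3.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
smooth = morans_i([10, 12, 14, 16], neighbors)       # values change gradually
alternating = morans_i([10, 16, 10, 16], neighbors)  # neighbors differ sharply
print(smooth, alternating)
```

Values that change gradually from one area to the next give a positive statistic, while values that alternate between neighbors give a negative one.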