Project 2: Due Date
March 31, 2008 [15-page limit]
Discriminant analysis is used to statistically distinguish
between two or more groups of cases by weighting and linearly combining the
discriminating variables in some fashion so that the groups are forced to be as
statistically distinct as possible. This project is based on the 2003 crash
dataset and involves the use of discriminant analysis to answer the
following question:
Based on the available data, can driver 1s be differentiated by age based on
the characteristics of the crashes in which they were involved? The term
"characteristics of the crashes" is meant to include all of the data in the
record.
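To make "weighting and linearly combining the discriminating variables" concrete, here is a minimal sketch of a two-group Fisher linear discriminant written with the Python standard library only. It is an illustration under stated assumptions, not the required method for the project: the two groups, the two variables (a 0/1 night-crash indicator and a count of vehicles involved), and every data value are invented, and the real analysis would involve more variables, possibly more than two groups, and a statistical package.

```python
# Toy two-group Fisher linear discriminant (standard library only).
# ASSUMPTION: the groups, variables, and values below are invented for
# illustration; they do not come from the 2003 crash dataset.

def mean_vector(rows):
    """Column means of a list of 2-variable cases."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]

def scatter_matrix(rows, mean):
    """2x2 sum of outer products of deviations from the group mean."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - mean[0], r[1] - mean[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def fisher_weights(group_a, group_b):
    """Weights w = Sw^-1 (mean_a - mean_b), Sw = pooled within-group scatter."""
    ma, mb = mean_vector(group_a), mean_vector(group_b)
    sa, sb = scatter_matrix(group_a, ma), scatter_matrix(group_b, mb)
    sw = [[sa[i][j] + sb[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [ma[0] - mb[0], ma[1] - mb[1]]
    return [inv[0][0] * diff[0] + inv[0][1] * diff[1],
            inv[1][0] * diff[0] + inv[1][1] * diff[1]]

def score(weights, row):
    """Discriminant score: the weighted linear combination of the variables."""
    return weights[0] * row[0] + weights[1] * row[1]

# Each case: (night_crash as 0/1, number of vehicles involved) -- hypothetical.
younger = [(1, 3), (1, 2), (0, 3), (1, 4)]
older = [(0, 1), (0, 0), (1, 1), (0, 2)]

weights = fisher_weights(younger, older)
younger_scores = [score(weights, r) for r in younger]
older_scores = [score(weights, r) for r in older]

print("weights:", weights)
print("younger scores:", younger_scores)
print("older scores:", older_scores)
```

In this toy data the scores of the two groups do not overlap, which is the sense in which the weights make the groups "as statistically distinct as possible". With real crash records the groups would typically overlap, and the classification results would show how much.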
The following tasks should be part of the activities that
you execute to answer the above question:
- Compare the crash rate for different age groups [less
than 20 years, 21-25, 26-30, ..., more than 90 years]
- Based on the above comparison, define the age groups
used in the study
- Recode nominal crash characteristics (variables) into
binary or interval format
- Use cross tabulation and correlation statistics, or
alternatively use one-way ANOVA, to examine the relationship between different
crash characteristics (variables) and different age groups
- Based on the cross tabulation (or ANOVA) results, choose
the appropriate crash characteristics (variables) to be included in the study
- Examine the correlation (interdependency) among
candidate variables
- Use the correlation tables to determine the variables to
be used in the study
- Discuss important features in discriminant analysis and
determine which features you will use in the study [Direct vs. Stepwise,
Bayesian Prior Probability, number of functions, etc.]
- Conduct discriminant analysis using SAS, SPSS, or any
other statistical software
- Present and discuss the output of the discriminant
analysis. Examine the within-group correlation and the possible use of
sequential models, modifying the age groups, or modifying the prior
probability to improve the discriminating power of the resulting functions
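The preparatory tasks above (grouping ages, recoding a nominal variable into binary form, and cross-tabulating it against age group) can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions: the field names `age` and `light`, the records, and the age cut points are all hypothetical, since the actual variable list is in the downloadable file. The sketch counts crashes per group; a true crash rate would also need exposure data (for example, licensed drivers per age group), which is not shown here.

```python
from collections import Counter

# ASSUMPTION: invented records and field names, not the 2003 crash dataset.
records = [
    {"age": 18, "light": "dark"},
    {"age": 19, "light": "dark"},
    {"age": 23, "light": "daylight"},
    {"age": 24, "light": "dark"},
    {"age": 37, "light": "daylight"},
    {"age": 52, "light": "daylight"},
    {"age": 68, "light": "daylight"},
    {"age": 71, "light": "dark"},
]

def age_group(age):
    """Assign an age to a coarse group (cut points are illustrative only)."""
    if age <= 20:
        return "20 or younger"
    if age <= 25:
        return "21-25"
    if age <= 64:
        return "26-64"
    return "65 or older"

def recode_binary(value, target):
    """Recode a nominal value into a 0/1 indicator for one category."""
    return 1 if value == target else 0

def crosstab(rows, row_key, col_key):
    """Count cases for every (row category, column category) pair."""
    return Counter((r[row_key], r[col_key]) for r in rows)

def chi_square(table):
    """Pearson chi-square statistic for a cross tabulation."""
    row_totals, col_totals = Counter(), Counter()
    for (r, c), n in table.items():
        row_totals[r] += n
        col_totals[c] += n
    total = sum(table.values())
    stat = 0.0
    for r in row_totals:
        for c in col_totals:
            expected = row_totals[r] * col_totals[c] / total
            stat += (table.get((r, c), 0) - expected) ** 2 / expected
    return stat

prepared = [
    {"group": age_group(r["age"]), "dark": recode_binary(r["light"], "dark")}
    for r in records
]
crashes_per_group = Counter(p["group"] for p in prepared)
table = crosstab(prepared, "group", "dark")
stat = chi_square(table)

print(crashes_per_group)
print(table)
# Expected counts here are far too small for a valid test; the statistic
# is computed only to show the mechanics.
print("chi-square:", stat)
```

In SAS or SPSS the same steps correspond to recoding, frequency, and cross-tabulation procedures; the point of the sketch is only to show what each task produces before the discriminant analysis is run.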
Download the data set: text format, SPSS format
Download the list of variables for the working file