We will cover:

Types of Statistics
Descriptive & Graphical
Inferential
Variable Relationships
Modeling a Response
Factor Analysis
Nonparametric Methods

Data Types

Generally speaking, the type of data determines which statistical techniques are appropriate, so a basic understanding of data types helps in choosing a statistical procedure. In SPSS, each column holds a variable and each row holds a case. There are two major types of data:

• Qualitative variables: The data values are non-numeric categories. Examples: blood type, gender.

• Quantitative variables: The data values are counts or numerical measurements. A quantitative variable can be either discrete, such as the number of students receiving an 'A' in a class, or continuous, such as GPA or salary.

Another way of classifying data is by measurement scale. Four measurement scales are in general use:

• Nominal data: Data values are group labels with no inherent order. Numeric codes may be assigned, but they serve only as labels. For example, a Gender variable can be coded as male = 0 and female = 1.

• Ordinal data (sometimes called 'Discrete Data'): Data values are categorical and can be ranked in a meaningful order. For example, responses from strongly disagree to strongly agree may be coded as 1 to 5.

• Continuous data:

    • Interval data: Data values lie in a real interval, which can extend from negative infinity to positive infinity. The difference between two values is meaningful, but the ratio of two values is not. Examples: temperature, IQ. A statement such as "today is 1.2 times hotter than yesterday" is neither useful nor meaningful.

    • Ratio data: Both the difference and the ratio of two values are meaningful. Examples: salary, weight.
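The contrast between interval and ratio data can be checked with a little arithmetic. The sketch below uses made-up temperatures and weights (all values are hypothetical) to show that a ratio of temperatures changes when the unit changes, while a ratio of weights does not.

```python
def celsius_to_fahrenheit(celsius):
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


def kg_to_lb(kilograms):
    """Convert a weight from kilograms to pounds."""
    return kilograms * 2.20462


# Hypothetical temperatures (interval scale: the zero point is arbitrary).
today_c, yesterday_c = 24.0, 20.0
ratio_c = today_c / yesterday_c  # 1.2 in Celsius
ratio_f = celsius_to_fahrenheit(today_c) / celsius_to_fahrenheit(yesterday_c)

# Hypothetical weights (ratio scale: zero means "no weight").
weight_a_kg, weight_b_kg = 90.0, 60.0
ratio_kg = weight_a_kg / weight_b_kg
ratio_lb = kg_to_lb(weight_a_kg) / kg_to_lb(weight_b_kg)

print(round(ratio_c, 3), round(ratio_f, 3))    # the two ratios disagree
print(round(ratio_kg, 3), round(ratio_lb, 3))  # the two ratios agree
```

The "1.2 times hotter" claim holds only in Celsius (in Fahrenheit the same two days give a ratio of about 1.106), which is why ratios of interval data are not meaningful.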

NOTE: The statistical procedures mentioned below are demonstrated using movie clips in the Statistical Procedures Page.

In this on-line workshop, you will find many movie clips. Each movie clip will demonstrate some specific usage of SPSS.

Statistics can be divided into these main areas:

• Descriptive statistics:
    • Summary statistics:
        • mean
        • median
        • standard deviation
        • percentile
        • frequency
    • Summary graphic tools:
        • pie charts
        • histograms
        • boxplots
        • scatterplots
• Inferential statistics: used to make comparisons between two or more groups or study relationships
    • Estimation
    • Confidence interval
    • Hypothesis testing
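For readers who want to see the summary statistics listed above outside of SPSS, here is a minimal sketch using only Python's standard library. The exam scores are hypothetical, and percentile definitions differ between packages, so the quartiles may not match SPSS output exactly.

```python
import statistics
from collections import Counter

# Hypothetical exam scores for ten students.
scores = [72, 85, 85, 90, 64, 78, 85, 90, 70, 81]

mean_score = statistics.mean(scores)           # arithmetic mean
median_score = statistics.median(scores)       # middle value of the sorted data
sd_score = statistics.stdev(scores)            # sample standard deviation
quartiles = statistics.quantiles(scores, n=4)  # three cut points: Q1, Q2, Q3
frequency = Counter(scores)                    # how often each score occurs

print("mean:", mean_score)
print("median:", median_score)
print("standard deviation:", round(sd_score, 2))
print("quartiles:", quartiles)
print("most common scores:", frequency.most_common(2))
```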

Descriptive and Graphical Analysis

• For nominal data: Frequency, Crosstabs, bar charts, and pie charts are common tools.

• For ordinal data: Frequency, Crosstabs, descriptive statistics, bar charts, pie charts, and stem-and-leaf plots are common tools.

• For continuous data: Descriptive statistics, histograms, and boxplots are common tools, along with scatterplots for two variables.

 

Inferential Analysis

If you are interested in comparing group effects:

• For nominal or ordinal data: Use crosstabs.

• For continuous data:

    • First, check whether the variable is normally distributed. To check normality, go to 'Analyze', then 'Descriptive Statistics', then choose the 'Explore' procedure.

    • Second, if you are comparing two or more groups, check the homogeneity of variances among the groups. The 'Explore' procedure can be used for this as well.

    • To compare two groups, use the independent t-test.

    • To compare three or more groups, use one-way analysis of variance.

    • For two or more factors, use multiway analysis of variance.

    • If there are both factors and covariates, use analysis of covariance.

    • If the same subject is measured more than once, it is a repeated measures problem.
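To make the two-group comparison concrete, the sketch below computes the equal-variance (pooled) independent t statistic by hand. The function name `pooled_t_statistic` and the two small groups are hypothetical. The pooled form relies on the homogeneity of variances mentioned above, so the sketch also reports the ratio of the two sample variances as a rough check.

```python
import math
import statistics


def pooled_t_statistic(group_a, group_b):
    """Return (t, degrees of freedom) for the equal-variance two-sample t-test."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance of group A
    var_b = statistics.variance(group_b)  # sample variance of group B
    df = n_a + n_b - 2
    # Weighted average of the two variances, valid if the variances are similar.
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / std_err
    return t, df


# Hypothetical measurements for two independent groups.
method_a = [5, 7, 9]
method_b = [1, 3, 5]

variance_ratio = statistics.variance(method_a) / statistics.variance(method_b)
t_stat, df = pooled_t_statistic(method_a, method_b)

print("variance ratio:", variance_ratio)
print("t =", round(t_stat, 3), "with", df, "degrees of freedom")
```

Turning the statistic into a p-value requires the t distribution with the stated degrees of freedom, which SPSS reports directly. The sketch stops at the statistic to stay within the standard library.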

If you are interested in the relationship between two variables:

• For nominal data, use crosstabs and choose the tests appropriate for nominal data.

• For ordinal data, use crosstabs or a bivariate correlation such as the Spearman correlation coefficient.

• For continuous data, use a bivariate correlation such as the Pearson correlation.
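The difference between the two correlation coefficients can be illustrated with a short sketch. It assumes the usual definitions: Pearson measures linear association, and Spearman is the Pearson correlation of the ranks, with tied values given their average rank. The function names and the data are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)


def average_ranks(values):
    """Rank values from 1 upward; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: y always increases with x, but not along a straight line.
hours = [1, 2, 3, 4, 5]
score = [1, 4, 9, 16, 25]

r_pearson = pearson(hours, score)
r_spearman = spearman(hours, score)
print("Pearson:", round(r_pearson, 3))
print("Spearman:", round(r_spearman, 3))
```

Because the relationship is perfectly monotone but curved, the Spearman coefficient equals 1 while the Pearson coefficient falls slightly below 1.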

If you are interested in modeling a response (also called the dependent variable) using predictor variables (also called independent variables):

• For nominal data, if the response is a binary variable (that is, it has only two possible values, such as graduating in four years or not), use a logistic regression model. If the response has more than two categories, use multinomial logistic regression.

• For ordinal (discrete) data, if the response is a count that follows a Poisson distribution, use a Poisson regression model. In general, one can use log-linear models for ordinal data.

• For a continuous response, check normality and homogeneity of variance, apply an appropriate data transformation as needed, and use a regression model.
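For the continuous-response case, a minimal sketch of a one-predictor least-squares regression is given below. The function name `fit_line` and the data are hypothetical, and SPSS's regression procedure reports far more than this. The residuals computed at the end are the quantities one would typically inspect when checking normality and homogeneity of variance.

```python
def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    slope = cross / ss_x
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical predictor and continuous response.
x_values = [1, 2, 3, 4, 5]
y_values = [2.1, 3.9, 6.2, 7.8, 10.0]

intercept, slope = fit_line(x_values, y_values)

# Residual = observed response minus the value predicted by the fitted line.
residuals = [y - (intercept + slope * x) for x, y in zip(x_values, y_values)]

print("intercept:", round(intercept, 2))
print("slope:", round(slope, 2))
print("residuals:", [round(r, 2) for r in residuals])
```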

If you are interested in reducing the dimension of the data, a typical procedure is Factor Analysis.

Factor analysis combines similar variables into a dimension that can be interpreted in terms of the qualitative aspects of the study. Many survey studies collect a large number of variables, and it is difficult to understand their overall meaning. Factor analysis helps by combining similar variables into the same dimension, which results in only a few dimensions (factors) that are meaningful for explaining the problem.

• For example, in the technology survey, we collected 16 variables related to the difficulties faced by faculty when using classroom technology. Using Factor Analysis, we were able to combine these 16 types of difficulty into four general groups of difficulty (factors). These are difficulties related to:
(1) Computer hardware and software,
(2) Media technology,
(3) Instructor's technology needs in the office, and
(4) Classroom instructional technology needs.

Nonparametric Methods are an alternative:

If the assumptions of the chosen statistical procedure are violated, many nonparametric procedures can perform a similar analysis while being less sensitive to those assumptions. The corresponding nonparametric procedures in SPSS include:

• Chi-square

• Two Independent Samples Comparison: The corresponding parametric procedure is the independent t-test.

• K Independent Samples Comparison: The corresponding parametric procedure is Analysis of Variance.

• Two Related Samples: The corresponding parametric procedure is the paired t-test.

• K Related Samples: The corresponding parametric procedure is the Repeated Measures Analysis.

To perform a nonparametric statistical procedure in SPSS, go to 'Analyze', go down to 'Nonparametric Tests', then select the appropriate nonparametric procedure.
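As an illustration of how a rank-based comparison of two independent samples works, the sketch below computes the Mann-Whitney U statistic, a test commonly used for this purpose. Its use here is an assumption, since the text does not name the specific test. The function name and the data are hypothetical.

```python
def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) by counting, over all pairs, which group's value is larger.

    Each pair where the value from group A is larger adds 1 to U_a;
    each tied pair adds 0.5. U_b is the remainder of the n_a * n_b pairs.
    """
    u_a = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    u_b = len(group_a) * len(group_b) - u_a
    return u_a, u_b


# Hypothetical measurements for two independent groups.
treatment = [12, 15, 18, 20]
control = [8, 9, 14, 16]

u_treatment, u_control = mann_whitney_u(treatment, control)
u_statistic = min(u_treatment, u_control)

print("U for treatment:", u_treatment)
print("U for control:", u_control)
print("test statistic U:", u_statistic)
```

Only the ordering of the values enters the calculation, not their distances from each other, which is why the procedure is less sensitive to the normality assumption than the independent t-test.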

SPSS offers many more statistical procedures, for example multivariate procedures, survival analysis, and discriminant analysis. We are not able to cover all of these procedures in this workshop. The bottom line is that when you have questions about your design and analysis, you should contact a statistical consultant for help.