Regression Statistical Procedure

In this Tutorial:

Classification techniques provide:

Unsupervised techniques for clustering cases (or variables) into a small number of groups, each having similar characteristics based on the variables (or cases).

Supervised techniques for classifying cases into one of a set of defined categories of a response variable of interest, using a set of independent variables (inputs). Under the Classification procedure, SPSS provides only traditional discriminant analysis as a supervised classification technique. Another common classification technique is logistic regression. Various other supervised classification techniques, such as tree modeling and neural networks, are covered in a separate module that users must purchase.

The following movie clip demonstrates how to conduct a cluster analysis using the hierarchical cluster technique.

MOVIE: Cluster Analysis

For Logistic Regression, one may refer to the Regression Modeling Page.

	Differences in measurement scale among variables can have a dramatic influence on the clusters. It is important to standardize the measurement scales in some way.
	Outlying cases often dominate some clusters. It is important to identify outliers and deal with them before clustering.
	Cluster analysis depends on the distance measure used for clustering. It is important to identify a proper distance measure for the problem of interest. SPSS has a set of defaults; if you do not know what would be appropriate, the default options are usually the more commonly used ones.
	Different criteria may suggest different final numbers of clusters. It is a good idea to use several selection criteria to help you choose the final number of clusters.
	The context behind the problem of study is an important consideration in the choice of clustering techniques and the criteria for selecting the number of clusters.
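
The first point above, the influence of measurement scale, can be illustrated with a minimal Python sketch (Python is used here only for illustration; it is not SPSS syntax). The income and age values are made-up data, and z-score standardization with Euclidean distance is assumed as one common choice among several.

```python
from math import dist            # Euclidean distance, Python 3.8+
from statistics import mean, stdev

def zscores(values):
    """Rescale values to mean 0 and sample standard deviation 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def nearest(points, i):
    """Index of the case closest to case i (Euclidean distance)."""
    others = [j for j in range(len(points)) if j != i]
    return min(others, key=lambda j: dist(points[i], points[j]))

# Hypothetical cases: income in dollars, age in years.
income = [30000.0, 30500.0, 33000.0, 90000.0]
age = [25.0, 60.0, 26.0, 61.0]

raw = list(zip(income, age))
standardized = list(zip(zscores(income), zscores(age)))

raw_nearest = nearest(raw, 0)           # income dominates the distance
std_nearest = nearest(standardized, 0)  # both variables contribute

print("nearest to case 0, raw data:         ", raw_nearest)
print("nearest to case 0, standardized data:", std_nearest)
```

On the raw data, case 0 is closest to case 1, which has a similar income but a very different age. After standardization it is closest to case 2, which is similar on both variables. A clustering procedure working from these distances would group the cases differently in the two situations.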

	Statistics- The agglomeration schedule shows the cases or clusters combined at each stage. The proximity matrix shows the distances or similarities between items. You can also display cluster membership by requesting a single solution or a range of solutions.
	Plots- You can request a dendrogram. By default, an icicle plot of all clusters is displayed; you can turn this off. You can also select the orientation of the plot.
	Method- Here you select the cluster method, select the type of measure, and choose whether to transform the data values. Transform Measures allows you to transform the distance measure values that are generated.
	Save- Allows you to save cluster membership, either for a single solution or for a range of solutions.
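
To make the agglomeration schedule and the saved cluster membership more concrete, here is a minimal Python sketch of agglomerative clustering. It is not SPSS's implementation: it assumes single linkage (nearest neighbor) with Euclidean distance purely for simplicity, numbers cases from 0 rather than 1, and uses made-up points. The function names are hypothetical.

```python
from math import dist  # Euclidean distance, Python 3.8+

def agglomeration_schedule(points):
    """Single-linkage clustering; returns (stage, cluster_a, cluster_b, distance) rows."""
    clusters = {i: [i] for i in range(len(points))}  # cluster id -> member cases
    schedule = []
    stage = 1
    while len(clusters) > 1:
        ids = sorted(clusters)
        best = None
        # Find the pair of clusters whose closest members are nearest to each other.
        for x in range(len(ids)):
            for y in range(x + 1, len(ids)):
                a, b = ids[x], ids[y]
                d = min(dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[2]:
                    best = (a, b, d)
        a, b, d = best
        clusters[a] = clusters[a] + clusters.pop(b)  # merged cluster keeps the lower id
        schedule.append((stage, a, b, d))
        stage += 1
    return schedule

def membership(n_cases, schedule, k):
    """Cluster label of each case in the k-cluster solution."""
    labels = list(range(n_cases))
    for _, a, b, _ in schedule[: n_cases - k]:  # replay the first n - k merges
        labels = [a if lab == b else lab for lab in labels]
    return labels

# Hypothetical data: five cases measured on two variables.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.5), (10.0, 10.0)]
schedule = agglomeration_schedule(points)

for stage, a, b, d in schedule:
    print(f"stage {stage}: combine {a} and {b} at distance {d:.2f}")

for k in (2, 3):  # a range of solutions
    print(f"{k} clusters:", membership(len(points), schedule, k))
```

Each row of the schedule corresponds to one stage: which two clusters were combined and at what distance. Replaying only the first few stages yields the membership for a chosen number of clusters, which is the idea behind saving a single solution or a range of solutions.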

SPSS On-Line Training Workshop

Classification Techniques

	Options- This is where you specify which variables are already standardized and which are to be standardized.
	Plots- Enables you to select plots.
	Output- Enables you to export the final model.

	Iterate- Allows you to select options for the iteration algorithm.
	Save- Allows you to save results from the analysis.
	Options- Allows you to select some results for display.
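
The Iterate option above refers to an iterative algorithm. Assuming this is a k-means style procedure (the page does not say so explicitly), the following minimal Python sketch shows what such iteration options typically control. The parameter names max_iter and tol, their values, and the data are illustrative assumptions, not SPSS settings.

```python
from math import dist  # Euclidean distance, Python 3.8+

def kmeans(points, centers, max_iter=10, tol=0.0):
    """Basic k-means. Returns (centers, labels, iterations_used)."""
    centers = [tuple(c) for c in centers]
    labels = []
    iteration = 0
    for iteration in range(1, max_iter + 1):
        # Assignment step: each case goes to its nearest center.
        labels = [min(range(len(centers)), key=lambda k: dist(p, centers[k]))
                  for p in points]
        # Update step: each center moves to the mean of its cases.
        new_centers = []
        for k, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(old)  # keep an empty cluster's center in place
        shift = max(dist(o, n) for o, n in zip(centers, new_centers))
        centers = new_centers
        if shift <= tol:  # convergence: no center moved more than tol
            break
    return centers, labels, iteration

# Hypothetical data and starting centers.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centers, labels, used = kmeans(points, centers=[(1.0, 1.0), (9.0, 8.0)])

print("final centers:", centers)
print("cluster membership:", labels)
print("iterations used:", used)
```

The maximum number of iterations caps the work done, while the convergence tolerance stops the algorithm early once the centers stop moving. The returned labels are the kind of result a Save option would write back to the data.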