SPSS On-Line Training Workshop


Carl Lee
Felix Famoye




Valid data analysis needs:

solid planning

valid data

correct analysis




General Considerations
There is no single best way to conduct a quantitative study. Each project has its own context, and without a proper understanding of that context the study will be purely empirical, and its results may fail to identify the root causes of the problem. It is therefore crucial to investigate the context thoroughly before planning and designing a quantitative study. Beyond the intended quantitative measurements, the aspects of context that need to be addressed may include, but are not limited to:


External environmental conditions


Background of the subjects


Possible factors associated with the intended measurement.


Common sense and logic

Two Examples that did not take into account the context behind the study


Example One: A study on children's mathematics ability concluded that the larger the foot size, the smarter the child, because the two are highly correlated.


Example Two: In a small city, the population size is predicted from the number of storks in the city, because a regression model suggests that the number of storks accurately predicts the population size over the years.

These two examples ignore the context behind the study. As a result, the conclusions go against common logic.

The problem with Example One is that the root cause of young children's mathematics ability is associated with their age. Foot size is also larger when children are older, so the two measurements rise together without one causing the other.

Example Two reverses the response and the predictor. What actually happened is that as the population increased, more houses were built, and the additional houses attracted more storks into the city.

These examples suggest that, without proper consideration of the context behind the study, an analysis can easily become 'Garbage In, Garbage Out'.
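The foot-size example can be illustrated with a small simulation. The sketch below is written in Python rather than SPSS, and every number in it is hypothetical: it assumes that age drives both foot size and test score, and that neither affects the other. Under those assumptions the raw correlation between foot size and score is strong, while the partial correlation that controls for age is close to zero.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical data: age drives both foot size and test score.
age = [random.uniform(6, 12) for _ in range(500)]
foot_cm = [12 + 1.0 * a + random.gauss(0, 1.0) for a in age]
score = [10 + 8.0 * a + random.gauss(0, 8.0) for a in age]

raw_r = pearson(foot_cm, score)
adj_r = partial_corr(foot_cm, score, age)
print(f"foot size vs. score:        r = {raw_r:.2f}")
print(f"same, controlling for age:  r = {adj_r:.2f}")
```

The point of the sketch is only that a high correlation can disappear once the lurking variable is taken into account; it is not a model of real children.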

Data Analysis

A valid data analysis starts with solid planning of the study.


For a survey study or an observational study where a controlled experiment cannot be conducted, one should begin by considering the appropriate measurement, the target population, the sampling technique, the sample size, the factors associated with the intended characteristics, the design of the questionnaire, and the ways of distributing and collecting the survey.


For a controlled experimental study, one should begin by considering the measurement, the potential confounding factors associated with the measurement, the intended factors for the experiment, the design of the experiment, experimental units, sample size, and possible statistical techniques for analysis based on the experimental design.


In many situations, a controlled experiment may not be possible, but a semi-experimental design may be. In these situations, the background and environmental factors are extremely important. For example, in studying the effect of different pedagogical approaches, one may not be able to randomize the assignment of subjects to each teaching method. It may happen that one class has much better students than the other class. The effectiveness of the teaching method is then confounded with the students' initial ability. If we collect information on potential confounding factors such as GPA, gender, and age, and conduct a pre-test, then a proper data analysis such as Analysis of Covariance or Repeated Measures Analysis can be performed to make a proper comparison.

A valid data analysis must have a valid set of data.


Once data are collected, the next step is to design a proper format for data entry. Data are often entered in such a way that they are not readable by statistical software. Although computer technology is very advanced, data values can only be either numeric or non-numeric. Numeric values can be quantified, while non-numeric values can, in most cases, only be summarized. It is important that proper data values be created so that statistical software can perform the analysis.


After data entry comes the data cleaning and manipulation stage. It can happen that some data points are entered completely out of range. A quick way of locating these out-of-range values is to run frequency or descriptive procedures and check the output to see whether any variable has such a problem.
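The frequency check described above can be sketched outside SPSS as well. The Python example below is only an illustration under stated assumptions: the responses are made up, and the valid codes are assumed to be 1 through 5. It tabulates each distinct value and flags any value that falls outside the assumed range.

```python
from collections import Counter

# Hypothetical survey responses on a 1-5 scale; 55 and 0 are keying errors.
responses = [3, 4, 5, 2, 55, 1, 4, 3, 0, 5, 4, 4]
VALID_RANGE = range(1, 6)  # assumed valid codes: 1 through 5

# Frequency table: one row per distinct value, flagging invalid codes.
freq = Counter(responses)
for value in sorted(freq):
    flag = "" if value in VALID_RANGE else "  <-- out of range"
    print(f"{value:>3}: {freq[value]}{flag}")

# Collect the offending values so they can be checked against the source forms.
out_of_range = sorted(v for v in freq if v not in VALID_RANGE)
print("out-of-range values:", out_of_range)
print("min =", min(responses), "max =", max(responses))
```

A frequency table suits variables with few distinct values; for a continuous variable, the minimum and maximum from a descriptive summary serve the same purpose.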


Data transformation is often needed before a valid analysis can be performed.

Appropriate statistical procedures are the key to a correct analysis.


Almost every statistical procedure has assumptions behind it, and it is necessary to consider carefully whether the data violate them. A minor violation usually does not create serious problems. However, if there is a serious violation, an appropriate data transformation or a different statistical procedure may be necessary.
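As one illustration of a transformation, the Python sketch below compares the skewness of a small, made-up set of right-skewed values before and after a natural-log transform. The data and the choice of the log transform are assumptions for the example only; whether a transformation is appropriate depends on the procedure and on the context of the project.

```python
import math

def skewness(values):
    """Skewness as the third standardized moment (population form)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical right-skewed measurements (e.g. incomes in thousands).
incomes = [18, 20, 22, 25, 27, 30, 34, 40, 55, 90, 250]
log_incomes = [math.log(v) for v in incomes]

skew_raw = skewness(incomes)
skew_log = skewness(log_incomes)
print(f"skewness before log transform: {skew_raw:.2f}")
print(f"skewness after log transform:  {skew_log:.2f}")
```

In this example the transform reduces the skewness without removing it, which is typical: a transformation may lessen a violation without fully satisfying the assumption.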


Appropriate statistical procedures are often determined by the type of data. Categorical data need to be analyzed using procedures developed for categorical data. Conversely, we do not use frequency analysis or crosstabulation procedures to analyze continuous data. A more detailed discussion is given in the Data Type and Possible Analysis Section.
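The distinction can be made concrete with a short Python sketch. The records below are hypothetical: two categorical variables (gender and pass/fail) are summarized with a crosstabulation of counts, while the continuous score is summarized with a mean and a standard deviation.

```python
from collections import Counter
from statistics import mean, stdev

# Hypothetical records: (gender, passed, score)
records = [
    ("F", "yes", 82.5), ("M", "no", 58.0), ("F", "yes", 91.0),
    ("M", "yes", 74.5), ("F", "no", 61.0), ("M", "yes", 79.0),
]

# Categorical variables: count cases in each combination of categories.
crosstab = Counter((gender, passed) for gender, passed, _ in records)
for gender in ("F", "M"):
    row = {p: crosstab[(gender, p)] for p in ("yes", "no")}
    print(gender, row)

# Continuous variable: summarize with mean and standard deviation instead.
scores = [score for _, _, score in records]
print(f"mean = {mean(scores):.2f}, sd = {stdev(scores):.2f}")
```

Tabulating the scores themselves would give a table in which nearly every value occurs once, which is why frequency procedures are not informative for continuous data.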


In data analysis, one often needs to conduct several analyses before an appropriate one is selected. The analysis is never a one-step process; it involves many rounds of analysis and decision-making before a proper analysis is reached.

Appropriate analysis needs correct interpretation of the results.


Interpreting and summarizing the results from a huge pile of output is a crucial step in a valid data analysis. It requires an understanding of the project and of the statistical techniques, and the ability to bring the numbers into the context of the project.


One must make sure that the output is properly interpreted and summarized to a degree that non-statisticians can understand it.

Data Types and Analysis

Generally speaking, statistical techniques are determined by the type of data. The Data Type and Analysis page provides some details regarding different types of data and possible statistical techniques for analysis.

Bottom line is

If you are not familiar with any step described above, seek statistical consulting help.    


This online SPSS Training Workshop is developed by Dr Carl Lee, Dr Felix Famoye, student assistants Barbara Shelden and Albert Brown, Department of Mathematics, Central Michigan University. All rights reserved.