SPSS Shapiro-Wilk Test - The Ultimate Guide (2024)

  • Shapiro-Wilk Test - What is It?
  • Shapiro-Wilk Test - Null Hypothesis
  • Running the Shapiro-Wilk Test in SPSS
  • Shapiro-Wilk Test - Interpretation
  • Reporting a Shapiro-Wilk Test in APA style

Shapiro-Wilk Test - What is It?

The Shapiro-Wilk test examines if a variable is normally distributed in some population.
As such, the Shapiro-Wilk test serves the exact same purpose as the Kolmogorov-Smirnov test. Some statisticians claim the latter is worse due to its lower statistical power. Others disagree.

As an example of a Shapiro-Wilk test, let's say a scientist claims that the reaction times of all people -a population- on some task are normally distributed. He draws a random sample of N = 233 people and measures their reaction times. A histogram of the results is shown below.

SPSS Shapiro-Wilk Test - The Ultimate Guide (1)

This frequency distribution seems somewhat bimodal. Other than that, it looks reasonably -but not exactly- normal. However, sample outcomes usually differ from their population counterparts. The big question is: how likely is the observed distribution if the reaction times are exactly normally distributed in the entire population?
The Shapiro-Wilk test answers precisely that.

How Does the Shapiro-Wilk Test Work?

A technically correct explanation is given on this Wikipedia page. However, a simpler -but not technically correct- explanation is this: the Shapiro-Wilk test first quantifies the similarity between the observed and normal distributions as a single number: it superimposes a normal curve over the observed distribution as shown below. It then computes which percentage of our sample overlaps with it: a similarity percentage.

SPSS Shapiro-Wilk Test - The Ultimate Guide (2)

Finally, the Shapiro-Wilk test computes the probability of finding this observed -or a smaller- similarity percentage. It does so under the assumption that the population distribution is exactly normal: the null hypothesis.
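
The idea of "similarity as a single number, then a probability under the null hypothesis" can be sketched in Python. This is a minimal sketch under stated assumptions, not what SPSS computes: it uses the closely related Shapiro-Francia statistic W' (the squared correlation between the sorted sample and approximate normal scores from Blom's formula) instead of the exact Shapiro-Wilk W, and it gets p by simulation instead of a formula. The function names and the made-up reaction times are hypothetical.

```python
import random
from statistics import NormalDist, mean

def normal_scores(n):
    """Approximate expected standard-normal order statistics (Blom's formula)."""
    return [NormalDist().inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]

def squared_correlation(xs, ys):
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy ** 2 / (sxx * syy)

def w_prime(values, scores):
    """Shapiro-Francia W': close to 1 when sorted data line up with normal scores."""
    return squared_correlation(sorted(values), scores)

def simulated_p_value(values, n_sims=2000, seed=0):
    """Return W' and the share of exactly normal samples with an equal or lower W'."""
    rng = random.Random(seed)
    n = len(values)
    scores = normal_scores(n)
    observed = w_prime(values, scores)
    as_low = sum(
        w_prime([rng.gauss(0, 1) for _ in range(n)], scores) <= observed
        for _ in range(n_sims)
    )
    # Add-one correction so that a simulated p-value is never exactly zero.
    return observed, (as_low + 1) / (n_sims + 1)

rng = random.Random(1)
normal_sample = [rng.gauss(500, 100) for _ in range(100)]       # made-up reaction times
skewed_sample = [rng.expovariate(1 / 500) for _ in range(100)]  # clearly non-normal

w_normal, p_normal = simulated_p_value(normal_sample)
w_skewed, p_skewed = simulated_p_value(skewed_sample)
print(f"normal sample: W' = {w_normal:.3f}, p = {p_normal:.3f}")
print(f"skewed sample: W' = {w_skewed:.3f}, p = {p_skewed:.3f}")
```

The skewed sample should get a visibly lower W' and a very small p, which is the pattern the "Sig." column in SPSS expresses for the real test.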

Shapiro-Wilk Test - Null Hypothesis

The null hypothesis for the Shapiro-Wilk test is that a variable is normally distributed in some population. A different way to say the same is that a variable’s values are a simple random sample from a normal distribution. As a rule of thumb, we reject the null hypothesis if p < 0.05. So in this case we conclude that our variable is not normally distributed.
Why? Well, p is basically the probability of finding data that deviate at least this much from normality if the null hypothesis is true. If this probability is (very) small -but we found our data anyway- then the null hypothesis was probably wrong.

Shapiro-Wilk Test - SPSS Example Data

A sample of N = 236 people completed a number of speedtasks. Their reaction times are in speedtasks.sav, partly shown below. We'll only use the first five trials in variables r01 through r05.

SPSS Shapiro-Wilk Test - The Ultimate Guide (3)

I recommend you always thoroughly inspect all variables you'd like to analyze. Since our reaction times in milliseconds are quantitative variables, we'll run some quick histograms over them. I prefer doing so from the short syntax below. Easier -but slower- methods are covered in Creating Histograms in SPSS.

*Quick histograms with normal curves as data check.

frequencies r01 to r05
/format notable
/histogram normal.

Results

Note that some of the 5 histograms look messed up. Some data seem corrupted and had better not be seriously analyzed. An exception is trial 4 (shown below) which looks plausible -even reasonably normally distributed.

SPSS Shapiro-Wilk Test - The Ultimate Guide (4)

Descriptive Statistics - Skewness & Kurtosis

If you're reading this to complete some assignment, you're probably asked to report some descriptive statistics for some variables. These often include the median, standard deviation, skewness and kurtosis. Why? Well, for a normal distribution,

  • skewness = 0: it's absolutely symmetrical and
  • kurtosis = 0 too (SPSS reports excess kurtosis, which is 0 rather than 3 for a normal distribution): it's neither peaked (“leptokurtic”) nor flattened (“platykurtic”).

So if we sample many values from such a distribution, the resulting variable should have both skewness and kurtosis close to zero. You can get such statistics from FREQUENCIES but I prefer using MEANS: it results in the best table format and its syntax is short and simple.

*Descriptives table.

means r01 to r05
/cells count mean median stddev skew kurt.

*Optionally: transpose table (requires SPSS 22 or higher).

output modify
/select tables
/if instances = last /*process last table in output, whatever it is...
/table transpose = yes.

Results

SPSS Shapiro-Wilk Test - The Ultimate Guide (5)

Trials 2, 3 and 5 all have a huge skewness and/or kurtosis. This suggests that they are not normally distributed in the entire population. Skewness and kurtosis are closer to zero for trials 1 and 4.
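
For readers who want to see what these two statistics measure, here is a minimal Python sketch. It uses the bias-adjusted sample formulas that most statistical packages report; that these match the SPSS table exactly is an assumption. The function names and the simulated data are hypothetical and are not taken from speedtasks.sav.

```python
import random
from statistics import mean, stdev

def sample_skewness(values):
    """Bias-adjusted sample skewness; 0 is expected for normal data."""
    n = len(values)
    m, s = mean(values), stdev(values)
    return n / ((n - 1) * (n - 2)) * sum(((x - m) / s) ** 3 for x in values)

def sample_excess_kurtosis(values):
    """Bias-adjusted sample excess kurtosis; 0 is expected for normal data."""
    n = len(values)
    m, s = mean(values), stdev(values)
    fourth = sum(((x - m) / s) ** 4 for x in values)
    return (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * fourth
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))

rng = random.Random(42)
normal_like = [rng.gauss(500, 100) for _ in range(1000)]       # simulated, roughly normal
right_skewed = [rng.expovariate(1 / 500) for _ in range(1000)]  # simulated, long right tail

for label, data in (("normal-like", normal_like), ("right-skewed", right_skewed)):
    print(label, round(sample_skewness(data), 2), round(sample_excess_kurtosis(data), 2))
```

Both numbers should land near zero for the normal-like data and well above zero for the right-skewed data.
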
So now that we've a basic idea what our data look like, let's proceed with the actual test.

Running the Shapiro-Wilk Test in SPSS

The screenshots below guide you through running a Shapiro-Wilk test correctly in SPSS. We'll add the resulting syntax as well.

SPSS Shapiro-Wilk Test - The Ultimate Guide (6) SPSS Shapiro-Wilk Test - The Ultimate Guide (7)

Following these screenshots results in the syntax below.

*Shapiro-Wilk test pasted from Analyze - Descriptive Statistics - Explore...

EXAMINE VARIABLES=r01 r02 r03 r04 r05
/PLOT BOXPLOT NPPLOT
/COMPARE GROUPS
/STATISTICS DESCRIPTIVES
/CINTERVAL 95
/MISSING PAIRWISE /*IMPORTANT!
/NOTOTAL.

Running this syntax creates a bunch of output. However, the one table we're looking for -“Tests of Normality”- is shown below.

Shapiro-Wilk Test - Interpretation

SPSS Shapiro-Wilk Test - The Ultimate Guide (8)

We reject the null hypotheses of normal population distributions for trials 1, 2, 3 and 5 at α = 0.05.
“Sig.” or p is the probability of finding the observed -or a larger- deviation from normality in our sample if the distribution is exactly normal in our population. If trial 1 is normally distributed in the population, there's a mere 0.01 -or 1%- chance of finding a deviation from normality at least as large as the one in these sample data. These values are unlikely to have been sampled from a normal distribution. So the population distribution probably wasn't normal after all.

We therefore reject this null hypothesis. Conclusion: trials 1, 2, 3 and 5 are probably not normally distributed in the population.

The only exception is trial 4: if this variable is normally distributed in the population, there's a 0.075 -or 7.5%- chance of finding the nonnormality observed in our data. That is, there's a reasonable chance that this nonnormality is solely due to sampling error. So for trial 4, we retain the null hypothesis of population normality because p > 0.05.
We can't tell for sure if the population distribution is normal. But given these data, we'll believe it. For now anyway.

Reporting a Shapiro-Wilk Test in APA style

For reporting a Shapiro-Wilk test in APA style, we include 3 numbers:

  • the test statistic W -mislabeled “Statistic” in SPSS;
  • its associated df -short for degrees of freedom- and
  • its significance level p -labeled “Sig.” in SPSS.

The screenshot shows how to put these numbers together for trial 1.

SPSS Shapiro-Wilk Test - The Ultimate Guide (9)
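
If you report many variables, a small helper can assemble the three numbers consistently. The Python sketch below is one possible formatting, not the article's own: it assumes the common APA convention of dropping the leading zero for statistics that cannot exceed 1 (which applies to both W and p) and of writing very small p-values as p < .001. The function names are hypothetical, and the W and df values in the example call are placeholders; only p = 0.01 for trial 1 comes from the text above.

```python
def drop_leading_zero(number, decimals):
    """Format a number and strip the leading zero, e.g. 0.98 -> '.98'."""
    text = f"{number:.{decimals}f}"
    return text[1:] if text.startswith("0.") else text

def apa_shapiro_wilk(w, df, p):
    """Combine W, df and p into a single APA-like string."""
    p_text = "p < .001" if p < 0.001 else f"p = {drop_leading_zero(p, 3)}"
    return f"W({df}) = {drop_leading_zero(w, 2)}, {p_text}"

# Placeholder W and df; read the real ones from the "Tests of Normality" table.
print(apa_shapiro_wilk(0.98, 236, 0.010))
```

If your style guide keeps the leading zeros, return the plain formatted number from `drop_leading_zero` instead.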

Limited Usefulness of Normality Tests

The Shapiro-Wilk and Kolmogorov-Smirnov tests both examine if a variable is normally distributed in some population. But why even bother? Well, that's because many statistical tests -including ANOVA, t-tests and regression- require the normality assumption: variables must be normally distributed in the population. However, the normality assumption is only needed for small sample sizes of -say- N ≤ 20 or so. For larger sample sizes, the sampling distribution of the mean is approximately normal, regardless of how values are distributed in the population. This phenomenon is known as the central limit theorem. And the consequence is that many test results are unaffected by even severe violations of normality.
So if sample sizes are reasonable, normality tests are often pointless. Sadly, few statistics instructors seem to be aware of this and still bother students with such tests. And that's why I wrote this tutorial anyway.
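
The central limit theorem is easy to check by simulation. The Python sketch below draws many samples from a strongly right-skewed (exponential) population and looks at how skewed the resulting sample means are. The population, the sample sizes of 5 and 100, and the function names are illustrative assumptions.

```python
import math
import random

def skewness(values):
    """Plain moment-based skewness; 0 for a symmetrical distribution."""
    n = len(values)
    m = sum(values) / n
    sd = math.sqrt(sum((x - m) ** 2 for x in values) / n)
    return sum(((x - m) / sd) ** 3 for x in values) / n

def skewness_of_sample_means(n, n_samples=4000, seed=7):
    """Skewness of the means of many samples of size n from an exponential population."""
    rng = random.Random(seed)
    means = [sum(rng.expovariate(1.0) for _ in range(n)) / n for _ in range(n_samples)]
    return skewness(means)

skew_n5 = skewness_of_sample_means(5)
skew_n100 = skewness_of_sample_means(100)
print(f"skewness of sample means: N = 5 -> {skew_n5:.2f}, N = 100 -> {skew_n100:.2f}")
```

The individual values have a skewness of about 2, yet the means of samples of 100 should be close to symmetrical, while the means of samples of 5 remain clearly skewed.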

Hey! But what if sample sizes are small, say N < 20 or so? Well, in that case, many tests do require normally distributed variables. However, normality tests typically have low power in small sample sizes. As a consequence, even substantial deviations from normality may not be statistically significant. So when you really need normality, normality tests are unlikely to detect that it's actually violated. Which renders them pretty useless.
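
The low power in small samples can be illustrated with a simulation. The Python sketch below rests on several assumptions: it uses the Shapiro-Francia statistic W' as a stand-in for the Shapiro-Wilk W, it takes its 5% critical value from simulated normal samples rather than from published tables, and it uses an exponential population as the "substantial deviation". All names are hypothetical.

```python
import random
from statistics import NormalDist

def w_prime(values, scores):
    """Squared correlation between the sorted sample and approximate normal scores."""
    ordered = sorted(values)
    n = len(ordered)
    mx, my = sum(ordered) / n, sum(scores) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(ordered, scores))
    sxx = sum((x - mx) ** 2 for x in ordered)
    syy = sum((y - my) ** 2 for y in scores)
    return sxy ** 2 / (sxx * syy)

def rejection_rate(n, draw, n_sims=2000, alpha=0.05, seed=3):
    """Share of samples of size n from `draw` that are flagged as non-normal."""
    rng = random.Random(seed)
    scores = [NormalDist().inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]
    # Critical value: the W' below which only about 5% of exactly normal samples fall.
    null_ws = sorted(
        w_prime([rng.gauss(0, 1) for _ in range(n)], scores) for _ in range(n_sims)
    )
    critical = null_ws[int(alpha * n_sims)]
    rejected = sum(
        w_prime([draw(rng) for _ in range(n)], scores) < critical for _ in range(n_sims)
    )
    return rejected / n_sims

def draw_exponential(rng):
    return rng.expovariate(1.0)

power_small = rejection_rate(10, draw_exponential)
power_large = rejection_rate(100, draw_exponential)
print(f"rejection rate at N = 10: {power_small:.2f}, at N = 100: {power_large:.2f}")
```

With N = 100 the non-normality should be detected almost every time. With N = 10 the same strongly skewed population should slip through in a large share of samples.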

Thanks for reading.

FAQs

How do you interpret the Shapiro-Wilk test in SPSS?

If the Sig. value of the Shapiro-Wilk test is greater than 0.05, the data do not deviate significantly from a normal distribution. If it is below 0.05, the data significantly deviate from a normal distribution.

What are the limitations of the Shapiro-Wilk test?

The Shapiro-Wilk test is not especially sensitive to outliers. There are normality tests that focus on outliers, by looking at a combination of skewness and kurtosis, but they are different.

How many observations are needed for the Shapiro-Wilk test?

StatsDirect requires a random sample of between 3 and 2,000 for the Shapiro-Wilk test, or between 5 and 5,000 for the Shapiro-Francia test. The omnibus chi-square test can be used with larger samples but requires a minimum of 8 observations.

How do you know if Shapiro-Wilk is significant?

If the test is non-significant (p > .05), it tells us that the distribution of the sample is not significantly different from a normal distribution. If, however, the test is significant (p < .05), then the distribution in question is significantly different from a normal distribution.

What does p < 0.05 mean in the Shapiro-Wilk test?

If the chosen alpha level is 0.05 and the p-value is less than 0.05, then the null hypothesis that the data are normally distributed is rejected. If the p-value is greater than 0.05, then the null hypothesis is not rejected.

What is the p-value for Shapiro-Wilk significance?

The Shapiro-Wilk test for normality is one of three general normality tests designed to detect all departures from normality. It is comparable in power to the other two tests. The test rejects the hypothesis of normality when the p-value is less than or equal to 0.05.

Is the Shapiro-Wilk test reliable?

The Shapiro-Wilk test is indeed often commended, but it can't tell you exactly how your data differ from a normal. Often unimportant differences are flagged by the test, because they do qualify as significant for large sample sizes, and the opposite problem can also bite you.

What sample size is needed for the Shapiro-Wilk test for normality?

The Shapiro-Wilk test is the more appropriate method for small sample sizes (< 50), although it can also handle larger samples, while the Kolmogorov-Smirnov test is used for n ≥ 50. For both tests, the null hypothesis states that the data are taken from a normally distributed population.

Does the Shapiro-Wilk test assume normality?

The Shapiro-Wilk (S-W) test (Shapiro and Wilk, 1965) is a formal test of the assumption of normality that is recommended for general use. Normality is the null hypothesis it tests, not a precondition for running it.

What is the minimum n for the Shapiro-Wilk test?

The Shapiro Wilk test applies at any sample size above n=2 (it can't work at n=1 or n=2).

What is a large sample size for Shapiro-Wilk?

The Shapiro-Wilk test is more appropriate for small sample sizes (< 50), but can also handle sample sizes as large as 2000. The normality tests are sensitive to sample sizes. I personally recommend Kolmogorov-Smirnov for sample sizes above 30 and Shapiro-Wilk for sample sizes below 30.

Is the Shapiro-Wilk test sensitive to sample size?

Shapiro-Wilk test efficiency depends on soybean sample size. A low number of sampled plants generates a bias in the error normality analysis. Methodologies for defining the optimal sample size presented distinct results. The perpendicular distances and linear plateau methods are recommended.

How do you check if data is normally distributed in SPSS?

Testing for normality
  1. Analyze.
  2. Descriptive Statistics.
  3. Explore…
  4. move the variable that you are checking for normality (in this case 'q6') into the Dependent List.
  5. click on the Plots… button.
  6. tick the Histogram box (keep the Stem-and-leaf box ticked as well)
  7. tick the Normality plots with tests option.
  8. click on Continue.

How do I know if my data is normally distributed?

A histogram is an effective way to tell if a frequency distribution appears to have a normal distribution. Plot a histogram and look at the shape of the bars. If the bars roughly follow a symmetrical bell or hill shape, then the distribution is approximately normal.

What if my data is not normally distributed?

If your data is not normal, you may try to transform or normalize it to make it more normal. Transformation is the process of applying a mathematical function to your data, such as log, square root, or inverse, to change its shape and reduce its skewness or outliers.
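
As a small illustration of such a transformation, the Python sketch below applies a log transform to simulated right-skewed data and compares the skewness before and after. The lognormal data are an assumption chosen so that the log transform works well; real data will not always respond this neatly. The names are hypothetical.

```python
import math
import random

def skewness(values):
    """Plain moment-based skewness; 0 for a symmetrical distribution."""
    n = len(values)
    m = sum(values) / n
    sd = math.sqrt(sum((x - m) ** 2 for x in values) / n)
    return sum(((x - m) / sd) ** 3 for x in values) / n

rng = random.Random(11)
raw = [rng.lognormvariate(6, 0.8) for _ in range(500)]  # simulated, long right tail
logged = [math.log(x) for x in raw]                     # log transform, needs values > 0

skew_raw = skewness(raw)
skew_logged = skewness(logged)
print(f"skewness before: {skew_raw:.2f}, after log transform: {skew_logged:.2f}")
```

Note that a log transform only works for strictly positive values, and results must then be interpreted on the log scale.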

Is Shapiro-Wilk too sensitive?

The Shapiro-Wilk test for non-normality is highly sensitive to the presence of ties due to grouping or rounding of the raw data, and should not be used if the grouping interval exceeds 0.1 standard deviation units.

What is the difference between the D'Agostino-Pearson and Shapiro-Wilk tests?

Unlike the D'Agostino-Pearson test, the Shapiro-Wilk test doesn't use the shape of the distribution to determine whether or not it is normal. Instead, it compares the actual SD of the data to the SD computed from the slope of the QQ plot for the data, and calculates their ratio.

Is the Shapiro-Wilk normality test parametric or nonparametric?

The Shapiro-Wilk test, which is a well-known nonparametric test for evaluating whether the observations deviate from the normal curve, yields a value equal to 0.894 (P < 0.001); thus, the hypothesis of normality is rejected.

Author: Kimberely Baumbach CPA