Research note: Multi-group comparison in partial least squares (PLS) path modelling

This blog post describes the considerations that guided my search for a method to compare two studies that use the same causal model. At the end, I show how I applied the non-parametric confidence set approach to compare two groups.

Last year, I published a study about students’ intentions to use wikis in higher education. In this study, 133 first-semester students were surveyed to test a causal model based on the Decomposed Theory of Planned Behaviour (DTPB). A while ago, I repeated the survey with 35 graduate students. The results of the second study differed from the hypothesised causal relationships, which had been supported in the first study, and did not match my expectations. For this reason, I searched for methods to investigate the differences between the two studies.

Moderation vs. mediation

First of all, I asked myself whether I wanted to examine mediation, moderation, or both at once. The article by Henseler and Fassott [12] was particularly helpful, as it provides a good overview of this topic. It became clear to me that I wanted to examine moderating effects through group comparison, using study experience as a categorical variable; group differences then show up as differences in parameter estimates, as is the case in my studies. In partial least squares (PLS) path modelling, this is referred to as multi-group comparison or multi-group analysis (often abbreviated as PLS-MGA).

Measurement invariance testing

A prerequisite for multi-group comparison is measurement invariance [9, 10], although “it is often assumed that the measurement invariance is given if you use the same items for the latent variables measurement in each group” (quoting a post from the SmartPLS forum). According to [9, 10], testing measurement invariance is fairly new in PLS and is still seldom seen in PLS research. I decided to rely on the assumption stated above and did not test for measurement invariance.

Nota bene: According to my literature review, [4] provides a promising method to examine measurement invariance, and [11] presents an example application.

Multi-group comparison

There are several ways to conduct a multi-group comparison of PLS models between two (!) groups (see [5] for an approach to compare more than two groups), namely
(1) the parametric approach, or Keil/Chin-approach [1, 2],
(2) the Smith-Satterthwaite test [3],
(3) the permutation-based approach [4],
(4) the non-parametric approach [3], or Henseler’s PLS multi-group analysis, and
(5) the non-parametric confidence set approach [5].

(1) The parametric approach
The approach requires normally distributed data, which “runs contrary to PLS path modelling’s distribution-free character” [5, p. 200]. The analysis is carried out by

  • running the PLS path modelling algorithm as well as the bootstrapping procedure on both groups,
  • obtaining the standard errors of the group-specific parameter estimates, and
  • computing the test statistic provided by [1], which applies when the parameter estimates’ standard deviations are equal (use Levene’s test to check whether they differ significantly). [6] provides an Excel file that can be used to calculate the test statistic; it is provided as an additional download to the book. A video illustrating this approach using PLS Graph can be found on YouTube.
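
As a rough illustration, the last step can be sketched in R. The sketch assumes the pooled form of the statistic as it is commonly reproduced from [1] (see also [5]), with m + n - 2 degrees of freedom. The function name, the argument names and the input values are hypothetical; only the group sizes 133 and 35 are taken from the two studies described above.

```r
# Sketch of the parametric test for the difference between two
# group-specific path coefficients (assumes equal standard deviations).
# path1, path2: path coefficients; se1, se2: bootstrap standard errors;
# n1, n2: group sample sizes. All names are hypothetical.
parametric_mga_t <- function(path1, path2, se1, se2, n1, n2) {
  df <- n1 + n2 - 2
  # pooled estimator of the standard error
  pooled <- sqrt(((n1 - 1)^2 / df) * se1^2 + ((n2 - 1)^2 / df) * se2^2)
  t_value <- (path1 - path2) / (pooled * sqrt(1 / n1 + 1 / n2))
  # two-sided p-value from the t distribution
  p_value <- 2 * pt(abs(t_value), df = df, lower.tail = FALSE)
  list(t = t_value, df = df, p = p_value)
}

# Made-up path coefficients and standard errors, for illustration only
result <- parametric_mga_t(path1 = 0.35, path2 = 0.20,
                           se1 = 0.08, se2 = 0.15, n1 = 133, n2 = 35)
print(result)
```

The function returns the t-value, the degrees of freedom and a two-sided p-value, so the same call can be repeated for every path in the model.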

(2) The Smith-Satterthwaite test
This test also requires normally distributed data, but can be used when the parameter estimates’ standard deviations are not equal. The test statistic provided by Chin [2] was not entirely correct and has been corrected by Nitzl [7] (in German; Nitzl’s equation is also reproduced in [5]).
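
A minimal sketch in R, assuming the usual unpooled form of the statistic (difference of the path coefficients divided by the square root of the summed squared standard errors). The degrees of freedom are deliberately passed in as an argument: the corrected equation should be taken from [5] or [7] and is not reproduced here. All names and input values are hypothetical.

```r
# Sketch of the test statistic for unequal standard deviations.
# path1, path2: path coefficients; se1, se2: bootstrap standard errors.
smith_satterthwaite_t <- function(path1, path2, se1, se2) {
  (path1 - path2) / sqrt(se1^2 + se2^2)
}

# Two-sided p-value for a given t-value; df has to be computed
# beforehand with the corrected equation from [5] or [7].
two_sided_p <- function(t_value, df) {
  2 * pt(abs(t_value), df = df, lower.tail = FALSE)
}

# Made-up inputs, for illustration only
t_value <- smith_satterthwaite_t(path1 = 0.35, path2 = 0.20,
                                 se1 = 0.08, se2 = 0.15)
print(t_value)
```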

(3) The permutation-based approach
The approach was developed by Chin in 2003 and is further described by Chin and Dibbern [4]. A detailed step-by-step guide can be found in [4] and in [5]. Although this approach does not rely on any distributional assumptions, it requires groups’ sample sizes to be fairly similar [5].
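
The general idea of a permutation test can be sketched in R as follows. This is not the procedure from [4]: to keep the example self-contained, a simple regression slope stands in for a PLS path coefficient, the data are simulated, and the p-value uses the common (count + 1) / (permutations + 1) convention, which is an assumption on my side. All function and variable names are hypothetical.

```r
# Generic permutation test for the difference of an estimate between
# two groups. 'estimate' is any function that maps a data frame to a
# single number (here a stand-in for re-estimating the PLS model).
permutation_difference_test <- function(data1, data2, estimate, n_perm = 1000) {
  observed <- estimate(data1) - estimate(data2)
  pooled <- rbind(data1, data2)
  n1 <- nrow(data1)
  permuted <- replicate(n_perm, {
    # randomly reassign observations to the two groups
    idx <- sample(nrow(pooled), n1)
    estimate(pooled[idx, , drop = FALSE]) - estimate(pooled[-idx, , drop = FALSE])
  })
  # share of permuted differences at least as extreme as the observed one
  p_value <- (sum(abs(permuted) >= abs(observed)) + 1) / (n_perm + 1)
  list(observed = observed, p = p_value)
}

# Simulated data with the same true slope in both groups
set.seed(123)
g1 <- data.frame(x = rnorm(60))
g1$y <- 0.5 * g1$x + rnorm(60)
g2 <- data.frame(x = rnorm(60))
g2$y <- 0.5 * g2$x + rnorm(60)

# Stand-in estimator: slope of a simple linear regression
slope <- function(d) unname(coef(lm(y ~ x, data = d))["x"])

perm_result <- permutation_difference_test(g1, g2, slope, n_perm = 500)
print(perm_result)
```

In an actual PLS analysis, the stand-in estimator would be replaced by a function that re-estimates the path model on the permuted group and returns the path coefficient of interest.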

(4) The non-parametric approach
This approach is similar to (1) with the difference that “bootstrap estimates are used to assess the robustness of group-specific parameter estimates” [5, p. 202]. As a consequence, this approach does not build on any distributional assumptions, but can only be used to test one-sided hypotheses. The procedure is described in detail in [8]. An Excel file implementing this approach can be obtained from the first author of [13] upon request, but is also available here (last visit: 2014-04-06).
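
A minimal sketch of the idea in R, assuming the statistic has the form reported in [13] and [5]: every bootstrap estimate of one group is compared with every bootstrap estimate of the other group after centring on the bootstrap means, and the share of comparisons that contradict the one-sided hypothesis is reported. The bootstrap estimates below are simulated and all names are hypothetical; the details should be checked against [8].

```r
# Probability that the group 1 parameter is NOT larger than the group 2
# parameter, based on all pairwise comparisons of bootstrap estimates.
# boot1, boot2: numeric vectors of bootstrap estimates of one path.
pls_mga_probability <- function(boot1, boot2) {
  # centre the bootstrap estimates on their means
  centred1 <- 2 * mean(boot1) - boot1
  centred2 <- 2 * mean(boot2) - boot2
  # outer() builds the matrix of all pairwise differences
  diffs <- outer(centred1, centred2, "-")
  1 - mean(diffs > 0)
}

# Simulated bootstrap estimates, for illustration only
set.seed(123)
boot_group1 <- rnorm(500, mean = 0.45, sd = 0.08)
boot_group2 <- rnorm(500, mean = 0.20, sd = 0.15)

prob <- pls_mga_probability(boot_group1, boot_group2)
print(prob)
```

A small value supports the one-sided hypothesis that the path coefficient is larger in group 1 than in group 2.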

(5) The non-parametric confidence set approach
This approach does not build on any distributional assumptions and can be used to test for significant difference between two groups. The procedure is detailed in [5]. It conceptually builds on [2], but uses the group-specific bootstrap intervals for comparison. The analysis is carried out by

  • running the PLS path modelling algorithm for both groups,
  • constructing an alpha-percent bootstrap confidence interval for both groups, and by
  • checking whether the parameter estimate of one group falls into the confidence interval of the other group and vice versa. If neither estimate falls into the other group’s interval, one can assume that the groups are significantly different.
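
The three steps can be sketched in R for a single path coefficient. As a simplification, the sketch uses plain percentile intervals computed with quantile(); whether and how the intervals are corrected should follow [5]. The bootstrap estimates are simulated and all names are hypothetical.

```r
# Percentile bootstrap confidence interval for one parameter
percentile_ci <- function(boot_estimates, level = 0.95) {
  alpha <- 1 - level
  unname(quantile(boot_estimates, probs = c(alpha / 2, 1 - alpha / 2)))
}

# TRUE if neither estimate falls into the other group's interval,
# i.e. if the two groups can be assumed to differ significantly
confidence_set_differs <- function(est1, boot1, est2, boot2, level = 0.95) {
  ci1 <- percentile_ci(boot1, level)
  ci2 <- percentile_ci(boot2, level)
  est1_in_ci2 <- est1 >= ci2[1] && est1 <= ci2[2]
  est2_in_ci1 <- est2 >= ci1[1] && est2 <= ci1[2]
  !(est1_in_ci2 || est2_in_ci1)
}

# Simulated bootstrap estimates, for illustration only
set.seed(123)
boot1 <- rnorm(2000, mean = 0.30, sd = 0.08)
boot2 <- rnorm(2000, mean = 0.32, sd = 0.15)

differs <- confidence_set_differs(est1 = 0.30, boot1 = boot1,
                                  est2 = 0.32, boot2 = boot2)
print(differs)
```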

I selected the non-parametric confidence set approach because it does not build on any distributional assumptions, unlike (1) and (2); allows for unequal group sample sizes, unlike (3); and can detect significant differences in either direction, unlike (4), which can only test one-sided hypotheses.

Multi-group comparison using SmartPLS and the R package semPLS

The non-parametric confidence set approach requires an alpha-percent bootstrap confidence interval, which cannot be calculated in SmartPLS. However, I found that the R package semPLS can read SmartPLS model files (ending with *.splsm) and can be used to calculate such a confidence interval. The R syntax is listed after the steps below.

Using the output of the summary function, you can examine whether the differences between the groups’ parameter estimates are significant. To do so, you have to

  1. calculate the corrected confidence interval for both groups,
  2. copy the last lines of the summary output that are labelled beta_x_y (use the output of the pathCoeff function to identify the corresponding relationships) for both groups,
  3. paste the output into an Excel spreadsheet, and
  4. check whether the path coefficients of one group fall within the corresponding interval (columns lower and upper) of the other group and vice versa. If a path coefficient falls within the other group’s confidence interval, one can assume that it is not significantly different from the other group’s coefficient.

# install the packages (comment these two lines out if they are already installed)
install.packages('semPLS')
install.packages('XML')
# load the semPLS package
library(semPLS)
# load the data of one group from a *.csv file (semicolon-separated)
Data <- read.csv(file.choose(), sep = ";")
# load the SmartPLS model (*.splsm) that is located in your SmartPLS workspace
Model <- read.splsm(file.choose(), order = "generic")
# run the PLS algorithm to estimate the model
LV <- sempls(model = Model, data = Data, wscheme = "pathWeighting")
# print the path coefficients of the fitted model
pathCoeff(LV)
# bootstrap the sempls object (fixed seed for reproducibility)
set.seed(123)
Bootstrap <- bootsempls(LV, nboot = 5000, start = "ones", verbose = TRUE)
# construct the 95% bootstrap confidence intervals (type = "perc" requests percentile intervals)
summary(Bootstrap, type = "perc", level = 0.95)

You will get a table that looks somewhat similar to mine (see Figure 1). In this example, the multi-group comparison shows no significant differences between the two groups, as nearly every path coefficient falls within the confidence interval of the other group (e.g., the path coefficient of the second group for the causal relationship FC –> PBC lies within the confidence interval of the first group: 0.1692 < 0.2000 < 0.5014).

Figure 1: PLS multi-group analysis
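
The comparison in the spreadsheet can also be done directly in R. The following sketch assumes that the estimates and interval bounds of both groups have been collected in two data frames with the hypothetical columns path, Estimate, Lower and Upper. Apart from the three FC -> PBC values quoted above (0.1692, 0.5014 and 0.2000), all numbers and the path "X -> Y" are made up for illustration.

```r
# Compare two tables of path coefficients and confidence intervals.
# A path is flagged as different only if neither estimate falls into
# the other group's interval.
compare_groups <- function(g1, g2) {
  both <- merge(g1, g2, by = "path", suffixes = c(".1", ".2"))
  est1_in_ci2 <- both$Estimate.1 >= both$Lower.2 & both$Estimate.1 <= both$Upper.2
  est2_in_ci1 <- both$Estimate.2 >= both$Lower.1 & both$Estimate.2 <= both$Upper.1
  data.frame(path = both$path,
             est1_in_ci2 = est1_in_ci2,
             est2_in_ci1 = est2_in_ci1,
             differs = !(est1_in_ci2 | est2_in_ci1))
}

# Group 1: the FC -> PBC interval bounds are taken from the text,
# everything else is made up
group1 <- data.frame(path = c("FC -> PBC", "X -> Y"),
                     Estimate = c(0.33, 0.70),
                     Lower = c(0.1692, 0.55),
                     Upper = c(0.5014, 0.85))
# Group 2: the FC -> PBC estimate is taken from the text,
# everything else is made up
group2 <- data.frame(path = c("FC -> PBC", "X -> Y"),
                     Estimate = c(0.2000, 0.10),
                     Lower = c(0.05, -0.15),
                     Upper = c(0.45, 0.35))

comparison <- compare_groups(group1, group2)
print(comparison)
```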

References

  1. Chin, W. W. (2000). Frequently asked questions: Partial least squares & PLS-Graph. Retrieved from http://disc-nt.cba.uh.edu/chin/plsfaq.htm
  2. Keil, M., Tan, B. C. Y., Wei, K.-K., Saarinen, T., Tuunainen, V., & Wassenaar, A. (2000). A cross-cultural study on escalation of commitment behavior in software projects. MIS Quarterly, 24(2), 299–325.
  3. Henseler, J. (2012). PLS-MGA: A non-parametric approach to partial least squares-based multi-group analysis. In W. A. Gaul, A. Geyer-Schulz, L. Schmidt-Thieme, & J. Kunze (Eds.), Challenges at the interface of data analysis, computer science, and optimization (pp. 495–501). Berlin: Springer. doi:10.1007/978-3-642-24466-7
  4. Chin, W. W., & Dibbern, J. (2010). An Introduction to a Permutation Based Procedure for Multi-Group PLS Analysis: Results of Tests of Differences on Simulated Data and a Cross Cultural Analysis of the Sourcing of Information System Services Between Germany and the USA. In V. Esposito Vinzi, W. W. Chin, J. Henseler, & H. Wang (Eds.), Handbook of partial least squares (pp. 171–193). Berlin: Springer. doi:10.1007/978-3-540-32827-8_8
  5. Sarstedt, M., Henseler, J., & Ringle, C. M. (2011). Multigroup analysis in partial least squares (PLS) path modeling: Alternative methods and empirical results. Advances in International Marketing, 22, 195–218. doi:10.1108/S1474-7979(2011)0000022012
  6. Hair, J. F., Hult, G. T. M., Ringle, C. M., & Sarstedt, M. (2014). A Primer on Partial Least Squares Structural Equation Modeling (PLS-SEM). Thousand Oaks: Sage.
  7. Nitzl, C. (2010). Eine anwenderorientierte Einführung in Partial Least Square (PLS)-Methode. Hamburg: Universität Hamburg, Institut für Industrielles Management. Retrieved from http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2097324
  8. Henseler, J. (2012). PLS-MGA: A non-parametric approach to partial least squares-based multi-group analysis. In W. A. Gaul, A. Geyer-Schulz, L. Schmidt-Thieme, & J. Kunze (Eds.), Challenges at the interface of data analysis, computer science, and optimization (pp. 495–501). Berlin: Springer. doi:10.1007/978-3-642-24466-7
  9. Chin, W. W., Mills, A. M., Steel, D. J., & Schwarz, A. (2012). Multi-group invariance testing: An illustrative comparison of PLS permutation and covariance-based SEM invariance analysis. 7th International Conference on Partial Least Squares and Related Methods, May 19-22, 2012, Houston. Retrieved from http://www.plsconference.com/Slides/PLS2012%20(Chin,%20Mills,%20Steel,%20Schwarz).pdf.
  10. Visinescu, L. (2012). PLS group comparison: A proposal for relaxing measurement invariance assumptions. 7th International Conference on Partial Least Squares and Related Methods, May 19-22, 2012, Houston. Retrieved from http://www.plsconference.com/Slides/PLS%20Group%20Comparison.pdf.
  11. Eberl, M. (2010). An application of PLS in multi-group analysis. In V. Esposito Vinzi, W. W. Chin, J. Henseler, & H. Wang (Eds.), Handbook of partial least squares (pp. 487–534). Berlin: Springer. doi:10.1007/978-3-540-32827-8_22
  12. Henseler, J., & Fassott, G. (2010). Testing moderation effects in PLS path models: An illustration of available procedures. In V. Esposito Vinzi, W. W. Chin, J. Henseler, & H. Wang (Eds.), Handbook of partial least squares (pp. 713–735). Berlin: Springer. doi:10.1007/978-3-540-32827-8_31
  13. Henseler, J., Ringle, C. M., & Sinkovics, R. R. (2009). The use of partial least squares path modeling in international marketing. Advances in International Marketing, 20, 277–319. doi:10.1108/S1474-7979(2009)0000020014

14 thoughts on “Research note: Multi-group comparison in partial least squares (PLS) path modelling”

  1. Dear Christian,
    I’m Chaminda and I’m doing my PhD in accounting.
    Thank you very much indeed for sharing your valuable thoughts about PLS analysis.
    I use SmartPLS in my data analysis. However, I’m not sure how to test control variables in the PLS model using SmartPLS. There are different explanations in various materials. Could you please help me to sort out this issue?
    Kind regards,
    Chaminda

  2. Hi Christian

    Thanks for this. I am looking at measurement invariance as per the two papers you kindly referenced. Do you know whether it is possible to adapt the R syntax provided to output weightings and loadings for comparison (as a measurement invariance test)?

    Cheers

    Tom

    1. Tom, I guess it will be possible to output weights and loadings with the semPLS library. In most cases, R packages are well documented. For this reason, I googled the semPLS package documentation. On p. 18 of the documentation you will find a description of the sempls object. The package provides two functions to retrieve further information: plsLoadings(object) and plsWeights(object). While I haven’t tested it, you should get the desired information when you add the commands plsLoadings(LV) and plsWeights(LV) after the call to sempls in the snippet provided above.

      Best, Christian.

  3. Hi Christian,

    Thanks for this great piece (and heads up on semPLS and R syntax) which is simple and to the point. BTW, I have used non-parametric confidence set approach for my analysis. It has come out well (mainly used for gender and age groups differences). Thanks again!

    (p.s. If you still happen to have Henseler’s spreadsheet, would you mind sharing it ? I have contacted Prof Henseler but so far no response from him)

    1. Dear Balaji, I still have the spreadsheet, but cannot give it to you as I am not the author of it. But I am quite sure that Prof. Henseler will come back to you by sending it.

      1. Dear Chris,

        No problem whatsoever, I will wait for the reply from Prof Henseler.

        Thanks for the prompt response.

  4. Hi Christian, thank you for sharing your thoughts about PLS-MGA. If I understand correctly, all the multi-group analyses compare the path coefficients of the two models.
    Is there any way to also compare the R-squared of the two models? I think it would be interesting to see if those differences are significant, too.
    Cheers and thank you

      1. Hi Christian, thank you for the fast reply. Honestly, the two models I want to test are exactly the same, only with different data. Therefore I am looking for a valid approach to test the differences in the R-squared of the two models. The MGA approach that is supplied in SmartPLS tests the differences of the parameter estimates (e.g., outer weights, outer loadings and path coefficients). So I think R-squared is not covered by that.
