Blog closed

This was my research blog while I wrote my dissertation. Although I finished my thesis in 2014, I added some material later on. The blog has been retired since 2018 and remains online only for future reference.

Rare finds in academic papers: funny, disturbing and unexpected

While working on my dissertation, I stumbled upon some rare academic papers that provide funny, disturbing or unexpected insights into a variety of topics.

Read on in case you want to know …

  • how to treat a “writer’s block”,
  • why Batman should use a parachute instead of his Batman cape,
  • how to deal with an idiot who suffered overconfidence,
  • why coffee spills when walking,
  • why it is likely that Winnie the Pooh has severe neurodevelopmental and psychosocial problems.

NB: This blog post acts as a little collection of somewhat unusual academic papers. If you want to add to this page, feel free to contribute using the comment section below.


Indicator-based monitoring of collaboration in MediaWiki

My main goal in writing this post is to make my research approach publicly available (#openresearch) and thus more comprehensible, even though it is not fully reproducible [1]. The post details my efforts to automate the evaluation of wiki work using collaboration indicators that were established with fuzzy set qualitative comparative analysis (fsQCA). To that end, I outline the gathering, preparation, and calculation of the data necessary to employ the collaboration indicators in MediaWiki by means of an extension. If you are looking for details on the construction of the collaboration indicators, you may consult the corresponding paper [2].

The post is organized as follows:

  • Motivation: Why did I want indicators for evaluating collaboration in MediaWiki?
  • During research: Retrieving data necessary to calculate collaboration indicators
  • A short note about the reliability and the explanatory power of the collaboration indicators
  • MediaWiki extension: Supporting the evaluation of wiki-based group collaboration based on collaboration indicators


Computing Co-Occurrence Matrices with Excel

This blog post was really helpful when I studied the co-occurrences of codings from my content analysis. In order to give something back to the community, I attach an example file to this reblog, containing data from my literature review (see section 4.4 on interdependencies; see also the corresponding blog post).

IR Thoughts

The QA column of the current issue of IR Watch – The Newsletter features the following question:

Question: In Excel, how do you convert a term-document occurrence matrix into a term-term or document-document co-occurrence matrix?

Answer:

Let A be a matrix populated with term occurrences (frequencies).
Let A^T be its transpose.

Then, T = A A^T is a term-term co-occurrence matrix, and D = A^T A is a document-document co-occurrence matrix.
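The same two products can be sketched outside Excel, for instance in R. The entries of A below are assumed for illustration only (terms in rows, documents in columns), and the variable names are hypothetical.

```r
# Term-document occurrence matrix A: rows are terms, columns are documents.
# The entries are assumed for illustration only.
A <- matrix(c(0, 1, 0,
              0, 0, 1,
              1, 1, 1),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("t1", "t2", "t3"), c("d1", "d2", "d3")))

# Term-term co-occurrence matrix: T = A A^T
term_term <- A %*% t(A)

# Document-document co-occurrence matrix: D = A^T A
doc_doc <- t(A) %*% A

print(term_term)
print(doc_doc)
```

Both results are symmetric, and their diagonals hold the sums of squared occurrences per term and per document, respectively.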

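The following table emulates an Excel spreadsheet (columns A to D, rows 1 to 13; cells are listed from left to right, empty cells are omitted).

Row 1:  A = | d1 | d2 | d3
Row 2:  t1 | 1
Row 3:  t2 | 1
Row 4:  t3 | 1 | 1 | 1
Row 5:  (empty)
Row 6:  T = A A^T | t1 | t2 | t3
Row 7:  t1 | 1 | 1
Row 8:  t2 | 1 | 1
Row 9:  t3 | 1 | 1 | 3
Row 10: (empty)
Row 11: D = A^T A | d1 | d2 | d3
Row 12: d1 | 1 | 1 | 1
Row 13: d2 | 1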


GeNeMe ’14: Do knowledge management schools exist?

Yesterday, I presented a paper at a local conference. While I covered my talk on the institutional blog, I provide the slides and the article with this blog post.

Download article

LS WIIM Blog

Yesterday, I was able to present a paper at GeNeMe ’14 in which we report on our efforts to identify schools of knowledge management. From the abstract:

The research field of knowledge management is characterized by multi-layered and often conflicting discussions. The debate spans different research areas and disciplines and is therefore limited in its cumulativeness, closes itself off from interdisciplinary, synergetic perspectives, and often remains stuck on a discussion of basic concepts, theories, models, and instruments. A first attempt to overcome these difficulties in the German-speaking knowledge management community was made through the publication and discussion of a knowledge management framework. To further advance the unification of the discussion, this paper classifies 53 publications to examine whether the seven proposed dimensions of the framework are consistent within and among themselves and whether they can serve as a basis for identifying schools of knowledge management. The results show that certain classifications can be mutually dependent and that clusters can be developed from the dimensions as precursors of distinguishable schools of knowledge management.


I finished my PhD thesis

Recently, I finished my doctoral thesis about wikis in higher education, which I have made available as a download on our library’s document server. After submitting my thesis in June 2013, I defended it on the 14th of March; the corresponding presentation is available on Slideshare, but in German only.

In the next weeks (or months), I will make further research findings from my thesis available to the public. For now, I plan to publish a

  • MediaWiki extension that can be used for monitoring students’ collaboration, and
  • tutorial on fuzzy set qualitative comparative analysis.

Research note: Multi-group comparison in partial least squares (PLS) path modelling

This blog post is about the considerations I made while searching for a method to compare two studies that use the same causal model. Finally, I show how I applied the non-parametric confidence set approach to compare two groups.

Last year, I published a study about students’ intentions to use wikis in higher education. In this study, a survey was conducted with 133 first-semester students to test a causal model based on the Decomposed Theory of Planned Behaviour (DTPB). A while ago, I repeated this survey with 35 graduate students. The results of the second study differed from the hypothesised causal relationships, which had been supported in the first study, and were not in accordance with expectations. For this reason, I searched for methods to investigate differences between the two studies.

Moderation vs. mediation

First of all, I asked myself whether I wanted to examine mediation or moderation or both at once. Particularly helpful was the article by Henseler and Fassott [12], which provides a good overview of this topic. It became clear to me that I wanted to compare moderating effects through group comparison using study experience as a categorical variable, where group differences become apparent as differences in parameter estimates, as is obviously the case in my studies. In partial least squares (PLS) path modelling, this is referred to as multi-group comparison/analysis (often abbreviated as PLS-MGA).

Measurement invariance testing

A prerequisite for multi-group comparison is measurement invariance [9, 10], although “it is often assumed that the measurement invariance is given if you use the same items for the latent variables measurement in each group” (citing a post from the SmartPLS forum). According to [9, 10], testing measurement invariance is fairly new in PLS and seldom seen in PLS research, at least at the moment. I therefore decided to rely on the assumption stated above and did not test for measurement invariance.

Nota bene: According to my literature review, [4] provides a promising method to examine measurement invariance, and [11] introduces an example application.

Multi-group comparison

There are several ways to conduct a multi-group comparison of PLS models between two (!) groups (see [5] for an approach to compare more than two groups), namely
(1) the parametric approach, or Keil/Chin-approach [1, 2],
(2) the Smith-Satterthwaite test [3],
(3) the permutation-based approach [4],
(4) the non-parametric approach [3], or Henseler’s PLS multi-group analysis, and
(5) the non-parametric confidence set approach [5].

(1) The parametric approach
The approach requires normally distributed data, which “runs contrary to PLS path modelling’s distribution-free character” [5, p. 200]. The analysis is carried out by

  • running the PLS path modelling algorithm as well as the bootstrapping procedure on both groups,
  • obtaining the standard errors of the group-specific parameter estimates, and
  • running the test statistic provided by [1] when the standard deviations of the parameter estimates are equal (use Levene’s test to check whether they differ significantly). [6] provides an Excel file that can be used to calculate the test statistic; it is available as an additional download to the book. A video illustrating this approach using PLS Graph can be found on YouTube.
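As a rough sketch, the last step can also be written in R. The function below implements the pooled test statistic as I understand it from the presentations in [5] and [6]; please verify it against those sources before relying on it. The function name and the example estimates are hypothetical; only the group sizes (133 and 35) are taken from the two studies described above.

```r
# Parametric multi-group test, assuming equal standard errors in both groups.
# p1, p2: group-specific path coefficients
# se1, se2: bootstrap standard errors of the path coefficients
# m, n: sample sizes of group 1 and group 2
parametric_mga <- function(p1, p2, se1, se2, m, n) {
  df <- m + n - 2
  pooled <- sqrt((m - 1)^2 / df * se1^2 + (n - 1)^2 / df * se2^2)
  t_value <- (p1 - p2) / (pooled * sqrt(1 / m + 1 / n))
  # two-sided p-value from the t distribution with m + n - 2 degrees of freedom
  p_value <- 2 * pt(-abs(t_value), df = df)
  list(t = t_value, df = df, p = p_value)
}

# Hypothetical estimates and standard errors; group sizes as in the studies above
result <- parametric_mga(p1 = 0.45, p2 = 0.20, se1 = 0.08, se2 = 0.15,
                         m = 133, n = 35)
print(result)
```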

(2) Smith-Satterthwaite test
This test also requires normally distributed data, but can be used when the parameter estimates’ standard deviations are not equal. The test statistic provided by Chin [2] was not entirely correct and has been corrected by Nitzl [7] (in German; Nitzl’s equation is also provided in [5]).

(3) The permutation-based approach
The approach was developed by Chin in 2003 and is further described by Chin and Dibbern [4]. A detailed step-by-step guide can be found in [4] and in [5]. Although this approach does not rely on any distributional assumptions, it requires groups’ sample sizes to be fairly similar [5].

(4) The non-parametric approach
This approach is similar to (1) with the difference that “bootstrap estimates are used to assess the robustness of group-specific parameter estimates” [5, p. 202]. As a consequence, this approach does not build on any distributional assumptions, but can only be used to test one-sided hypotheses. The procedure is described in detail in [8]. An Excel file implementing this approach can be obtained from the first author of [13] upon request, but is also available here (last visit: 2014-04-06).

(5) The non-parametric confidence set approach
This approach does not build on any distributional assumptions and can be used to test for significant difference between two groups. The procedure is detailed in [5]. It conceptually builds on [2], but uses the group-specific bootstrap intervals for comparison. The analysis is carried out by

  • running the PLS path modelling algorithm for both groups,
  • constructing an alpha-percent-bootstrap confidence interval for both groups, and by
  • checking whether the parameter estimate of group one falls into the confidence interval of group two and vice versa. If neither estimate falls into the other group’s interval, one can assume that the groups are significantly different.
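The three steps can be sketched in R for a single path coefficient. This is only an illustration under stated assumptions: the bootstrap estimates are simulated stand-ins for the output of a real PLS bootstrapping run, a plain percentile interval is used, and all names and numbers are hypothetical.

```r
set.seed(123)

# Simulated bootstrap estimates of one path coefficient per group
# (stand-ins for the values a PLS bootstrapping run would return)
boot_g1 <- rnorm(5000, mean = 0.35, sd = 0.08)
boot_g2 <- rnorm(5000, mean = 0.20, sd = 0.15)
est_g1 <- 0.35
est_g2 <- 0.20

# Percentile bootstrap confidence interval at the given level
percentile_ci <- function(boot, level = 0.95) {
  alpha <- 1 - level
  unname(quantile(boot, probs = c(alpha / 2, 1 - alpha / 2)))
}

# TRUE if the estimate lies inside the interval (bounds included)
falls_within <- function(estimate, ci) {
  estimate >= ci[1] && estimate <= ci[2]
}

ci_g1 <- percentile_ci(boot_g1)
ci_g2 <- percentile_ci(boot_g2)

# The groups are treated as different only if neither estimate
# falls into the other group's interval
different <- !falls_within(est_g1, ci_g2) && !falls_within(est_g2, ci_g1)
print(list(ci_g1 = ci_g1, ci_g2 = ci_g2, different = different))
```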

I selected the non-parametric confidence set approach as it does not build on any distributional assumptions, unlike (1) and (2); allows different group sample sizes, unlike (3); and can detect significant differences in either direction, unlike (4).

Multi-group comparison using SmartPLS and the R package semPLS

The non-parametric confidence set approach requires an alpha-percent bootstrap confidence interval, which cannot be calculated in SmartPLS. However, I found that the R package semPLS can read SmartPLS files (ending with *.splsm) and can be used to calculate such a confidence interval. The syntax is as follows:

# install the semPLS and XML packages (comment these lines out if they are already installed)
install.packages('semPLS')
install.packages('XML')
# load the semPLS package
library(semPLS)
# load the data from a *.csv file (semicolon-separated)
Data <- read.csv(file.choose(), sep = ";")
# load the SmartPLS model that is located in your SmartPLS workspace
Model <- read.splsm(file.choose(), order = "generic")
# run the PLS algorithm to calculate the model
LV <- sempls(model = Model, data = Data, wscheme = "pathWeighting")
# print the path coefficients of the fitted model
pathCoeff(LV)
# bootstrap the semPLS object (fixed seed for reproducibility)
set.seed(123)
Bootstrap <- bootsempls(LV, nboot = 5000, start = "ones", verbose = TRUE)
# construct the confidence interval (percentile method, 95% level)
summary(Bootstrap, type = "perc", level = 0.95)

Using the output provided by the summary function, you will be able to examine whether the differences between the groups’ parameter estimates are significant. To do so, you have to

  1. calculate the corrected confidence interval for both groups,
  2. copy the last lines of the output of the summary function that are labelled with beta_x_y (use the output of the pathCoeff function to identify the corresponding relationships) for both groups,
  3. paste the output into an Excel spreadsheet, and
  4. check whether the path coefficients of one group fall within the corresponding interval (columns lower and upper) of the other group and vice versa. If a path coefficient falls within the confidence interval of the other group, one can assume that it is not significantly different from the other group.

You will get a table that looks somewhat similar to mine (see Figure 1). In this example, the multi-group comparison shows no significant differences between the two groups, as nearly every path coefficient falls within the confidence interval of the other group (e.g., the path coefficient of the second group for the causal relationship FC –> PBC lies within the confidence interval of the first group: 0.1692 < 0.2000 < 0.5014).

Figure 1: PLS multi-group analysis
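If you prefer to stay in R instead of switching to a spreadsheet, the final comparison could be sketched as follows. The data frames and column names are hypothetical and would have to be filled by hand from the summary output. Only the FC -> PBC interval of the first group (0.1692 to 0.5014) and the corresponding estimate of the second group (0.2000) are taken from the example above; all other numbers, including the second path, are made up.

```r
# Path-wise estimates and interval bounds for both groups.
# Only the FC -> PBC bounds of group 1 and the FC -> PBC estimate of
# group 2 come from the example above; everything else is made up.
group1 <- data.frame(path = c("FC -> PBC", "PBC -> INT"),
                     estimate = c(0.33, 0.40),
                     lower = c(0.1692, 0.25),
                     upper = c(0.5014, 0.55))
group2 <- data.frame(path = c("FC -> PBC", "PBC -> INT"),
                     estimate = c(0.2000, 0.10),
                     lower = c(-0.10, -0.20),
                     upper = c(0.50, 0.30))

# Align both groups by path name
both <- merge(group1, group2, by = "path", suffixes = c("_g1", "_g2"))

# Does each estimate fall within the other group's interval?
g1_in_g2 <- both$estimate_g1 >= both$lower_g2 & both$estimate_g1 <= both$upper_g2
g2_in_g1 <- both$estimate_g2 >= both$lower_g1 & both$estimate_g2 <= both$upper_g1

# A path counts as different only if neither estimate falls
# within the other group's interval
both$different <- !(g1_in_g2 | g2_in_g1)
print(both[, c("path", "estimate_g1", "estimate_g2", "different")])
```

With these numbers, FC -> PBC is not flagged as different, whereas the made-up second path is.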

References

  1. Chin, W. W. (2000). Frequently asked questions: Partial least squares & PLS-Graph. Retrieved from http://disc-nt.cba.uh.edu/chin/plsfaq.htm
  2. Keil, M., Tan, B. C. Y., Wei, K.-K., Saarinen, T., Tuunainen, V., & Wassenaar, A. (2000). A cross-cultural study on escalation of commitment behavior in software projects. MIS Quarterly, 24(2), 299–325.
  3. Henseler, J. (2012). PLS-MGA: A non-parametric approach to partial least squares-based multi-group analysis. In W. A. Gaul, A. Geyer-Schulz, L. Schmidt-Thieme, & J. Kunze (Eds.), Challenges at the interface of data analysis, computer science, and optimization (pp. 495–501). Berlin: Springer. doi:10.1007/978-3-642-24466-7
  4. Chin, W. W., & Dibbern, J. (2010). An Introduction to a Permutation Based Procedure for Multi-Group PLS Analysis: Results of Tests of Differences on Simulated Data and a Cross Cultural Analysis of the Sourcing of Information System Services Between Germany and the USA. In V. Esposito Vinzi, W. W. Chin, J. Henseler, & H. Wang (Eds.), Handbook of partial least squares (pp. 171–193). Berlin: Springer. doi:10.1007/978-3-540-32827-8_8
  5. Sarstedt, M., Henseler, J., & Ringle, C. M. (2011). Multigroup analysis in partial least squares (PLS) path modeling: Alternative methods and empirical results. Advances in International Marketing, 22, 195–218. doi:10.1108/S1474-7979(2011)0000022012
  6. Hair, J. F., Hult, G. T. M., Ringle, C. M., & Sarstedt, M. (2014). A Primer on Partial Least Squares Structural Equation Modeling (PLS-SEM). Thousand Oaks: Sage.
  7. Nitzl, C. (2010). Eine anwenderorientierte Einführung in Partial Least Square (PLS)-Methode. Hamburg: Universität Hamburg, Institut für Industrielles Management. Retrieved from http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2097324.
  8. Henseler, J. (2012). PLS-MGA: A non-parametric approach to partial least squares-based multi-group analysis. In W. A. Gaul, A. Geyer-Schulz, L. Schmidt-Thieme, & J. Kunze (Eds.), Challenges at the interface of data analysis, computer science, and optimization (pp. 495–501). Berlin: Springer. doi:10.1007/978-3-642-24466-7
  9. Chin, W. W., Mills, A. M., Steel, D. J., & Schwarz, A. (2012). Multi-group invariance testing: An illustrative comparison of PLS permutation and covariance-based SEM invariance analysis. 7th International Conference on Partial Least Squares and Related Methods, May 19-22, 2012, Houston. Retrieved from http://www.plsconference.com/Slides/PLS2012%20(Chin,%20Mills,%20Steel,%20Schwarz).pdf.
  10. Visinescu, L. (2012). PLS group comparison: A proposal for relaxing measurement invariance assumptions. 7th International Conference on Partial Least Squares and Related Methods, May 19-22, 2012, Houston. Retrieved from http://www.plsconference.com/Slides/PLS%20Group%20Comparison.pdf.
  11. Eberl, M. (2010). An application of PLS in multi-group analysis. In V. Esposito Vinzi, W. W. Chin, J. Henseler, & H. Wang (Eds.), Handbook of partial least squares (pp. 487–534). Berlin: Springer. doi:10.1007/978-3-540-32827-8_22
  12. Henseler, J., & Fassott, G. (2010). Testing moderation effects in PLS path models: An illustration of available procedures. In V. Esposito Vinzi, W. W. Chin, J. Henseler, & H. Wang (Eds.), Handbook of partial least squares (pp. 713–735). Berlin: Springer. doi:10.1007/978-3-540-32827-8_31
  13. Henseler, J., Ringle, C. M., & Sinkovics, R. R. (2009). The use of partial least squares path modeling in international marketing. Advances in International Marketing, 20(2009), 277–319. doi:10.1108/S1474-7979(2009)0000020014

A constructivist understanding of reality

About 15 years ago, there was great emphasis on the philosophy of science in the German-speaking Wirtschaftsinformatik/Information Systems community. Since then, it has been considered good form for researchers to engage with the philosophy of science and to state their theoretical underpinning. Consequently, I dealt with this issue. However, I am very ambivalent about whether philosophy of science has any value for research practice. Wikipedia puts it in Solomonic fashion:

Philosophy of science has historically been met with mixed response from the scientific community. Though scientists often contribute to the field, many prominent scientists have felt that the practical effect on their work is limited.

It took me a while to get into the major concepts, and I am still not entirely sure whether it is of value for my research. One of my former colleagues described philosophy of science as “the researcher’s religion”, which might have an impact on a researcher’s daily life, but does not necessarily have to (in every situation). After reading tons of papers, I came to the conclusion that it affects research practice only in a few situations, e.g. research on learning processes, knowledge sharing, etc.

Additionally, philosophy of science becomes an issue if a researcher uses methods whose theoretical underpinnings contradict each other. This was the case in my doctoral research, as I used naturalistic inquiry (= constructivism) in the first part of my doctoral thesis and structural equation modelling (= positivism) in the second part. To stretch the metaphor of philosophy of science as the researcher’s religion, this is a severe problem, as one should not change one’s religion on a daily basis depending on the method in use.

As it took me a while to figure out how to escape from this philosophical trap, I have decided to make my thoughts available to the public. The following section is an excerpt from my doctoral thesis. 

My theoretical foundation

In this section, I explain my epistemic position that determines my understanding of reality and truth, and thus, is the theoretical foundation of my doctoral thesis.

The research was based on a constructivist understanding of reality. I used Lincoln and Guba’s [1] constructivist paradigm as an underlying framework, which they introduced as a counterpart to the positivist paradigm.[2] Positivists act on the assumption that reality is objective, tangible, and independent from the individual, whereas constructivists are convinced that reality is constructed in the minds of individuals and resides in their minds: “They do not exist outside of the persons who create and hold them; they are not part of some ‘objective’ world that exists apart from their constructors” [3, p. 143]. Hence, there are multiple, divergent constructions that are bound to the individual, the context, the method of inquiry, and some point in time [4]. Accordingly, truth is a “matter of best-informed and most sophisticated construction on which there is consensus at a given time” [5, p. 243]. On this basis, the thesis uses the notion of internal and external consensus. Pörksen [6] refers to internal consensus as the correspondence between what one communicates to others and what one holds to be real. External consensus describes the agreement of others, in that they accept one’s statement as correct.

As a consequence, constructivist researchers cannot provide the truth, but they can try to achieve a consensual understanding with other researchers through discourse within the scientific community. An aggravating factor is that knowledge cannot be transferred easily, because everyone holds their own intangible construction of reality, although some degree of co-construction is possible. However, “understanding” the construction of another researcher is not a straightforward process, because it requires understanding others’ understanding [7]. To facilitate understanding, Guba and Lincoln [4] provided constructivist researchers with four criteria for judging – and therefore understanding – the trustworthiness of qualitative research: credibility, transferability, dependability, and confirmability. Within the thesis I applied these criteria by comparing findings with existing literature, discussing results with colleagues, and subjecting drafts to peer review processes for publication. Using criteria for judging the quality of qualitative research did not limit me to “qualitative, constructivist research”.

It is commonly claimed that research methods are bound to paradigms and cannot be mixed with methods from other paradigms, because scientific paradigms are incommensurable [8, 9]. According to Smaling [9], this claim does not hold due to the underdetermination of paradigms and methods: “a research method, certainly a qualitative research method, does not unequivocally imply a particular paradigm” (p. 242). At best a method is linked to a paradigm in a kind of “Wahlverwandtschaft” (Weber, 1922 as cited in [9, p. 242]). Rather, it is possible to use a particular method not in its ‘normal’ paradigm, but within another setting, and to interpret the results in the light of the paradigm in use [8].

References

  1. Lincoln, Y. S., & Guba, E. G. (1985). Naturalistic inquiry. Newbury Park, CA: Sage.
  2. To ease the understanding, I distinguish between two paradigms: positivism (including empirical-analytic paradigm, objectivism, functionalism) and constructivism (including subjectivism, interpretivism).
  3. Guba, E. G., & Lincoln, Y. S. (1989). Fourth generation evaluation. Newbury Park, CA: Sage.
  4. Guba, E. G., & Lincoln, Y. S. (1982). Epistemological and methodological bases of Naturalistic Inquiry. Educational Technology Research and Development, 30(4), 233–252.
  5. Schwandt, T. A. (1998). Constructivist, interpretivist approaches to human inquiry. In N. K. Denzin & Y. S. Lincoln (Eds.), The landscape of qualitative research: Theories and issues (pp. 221–259). Thousand Oaks, CA: Sage.
  6. Pörksen, B. (2009). The end of arbitrariness: The three fundamental questions of a constructivist ethics for the media. Constructivist Foundations, 4(2), 82–90.
  7. Rusch, G. (2007). Understanding: The mutual regulation of cognition and culture. Constructivist Foundations, 2(2-3), 118–128.
  8. Mingers, J. (2001). Combining IS research methods: Towards a pluralist methodology. Information Systems Research, 12(3), 240–259. doi:10.1287/isre.12.3.240.9709
  9. Smaling, A. (1994). The pragmatic dimension. Quality & Quantity, 28(3), 233–249. doi:10.1007/BF01098942

Blog carnival: Why do I blog?

Recently, the guys from the Saxon Open Online Course (SOOC) started a blog carnival and asked why professors, lecturers, and other faculty members blog (in German: Hilfe, mein Prof bloggt). Although my first post in this blog (partially) covered this topic, I disclose my motivation again.

About my blogging impetus, from my first post in this blog:

The final impetus to start blogging was a tweet, which called my attention to a blog post by Melissa Terras. Triggered by her university’s open access policy she started to make her research papers, that have been published in subscription based journals, publicly available in an online repository. Furthermore, Melissa started blogging and tweeting about her research with a notable effect.

At the moment and a few posts later, I can summarize that I use my blog

When I started this blog in February 2013, I was quite unsure whether I would use it regularly, benefit from blogging, or at least enjoy it. For now, I am very pleased with the result, for two reasons. First, page visits and download statistics indicate that novices use my research data in combination with my research article on students’ intentions to use wikis to get into partial least squares modelling. Second, the combined number of downloads (SSRN, this blog, and other repositories) of my research articles is higher than I would ever have thought (although this need not be a consequence of my blogging effort).

Use figshare to get a DOI for your research data

Every now and then, I want to provide my research data. But when sharing research data, I want this data not only to be accessible, but also to be citable. For this reason, I have used

Each service provides a persistent identifier (DOI or URN) or at least a more or less stable URL (Slideshare). But providing artifacts other than presentations, research papers, or data is difficult. For example, where do you provide documents related to your research? Until now, I have used Google Sites or my Dropbox to provide further information. Lately, I stumbled upon an excellent list of online tools for researchers that called my attention to figshare.

From the website: “figshare is a repository where users can make all of their research outputs available in a citable, sharable and discoverable manner. figshare allows users to upload any file format to be made visualisable in the browser so that figures, datasets, media, papers, posters, presentations and filesets can be disseminated in a way that the current scholarly publishing model does not allow.”

And indeed, figshare has some power features for depositing research data:

  • 1GB of private space,
  • Unlimited public space,
  • Upload all formats,
  • an API,
  • CC-licenses,
  • DOIs for your research outputs,
  • and the best, it’s free (at least at the moment).

In order to provide citable research outputs, I transferred a dataset that had been made openly accessible in my blog to figshare.

Although it does not support all of the criteria demanded by The Amsterdam Manifesto on Data Citation Principles, it is by far the best resource for providing research data that I have found so far.