The Two-Factor Solution to Allen and Meyer’s (1990) Affective
Commitment Scale: Effects of Negatively Worded Items
Stephanie M. Merritt
Published online: 23 December 2011
Springer Science+Business Media, LLC 2011
Abstract
Purpose Affective organizational commitment is a construct
important to both practitioners and researchers; thus,
construct-valid measurement is needed. Allen and Meyer’s
(J Occup Psychol 63:1–18, 1990) Affective Commitment
Scale is a popular measure of affective organizational
commitment. Although conceptualized as unidimensional,
a two-factor solution sometimes emerges. Whether the two
factors are substantive, reflecting AC-love and AC-joy
constructs, or methodological, reflecting positively and
negatively worded items, is unclear. This issue is examined
in five studies.
Design/Methodology/Approach In a new approach, conditions
designed to produce cognitive fatigue are manipulated.
Support for the method factor interpretation would be
provided if the two-factor solution emerges only when
participants are fatigued and negatively worded items are
present. Cross-sample analyses are also conducted.
Findings Analyses indicated that the two-factor solution
fit the data well only when (a) participants were cognitively
fatigued and (b) negatively worded items were present.
This finding extended to both students and employees, and
it held regardless of which items were negatively worded
or whether negatively worded items were emphasized.
Implications Results suggest that the second factor is
likely methodological and seems to result from careless
responding or fatigue on negatively worded items. It is
suggested that users modify the items to be positively
worded or administer the scale when respondents have
sufficient cognitive resources to respond.
Originality/Value This study was the first to use experimental
methodology to test whether the two-factor solution
is methodological or substantive. Results provide guidance
for improving the scale’s construct validity in research and
in practical applications.
Keywords Affective commitment · Reverse keyed items ·
Negatively worded items · Method factor · Organizational
attitudes · Organizational commitment · AC-joy · AC-love
Affective commitment is one component of Meyer and
Allen’s (1991) organizational commitment model. Individual
differences in affective commitment relate to
important organizational outcomes including job involvement
and satisfaction, citizenship behavior, absences,
withdrawal cognitions, turnover, and job performance
(Meyer et al. 2002). Accurate measurement of individual
differences in affective commitment is essential for the
prediction and control of these outcomes.
The Affective Commitment Scale, developed by Allen
and Meyer (1990), is a widely used measure of affective
commitment. In a meta-analysis including 144 studies, the
average-N-weighted internal consistency reliability was
α = .82, and the Affective Commitment Scale was associated
with turnover (ρ = -.17), withdrawal cognitions
(ρ = -.60), citizenship (ρ = .26), stress (ρ = -.21), and
supervisor-rated job performance (ρ = .17) (Meyer et al.
2002). Despite these relationships, the Affective Commitment
Scale has been criticized on psychometric grounds.
The primary concern of this study is the scale’s
dimensionality. The Affective Commitment Scale is conceptualized
as unidimensional. However, a two-factor
structure is sometimes found (e.g., Bergami and Bagozzi
2000; Carmeli et al. 2006; Magazine et al. 1996;
Vandenberg and Self 1993). Despite the repeated appearance
of this structure, there is little consensus regarding
whether it is meaningful, reflecting two meaningful constructs,
or artifactual, representing positively and negatively
worded items. The purpose of the present study is to
determine whether the two-factor Affective Commitment
Scale solution is substantive or methodological.
S. M. Merritt (✉)
University of Missouri – St. Louis, 421 Stadler Hall,
8001 Natural Bridge Road, St. Louis, MO 63121, USA
e-mail: merritts@umsl.edu
J Bus Psychol (2012) 27:421–436
DOI 10.1007/s10869-011-9252-3
At least two studies have ascribed conceptual meaning
to this two-factor solution (Bergami and Bagozzi 2000;
Carmeli et al. 2006). Bergami and Bagozzi (2000) have
termed the two factors AC-joy and AC-love. They distinguish
between a sense of happiness regarding the organization
(AC-joy) and a sense of attraction to the
organization (AC-love) as distinct emotional reactions with
distinct behavioral tendencies. This dichotomy is based on
the notion that joy and love are two separate emotional
appraisals that may have diverse action tendencies (cf.
Bergami and Bagozzi 2000). The items loading on each
factor are shown in Table 1. It seems that the items loading
on the Joy factor reflect happiness or enjoyment (e.g., “I
would be very happy to spend the rest of my career with
this organization”), whereas those loading on the Love
factor reflect a sense of attachment or belonging (e.g., “I do
not feel like part of the family…”). Some differential
relationships with correlates have been found for the two
factors (Bergami and Bagozzi 2000; Carmeli et al. 2006).
Others have suggested that the second factor is simply a
methodological artifact (e.g., Magazine et al. 1996). Scale
developers often include negatively worded items in scales,
sometimes in order to counteract response sets, such as
acquiescence (cf. Billiet and McClendon 2000; Marsh
1996). The notion is that when scales include a balanced
number of negatively and positively worded items, acquiescence
to the negative items will compensate for
acquiescence to the positive items, preventing bias in mean
scores.
Consistent with this perspective, the 8-item Affective
Commitment Scale includes four negatively worded items
(e.g., “I do NOT feel like a part of the family…”), as
indicated in Table 1. Despite the potential utility of negatively
worded items in compensating for acquiescence,
several authors have demonstrated that negatively worded
items may load onto a separate “method factor” in factor
analyses (cf. Spector et al. 1997). Negatively worded items
have been demonstrated as one cause of inconsistency in
the observed dimensionality of the Job Diagnostic Survey
(JDS; Hackman and Oldham 1975; Harvey et al. 1985;
Idaszak and Drasgow 1987), the British Organizational
Commitment Scale (Mathews and Shepherd 2002), and
others (e.g., Conrad et al. 2004; Greenberger et al. 2003;
Melnick and Gable 1990; Motl et al. 2000; Peterson et al.
2006; Pilotte and Gable 1990).
In past research on the Affective Commitment Scale, the
negatively worded items loaded onto the AC-love factor,
while the positively worded items loaded onto the AC-joy
factor. Therefore, Magazine et al. (1996) asserted that the
second factor is a method factor related to the presence of
negatively worded items. Because past studies have
examined the Affective Commitment Scale’s two-factor
solution within a single sample and data collection, it was
not possible to test whether the second factor was substantive
or methodological.
Table 1 Affective Commitment Scale items and loadings on AC-love and AC-joy factors
Item  Reverse?  AC-love  AC-joy  Item stem
1     X         X                I do not feel a strong sense of belonging to my organization
2     X         X                I do not feel “emotionally attached” to this organization
3                        X       This organization has a great deal of personal meaning for me
4     X         X                I do not feel like “part of the family” at this organization
5                        X       I would be very happy to spend the rest of my career with this organization
6                        X       I enjoy discussing my organization with people outside it
7                        X       I really feel as if this organization’s problems are my own
8     X         –        –       I think I could easily become as attached to another organization as I am to this one
Note: AC-love and AC-joy factor loadings from Bergami and Bagozzi (2000) and Carmeli et al. (2006). Item 8 was not included in either study
The purpose of the present set of studies was to use
experimental manipulations to assess whether the two-factor
solution is more likely to emerge under certain
conditions. To the author’s knowledge, the present study
may be the first use of this experimental technique to test
whether an observed factor structure is affected by
negatively worded items. If the second factor is substantive,
it should emerge consistently. In other words, the
manipulations should have no effect. In contrast, if it is
artifactual and produced by negatively worded items, it
should be more likely to emerge under conditions thought
to facilitate methodological factors, as discussed next.
Two explanations for the emergence of negatively
worded factors are prevalent under the assumption that
they are methodological rather than substantive. The first is
careless responding, which may result from respondents’
lack of motivation to answer conscientiously. Schmitt and
Stults (1985) demonstrated that careless responding to
negatively worded items by only 10% of respondents can
cause the emergence of a clearly defined “negative factor”
in principal components analyses. Woods (2006) recently
replicated this result with confirmatory factor analysis
(CFA) techniques.
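The careless-responding mechanism can be illustrated with a small simulation. The sketch below is not Schmitt and Stults's (1985) procedure. It assumes a single latent trait, eight 5-point items (four negatively worded), and a hypothetical `careless_rate` of respondents who answer negated items as if they were positively worded. All function names and parameter values are illustrative assumptions.

```python
import random


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def answer(trait, rng):
    """Agreement with a positively worded statement on a 1-5 scale."""
    raw = round(3 + trait + rng.gauss(0, 0.7))
    return min(5, max(1, raw))


def simulate(careless_rate, n=2000, seed=1):
    """Correlation between positive-item and (reverse-scored) negative-item
    subscale totals for n simulated respondents."""
    rng = random.Random(seed)
    pos_totals, neg_totals = [], []
    for _ in range(n):
        trait = rng.gauss(0, 1)  # one latent commitment level per person
        careless = rng.random() < careless_rate
        pos = [answer(trait, rng) for _ in range(4)]
        # Attentive respondents flip their answer for a negated item;
        # careless respondents miss the negation and answer as if positive.
        neg = [answer(trait, rng) if careless else 6 - answer(trait, rng)
               for _ in range(4)]
        neg_rescored = [6 - v for v in neg]  # standard reverse-scoring
        pos_totals.append(sum(pos))
        neg_totals.append(sum(neg_rescored))
    return pearson(pos_totals, neg_totals)


r_attentive = simulate(careless_rate=0.0)
r_careless = simulate(careless_rate=0.10)
print(f"all attentive: r = {r_attentive:.2f}")
print(f"10% careless:  r = {r_careless:.2f}")
```

Under these assumptions, a small share of careless respondents is enough to pull the two subscales apart, which is the pattern a factor analysis would register as a separate negative-wording factor.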
A second explanation for these factors is that negatively
worded items may be more difficult to understand,
requiring more cognitive resources to process (Cordery and
Sevastos 1993; Schriesheim et al. 1991; Schriesheim and
Hill 1981). Because cognitive resources are finite, this
implies that participants should be less able to correctly
process negatively worded items when they are cognitively
fatigued—for example, when tired, hungry, or stressed.
Because both motivation and attention are likely to
decrease when respondents are cognitively fatigued, these
two sources of method factors are likely complementary,
rather than mutually exclusive. Careless responding and
cognitive fatigue form the basis for the hypothesis tested in
the following five studies.
In Study 1, the Affective Commitment Scale was
administered to participants who were randomly assigned
to conditions in which they would be either more or less
likely to be cognitively fatigued. The factor structure of the
Affective Commitment Scale is compared between conditions.
If the two-factor solution is substantive, these
manipulations should have no influence on the observed
factor structure. However, if the second factor is related to
careless responding/fatigue on negatively worded items,
then the second factor is expected to be observed only
when respondents are fatigued.
Study 1
Participants were randomly assigned to complete the
Affective Commitment Scale either at the beginning or the
end of a lengthy (60 min) experimental task and survey
battery. Given that cognitive resources are limited and
become depleted over time, careless responding and fatigue
were expected to increase with the amount of time participants
spent on the task. Thus, it was expected that the
two-factor solution would be observed only when the
Affective Commitment Scale was administered late in the
survey (“late condition”) and not in the “early condition.”
Participants and Procedure
Participants were 184 undergraduate students from a
medium-sized U.S. Midwestern university who participated
in return for course credit or extra credit. The sample
varied more in age than is typical of undergraduate samples
(M = 26, SD = 8.22), but was predominantly White
(74.9%) and female (81.6%).
Participants completed a survey online. They first
completed 10 demographic/descriptive items. Then, participants
in the early condition completed the Affective
Commitment Scale. Because these participants were not
required to hold jobs outside of school, they were
instructed to respond to the items in terms of their affective
commitment to the university.
Next, participants responded to 45 other self-report items
and completed a 30-trial version of the X-ray Screening
Task (Merritt and Ilgen 2008), which is a vigilance task in
which participants visually screen images of luggage and
identify weapons. Vigilance tasks involve the maintenance
of constant and careful attention over a period of time in
order to identify rare, but important, events. These tasks are
cognitively demanding (Smit et al. 2003; Warm et al. 2010)
and are known to produce cognitive/attentional fatigue. In
fact, the “vigilance decrement,” or the decline in vigilance
performance over time, has been demonstrated to occur
within 12 min (Temple et al. 2000) and has been the subject
of considerable research (e.g., Atchley and Chan 2011;
Grier et al. 2003; Helton and Russell 2011; Mikulka et al.
2002). Based on this body of research, the present task was
expected to deplete participants’ cognitive resources. Following
that task, participants responded to an additional 51
self-report items. Finally, participants in the late condition
then completed the Affective Commitment Scale. On
average, the study took participants approximately 1 h to
complete.
Results
In this study and all subsequent studies, participants’ scores
for negatively worded items were reverse-scored prior to
analysis. Means, standard deviations, and item intercorrelations
for each condition are presented in Table 2. The
Affective Commitment Scale had acceptable internal consistency
in both the early (α = .87) and late (α = .87)
conditions.
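As a minimal sketch of the two preprocessing steps just described, the code below reverse-scores a negatively worded item and computes Cronbach's alpha. It assumes a 1 to 5 response scale, and the data matrix is hypothetical; neither is taken from the study's materials.

```python
from statistics import variance


def reverse_score(response, scale_min=1, scale_max=5):
    """Map a response onto the opposite end of the scale (1 <-> 5, 2 <-> 4)."""
    return scale_min + scale_max - response


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical raw responses; the middle item is negatively worded.
raw = [[5, 1, 4], [4, 2, 4], [2, 4, 3], [1, 5, 1]]
scored = [[r[0], reverse_score(r[1]), r[2]] for r in raw]

alpha = cronbach_alpha(scored)
print(f"alpha after reverse-scoring = {alpha:.2f}")
```

Alpha is computed only after reverse-scoring, because an unreversed negative item correlates negatively with the rest of the scale and deflates the estimate.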
Scale dimensionality was assessed using CFA in LISREL
8.8 (Jöreskog and Sörbom 2006). Goodness-of-fit was
evaluated using the Chi-square, RMSEA (Steiger and Lind
1980), NNFI (Bentler and Bonett 1980), and CFI (Bentler
1990) statistics. Multiple fit statistics were used in recognition
of the fact that they may not always agree; using a
set of fit statistics provides a more well-rounded perspective
on a model’s fit. Increasingly good fit is indicated by
lower Chi-square and RMSEA values and higher NNFI and
CFI values. The fits of the unidimensional and two-factor
models were assessed separately for the early and late
conditions. The hypothesis was that if the two-factor
solution is substantive, then there should be no difference
in the factor structure between conditions. If, however, the
two-factor solution is methodological, then it is expected to
emerge only in the late condition (i.e., when participants
were fatigued).
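To make the link between Chi-square and RMSEA concrete, the sketch below uses one common single-group point-estimate formula, the square root of max(χ² − df, 0) / (df × (N − 1)). This is an assumption about the computation, not a statement of what LISREL does internally. The inputs are the Study 1 single-factor results and the listwise Ns from the Table 2 note.

```python
from math import sqrt


def rmsea(chi_square, df, n):
    """Single-group RMSEA point estimate from a model Chi-square.

    Assumes the common formula sqrt(max(chi2 - df, 0) / (df * (n - 1))).
    """
    return sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


# Unidimensional model, Study 1 (values as reported in the Results).
early = rmsea(chi_square=17.08, df=14, n=94)
late = rmsea(chi_square=100.86, df=14, n=90)
print(f"early condition RMSEA = {early:.2f}")
print(f"late condition RMSEA  = {late:.2f}")
```

Because the numerator is χ² − df, a model whose Chi-square is close to its degrees of freedom yields an RMSEA near zero, whereas a Chi-square many times larger than df inflates it.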
In order to determine whether the Affective Commitment
Scale had the same factor structure in both conditions,
a multi-group CFA was performed. In essence, this
represents a test of configural invariance between groups. If
the scale’s factor structure is different between groups, then
configural invariance would not hold. This could mean that
the frame of reference used to respond to the measure items
may be significantly different for one group than for the
other—the implication being that attempting to compare
scale scores across groups would be “tantamount to comparing
apples and spark plugs” (Vandenberg and Lance
2000, p. 9). In other words, if respondents under different
scale administration conditions have different factor
structures, their scores cannot be meaningfully compared.
To perform the configural invariance test, the unidimensional
structure was modeled for both groups simultaneously,
and the overall model fit was assessed. A close
fit would indicate that the unidimensional model fit for both
groups. A poor fit would indicate that the unidimensional
model fit poorly for one or both groups, and further
investigations would be required. Results of the test did,
indeed, indicate poor model fit (χ² = 117.94, df = 28,
RMSEA = .19, NNFI = .87, CFI = .91), with the Late
group contributing 85% to the overall Chi-square. Therefore,
the factor structures were examined for each condition
independently.
CFA on the participants in the early condition yielded a
close fit for the unidimensional model (χ² = 17.08,
df = 14, RMSEA = .05, NNFI = .99, CFI = .99) requiring
no modifications. The two-factor model was also fitted
to the data and produced statistically equivalent fit (χ² =
16.15, df = 13, RMSEA = .05, NNFI = .99, CFI = .99,
Δχ² = .93, Δdf = 1, ns), and the correlation between the
two factors was close to 1.00 (Φ = .97). The rule of parsimony
therefore suggests that the simpler, unidimensional
model be retained.
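The nested-model comparison above can be sketched as a Chi-square difference test. The helper below is a simplified illustration that only handles a difference of one degree of freedom, where the p value can be obtained from the standard normal distribution using the standard library. The function name is hypothetical.

```python
from math import erfc, sqrt


def chi_square_difference(chi_restricted, df_restricted, chi_general, df_general):
    """Chi-square difference test for nested models differing by 1 df.

    Returns (delta_chi, delta_df, p). For 1 df, P(X > x) = erfc(sqrt(x / 2)).
    """
    delta_chi = chi_restricted - chi_general
    delta_df = df_restricted - df_general
    if delta_df != 1:
        raise ValueError("this sketch only handles a 1-df difference")
    p = erfc(sqrt(max(delta_chi, 0.0) / 2.0))
    return delta_chi, delta_df, p


# Early condition: unidimensional (restricted) vs. two-factor (general) model.
delta_chi, delta_df, p = chi_square_difference(17.08, 14, 16.15, 13)
print(f"delta chi-square = {delta_chi:.2f}, delta df = {delta_df}, p = {p:.3f}")
```

A non-significant difference means the extra factor does not improve fit enough to justify the added parameter, which is why the simpler model is preferred.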
However, the CFA results for the Late condition indicated
poor fit for the unidimensional model (χ² = 100.86,
df = 14, RMSEA = .26, NNFI = .71, CFI = .81). Subsequently,
a CFA was tested in which the negatively
worded items and positively worded items loaded onto
separate factors. This model had significantly improved fit
(χ² = 41.18, df = 13, RMSEA = .16, NNFI = .87,
CFI = .92; Δχ² = 59.68, Δdf = 1, p < .01) with a
reduced correlation between the two factors (Φ = .74).
However, the model fit still failed to meet Browne and
Cudeck’s (1992) criteria for acceptable fit. An examination
of the modification indices revealed that the fit would be
greatly improved if the error terms (Θδ) for items 5
and 6 and items 4 and 7 were allowed to covary. Doing so
produced a close to perfect fit (χ² = 12.03, df = 11,
RMSEA = .03, NNFI = 1.00, CFI = 1.00; Φ = .70).
Note that allowing these same uniquenesses to covary did
not produce an acceptable level of fit for the unidimensional
model (χ² = 70.14, df = 12, RMSEA = .23,
NNFI = .76, CFI = .87). Thus, these results suggest that
in the Late condition, the two-factor model fit significantly
better than the unidimensional model even when these
uniquenesses were free to covary (Δχ² = 58.11, Δdf = 1,
p < .01). Because modification indices are subject to the
effects of chance, the sample size was relatively small
(N = 90), and no particular substantive explanation is
apparent, these error covariances should only be generalized
with caution.
Table 2 Item means, standard deviations, and intercorrelations for Study 1: participants completing the scale early and late in a survey battery
Item  Early M  Early SD  Late M  Late SD  1      2      3      4      5      6      7      8
1     2.93     1.13      3.12    1.06     –      .85**  .64**  .64**  .25*   .31**  .27**  .52**
2     2.57     1.13      2.84    1.15     .78**  –      .72**  .69**  .30**  .38**  .33**  .58**
3     2.76     1.12      2.78    1.11     .74**  .75**  –      .46**  .56**  .50**  .56**  .45**
4     2.68     1.11      2.96    1.05     .70**  .69**  .67**  –      .21*   .34**  .49**  .48**
5     3.52     1.03      3.61    .88      .38**  .29**  .44**  .41**  –      .51**  .40**  .21*
6     3.61     .89       3.64    .98      .54**  .48**  .50**  .45**  .41**  –      .34**  .23*
7     2.33     .96       2.44    .99      .52**  .57**  .50**  .49**  .29**  .48**  –      .36**
8     2.50     .84       2.62    .95      .20    .27**  .25*   .22*   .07    .10    .25*   –
Note: Correlations for the early condition are below the diagonal; correlations for the late condition are above the diagonal. Listwise Ns = 94
(early condition) and 90 (late condition). Ns for means and standard deviations range from 187 to 189
** p < .01, * p < .05
Discussion
The results of Study 1 support Magazine et al.’s (1996)
assertion that the two-factor solution to the Affective
Commitment Scale is likely to be methodological rather
than substantive. The hypothesized unidimensional model
fit the data closely when the Affective Commitment Scale
was administered early in the survey—that is, when participants’
cognitive resources and attention were likely at
their highest. However, when administered following a
resource-depleting vigilance task and several self-report
items, a two-factor model with the negatively worded items
loading onto a separate factor fit the data significantly
better than the unidimensional model. These findings imply
that the two-factor solution is more likely to emerge
in situations that produce higher degrees of fatigue and
careless responding.
It is interesting to note that these differences in the
observed factor structure occurred in the absence of any
large differences in item means, standard deviations, or
internal consistencies. None of the items demonstrated
significant mean differences between conditions (ps > .05).
This suggests that violations of the Affective Commitment
Scale factor structure cannot always be detected by examining
item-level means and standard deviations or scale
reliabilities. A factor analysis was necessary to detect the
violation of the unidimensional structure.
Consistent with past discussions of negative factors, the
emergence of the second factor only at the end of the
survey is interpreted as evidence of the effects of fatigue
and careless responding on negatively worded items.
Because negatively worded items are expected to require
more cognitive resources to process, participants who are
fatigued are expected to have more difficulty responding to
negatively worded items than to positively worded items.
In Study 2, all participants completed the Affective
Commitment Scale at the end of a lengthy survey. Therefore,
all participants in Study 2 were expected to be fatigued.
However, for half of the participants, the Affective
Commitment Scale was modified so that there were no
negatively worded items (Modified condition); whereas the
other half responded to the original version of the Affective
Commitment Scale with 50% negatively worded items
(Original condition). Support for the AC-love/AC-joy
factor structure would be found if that structure emerged in
CFAs for both conditions. Support for the method factor
hypothesis would be found if the two-factor structure only
emerged in the CFA for the Original condition.
Study 2
In the second study, participants were randomly assigned to
either the original version of the Affective Commitment
Scale (with 50% negatively worded) or a modified version
in which all negatively worded items were edited to
become positively worded. For example, the item “I do not
feel like part of the family at this organization” was edited
to read, “I feel like part of the family at this organization”
(see Table 3). Thus, in the Original condition, the scale had
50% negatively worded items, and in the Modified condition,
the scale had 0% negatively worded items.
Participants and Procedure
Participants were 369 undergraduate psychology and
business students who participated in return for extra
credit. Mean age was 25 years (SD = 6.65). Again, the
sample was predominantly female (70.7%) and White
(71.3%). The study was conducted online. As part of an
unrelated study, participants read three vignettes describing
events in a fictitious organization, performed cognitive
filler tasks, and responded to self-report items regarding the
vignettes. Following this, the Affective Commitment Scale
was administered at the end of the survey for all participants.
As in Study 1, they were instructed to respond to the
items with their university as the referent organization (not
the organization described in the vignettes). On average,
the survey took approximately 30 min to complete.
Therefore, Study 2 also examined whether the results of
Study 1 would extend to a shorter time period and to a
different task.
Table 3 Modified Affective Commitment Scale items (Study 2)
Item 1
  Original wording (reverse worded): I do not feel a strong sense of belonging to my organization
  Modified wording: I feel a strong sense of belonging to my organization
Item 2
  Original wording (reverse worded): I do not feel “emotionally attached” to this organization
  Modified wording: I feel “emotionally attached” to this organization
Item 4
  Original wording (reverse worded): I do not feel like “part of the family” at this organization
  Modified wording: I feel like “part of the family” at this organization
Item 8
  Original wording (reverse worded): I think I could easily become as attached to another organization as I am to this one
  Modified wording: I think it would be hard to become as attached to another organization as I am to this one
Results
Means, standard deviations, and item intercorrelations for
the Original and Modified conditions are presented in
Table 4. In this study, Item 8 was identified as problematic
in both conditions due to low item-total correlations and
factor loadings. This finding is consistent with past work
indicating a history of low factor loadings. In a multisample
CFA, Dunham et al. (1994) found that Item 8
ranked lowest in factor strength in all five studies and that
the error component composed more than 90% of the
item’s total variance. Moreover, Culpepper (2000) found
that excluding the item improved the scale’s psychometric
properties in two samples and suggested that the item be
dropped. Therefore, Item 8 was removed from both conditions
and from the subsequent studies reported here. The
7-item Affective Commitment Scale had acceptable internal
consistency in both the Modified (α = .89) and Original
(α = .83) conditions.
The two-group configural invariance analysis performed
in Study 1 was repeated in Study 2 and produced borderline
fit statistics (χ² = 96.05, df = 28, RMSEA = .11, NNFI =
.94, CFI = .96), indicating that the two groups may have had
different factor structures. Therefore, CFAs were conducted
on the Modified and Original conditions separately.
In the Modified condition (in which there were 0%
negatively worded items), the unidimensional model
showed borderline fit (χ² = 44.84, df = 14, RMSEA =
.11, NNFI = .96, CFI = .97). It was interesting to note
that, as in Study 1, modifying the model by allowing the
error terms (Θδ) of items 5 and 6 to covary significantly improved
model fit, producing acceptable fit by all statistics
(χ² = 20.68, df = 13, RMSEA = .06, NNFI = .99,
CFI = .99). When the two-factor model was fitted to these
data, the fit was not significantly different from the unidimensional
model (χ² = 19.98, df = 12, RMSEA = .06,
NNFI = .99, CFI = .99), and the correlation between the
two factors was Φ = .98. Given the statistically equivalent
fit of the two models and the high correlation between
factors, the unidimensional model was accepted as the
more parsimonious representation of these data.
In the Original condition (50% negatively worded
items), the unidimensional model fit the data unsatisfactorily
by all statistics other than the CFI (χ² = 51.21,
df = 14, RMSEA = .13, NNFI = .92, CFI = .95). For
consistency, the model was also tested with the uniquenesses
of items 5 and 6 allowed to covary, which significantly
improved the model (χ² = 32.35, df = 13,
RMSEA = .09, NNFI = .95, CFI = .97; Δχ² = 18.86,
Δdf = 1). However, fitting the two-factor AC-love/
AC-joy model to the data produced a further statistically
significant improvement in model fit (χ² = 12.30, df = 12,
RMSEA = .01, NNFI = 1.00, CFI = 1.00; Δχ² = 20.05,
Δdf = 1), with a correlation of Φ = .78 between the two
factors. Thus, consistent with the “method factor”
hypothesis, the two-factor model fit the data acceptably
only when negatively worded items were present in the
scale.
Table 4 Study 2 item means, standard deviations, and intercorrelations for the original and modified conditions
Item  Original M  Original SD  Modified M  Modified SD  1      2      3      4      5      6      7
1     3.09        1.10         2.95        .98          –      .70**  .74**  .68**  .44**  .50**  .52**
2     2.81        1.15         2.66        .96          .63**  –      .73**  .69**  .28**  .42**  .55**
3     2.90        .99          2.89        .97          .42**  .55**  –      .72**  .38**  .58**  .57**
4     2.91        1.01         2.68        .94          .56**  .58**  .50**  –      .45**  .52**  .54**
5     3.66        .83          3.55        .90          .34**  .34**  .40**  .40**  –      .57**  .31**
6     3.51        .89          3.43        .96          .31**  .31**  .50**  .35**  .48**  –      .44**
7     2.48        .93          2.42        .99          .26**  .28**  .35**  .25**  .30**  .31**  –
Note: Correlations for the Original condition are below the diagonal; correlations for the modified condition (no negatively worded items present)
are above the diagonal. Listwise Ns = 169 (original condition) and 191 (modified condition). Ns for means and standard deviations range from
364 to 368
** p < .01, * p < .05
Discussion
In Study 2, the two-factor AC-love/AC-joy model fit significantly
better than the unidimensional model only when
negatively worded items were present. Thus, as in Study 1,
the hypothesis that the second factor is a method factor
related to negatively worded items was supported. It was
also found that by modifying the item content such that all
items were positively worded, the factor structure remained
unidimensional and achieved good fit when one error
covariance was freed (items 5 and 6). Note that the suggestion
to free this error covariance is also consistent with
the results of Study 1, implying that it may be generalizable—
at least for students indicating their affective commitment
to the university. However, even when no
negatively worded items were present, the fit of the
unidimensional model was not as good as it had been in
Study 1. This suggests that participant fatigue may have
had some effect on the Affective Commitment Scale’s
factor structure even when no negatively worded items
were present.
Study 3
At the request of the anonymous reviewers, Study 3 was
conducted to address two weaknesses of Studies 1 and 2.
First, in each of the past studies, all of the negatively
worded items loaded onto the AC-love factor and all of the
positively worded items loaded onto the AC-joy factor. In
Study 3, word direction and factor were disentangled by
changing the item wordings such that some AC-love items
and some AC-joy items were negatively worded. If a two-factor
Joy/Love structure emerges regardless of wording
direction, evidence would be provided supporting this two-dimensional
theoretical structure. However, if the structure
that emerges shows all of the negatively worded items on
one factor and the positively worded items on the other,
evidence would be provided that the two-factor solution is
caused by negatively worded items when participants are
fatigued.
The second addition to Study 3 was a manipulation
check. Although past research has established that vigilance
tasks are cognitively fatiguing (Smit et al. 2003;
Warm et al. 2010), Study 3 included a cognitive workload
scale to verify that the experimental task did increase
mental fatigue. This check could provide further evidence
that fatigue plays a role in the emergence of the two-factor
solution.
Finally, in order to provide a more specific test of the
fatigue hypothesis, a repeated measures design was
employed in which each respondent completed the Affective
Commitment Scale twice—once at the beginning and
once at the end of the survey. Because the same participants
were included in both the Early and Late conditions,
the factor structure is expected to be stable unless it is
affected by the fatiguing process of taking the survey, as
hypothesized.
Participants and Procedure
Participants were 118 undergraduate students recruited
from the psychology subject pool in return for extra credit.
Their average age was 25.59 years, and similar to Studies 1
and 2, the sample was largely female (73.4%) and White
(66.1%). Participants completed an online survey. First,
they rated their pre-test cognitive load and fatigue. Next,
they completed the modified Affective Commitment Scale
described below (Time 1). Following this, they completed
several self-report measures for another study and completed
a 30-item X-ray screening vigilance task. Also
during this vigilance task, participants were required to
remember an 8-digit number—a manipulation which has
successfully increased cognitive load in past work (Gilbert
and Hixon 1991). Finally, participants re-rated cognitive
workload and fatigue and re-took the modified Affective
Commitment Scale (Time 2). In total, this experiment took
approximately 45 min to complete.
Measures
Modified Affective Commitment Scale
For this study, participants were asked to respond regarding
affective commitment to their university. Furthermore, the
Affective Commitment Scale was modified to un-confound
wording direction with factor. Thus, 50% of the AC-love
items were positively worded and 50% were negatively
worded. The same was done for the AC-joy items. The
modified item wordings are displayed in Table 5.
Table 5 Study 3 items
Item  Factor  Direction  Item stem
1     Love    Positive   I feel a strong sense of belonging to [my school]
2     Love    Negative   I do not feel emotionally attached to [my school]
3     Joy     Positive   [My school] has a great deal of personal meaning for me
4     Love    Negative   I do not feel like ‘part of the family’ at [my school]
5     Joy     Positive   I would be happy to spend the rest of my college career at [my school]
6     Joy     Negative   I do not enjoy discussing [my school] with people outside of it
7     Joy     Positive   I really feel [my school’s] problems are my own
Note: Here, “[my school]” is substituted for the university’s name
Cognitive Workload and Mental Fatigue
Participants’ cognitive load was measured at Time 1 and
Time 2 using 5 items from the NASA Task Load Index
(NASA-TLX; Hart and Staveland 1988). This scale has
been employed in over 500 studies to date (Hart 2006).
Each item assesses participants’ strain in one of the following
areas: mental, physical, temporal (time pressure),
effort, and frustration. The response scales range from 1
(low) to 10 (high). In addition, participants’ overall mental
fatigue was measured pre- and post-task with a single item
developed for the purpose of this study: “How mentally
exhausted do you feel at this point in the experiment?”
The five-point response scale ranged from 1 (not at all) to 5
(extremely).
Results
Descriptive statistics are displayed in Table 6. In this
study, alpha was acceptable at Time 1 (α = .86), but not at
Time 2 (α = .65). Visual inspection of the item-total correlations
indicated that the even-numbered items (i.e., the
negatively worded items) had lower item-total correlations
than did the odd-numbered items, and even reversed
direction at Time 2. This suggests that many participants
may have had difficulty responding to the negatively
worded items at Time 2.
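As a rough illustration of how the drop in alpha follows from the inter-item correlations, the sketch below computes a standardized alpha from the Table 6 correlations. This is an assumption-laden check rather than the analysis used in the study: the reported alphas were presumably computed from raw item scores, and the function name `standardized_alpha` is introduced here only for illustration.

```python
def standardized_alpha(correlations, n_items):
    """Standardized Cronbach's alpha from the unique inter-item correlations.

    alpha = k * r_bar / (1 + (k - 1) * r_bar), where r_bar is the mean
    inter-item correlation and k is the number of items.
    """
    expected = n_items * (n_items - 1) // 2
    if len(correlations) != expected:
        raise ValueError(f"expected {expected} correlations, got {len(correlations)}")
    mean_r = sum(correlations) / len(correlations)
    return n_items * mean_r / (1 + (n_items - 1) * mean_r)


# Time 1 correlations: below the diagonal of Table 6, read row by row.
time1 = [
    .55,
    .77, .62,
    .70, .67, .64,
    .60, .49, .62, .49,
    .48, .48, .51, .52, .49,
    .16, .12, .34, .09, .25, .07,
]

# Time 2 correlations: above the diagonal of Table 6, read row by row.
time2 = [
    -.22, .85, -.20, .72, -.16, .53,
    -.25, .69, -.12, .63, .06,
    -.21, .73, -.16, .56,
    -.03, .57, .05,
    -.19, .53,
    .07,
]

alpha_t1 = standardized_alpha(time1, n_items=7)
alpha_t2 = standardized_alpha(time2, n_items=7)
print(f"Time 1 standardized alpha: {alpha_t1:.2f}")
print(f"Time 2 standardized alpha: {alpha_t2:.2f}")
```

Under these assumptions the standardized values land close to the reported alphas, and the negative correlations between positively and negatively worded items at Time 2 are what pull the mean inter-item correlation down.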
Manipulation checks indicated that the X-ray screening
task and accompanying self-report measures did significantly
increase participants’ cognitive load. Dependent
samples t tests were significant for each of the NASA-TLX
items, the NASA-TLX overall, and self-rated overall
mental fatigue, as shown in Table 7. This manipulation
check suggests that participants were significantly more
mentally fatigued at Time 2 than at Time 1.
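The effect sizes in Table 7 can be approximated from the reported means and standard deviations. The sketch below assumes that Cohen's d was computed as the mean difference divided by the root mean square of the two standard deviations; the article does not state which formula was used, and not every row of Table 7 is reproduced exactly under this assumption. The function name is hypothetical.

```python
import math


def cohens_d_avg_variance(mean_1, sd_1, mean_2, sd_2):
    """Cohen's d using the square root of the averaged variances as the
    standardizer (an assumed formula, not confirmed by the article)."""
    pooled_sd = math.sqrt((sd_1 ** 2 + sd_2 ** 2) / 2)
    return (mean_2 - mean_1) / pooled_sd


# Values taken from Table 7 (Time 1 M, SD; Time 2 M, SD).
d_mental = cohens_d_avg_variance(2.00, 2.03, 6.41, 2.44)
d_fatigue = cohens_d_avg_variance(1.38, 0.85, 2.39, 1.03)
print(f"NASA-TLX Mental d:  {d_mental:.2f}")
print(f"Fatigue Overall d:  {d_fatigue:.2f}")
```

For a within-subject design, other standardizers (for example, the standard deviation of the difference scores) are also common, so these values should be read only as a plausibility check.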
Next, model fitting was conducted. Group contrasts were
conducted according to Vandenberg and Lance’s (2000)
suggestions for testing invariance across time. In this model,
Time 1 and Time 2 were modeled simultaneously, and the
uniquenesses for each item across measurement times were
permitted to covary (e.g., the uniqueness of item 1 at Time 1
was permitted to covary with the uniqueness of item 1 at
Time 2). Because this study did not concern means and
intercepts, these parameters were not scaled or estimated (as
was also the case in Studies 1 and 2). Consistent with the
other studies, between-group contrasts indicated that the
unidimensional model fit poorly across Time 1 and Time 2
(χ² = 356.82, df = 69, RMSEA = .19, NNFI = .86,
CFI = .89). Therefore, the Time 1 and Time 2 data were
analyzed separately.
Table 6 Study 3 item means, standard deviations, and intercorrelations for Time 1 and Time 2
      Time 1       Time 2
Item  M     SD     M     SD     1      2      3      4      5      6      7
1     3.18  1.15   3.07  1.24   –      -.22*  .85**  -.20*  .72**  -.16   .53**
2     3.08  1.10   2.81  1.21   .55**  –      -.25*  .69**  -.12   .63**  .06
3     3.12  1.05   3.00  1.25   .77**  .62**  –      -.21*  .73**  -.16   .56**
4     3.19  1.13   2.70  1.23   .70**  .67**  .64**  –      -.03   .57**  .05
5     3.35  1.19   3.34  1.26   .60**  .49**  .62**  .49**  –      -.19*  .53**
6     3.58  1.02   2.31  1.06   .48**  .48**  .51**  .52**  .49**  –      .07
7     2.31  .95    2.41  1.04   .16    .12    .34**  .09    .25**  .07    –
Note: Correlations for Time 1 are below the diagonal; correlations for Time 2 are above the diagonal. Listwise Ns = 118 for Time 1 and 109 for
Time 2
** p < .01, * p < .05
Table 7 Study 3 fatigue manipulation check
                      Time 1       Time 2
Measure               M     SD     M     SD     Cohen’s d  t
NASA-TLX Mental       2.00  2.03   6.41  2.44   1.96       19.10*
NASA-TLX Physical     1.76  1.66   2.86  2.24   .56        5.19*
NASA-TLX Temporal     1.89  1.93   5.31  2.71   1.46       13.26*
NASA-TLX Effort       1.86  1.93   5.46  2.28   1.71       16.49*
NASA-TLX Frustration  1.81  1.74   4.59  2.55   1.18       10.64*
NASA-TLX Total        1.86  1.53   4.90  1.91   1.76       16.81*
Fatigue Overall       1.38  .85    2.39  1.03   1.07       10.37*
* p < .05
At Time 1, the one-factor model fit somewhat poorly
(χ² = 40.03, df = 14, RMSEA = .12, NNFI = .93, CFI =
.95), which is inconsistent with the results of the previous
studies. The fit of the two-factor AC-joy/AC-love model was
statistically equivalent (χ² = 38.22, df = 13, RMSEA =
.13, NNFI = .93, CFI = .96), with a correlation of Φ = .96
between the two factors. Next, the two-factor model with the
negatively worded items loading onto a second factor was
tested. The correlation between factors was Φ = .88, but
unlike in Study 1, this model fit the data significantly better
than the unidimensional model at Time 1 (χ² = 27.95,
df = 13, RMSEA = .10, NNFI = .96, CFI = .97; Δχ² =
12.08, Δdf = 1), indicating that the factor structure was
significantly affected by item direction even early in the
study.
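The nested-model comparisons reported above rest on the chi-square difference test. A minimal sketch of that arithmetic follows, using only the Python standard library; the helper names `chi2_sf` and `chi2_difference_test` are hypothetical, and the tail probability uses the closed-form expressions that hold for integer degrees of freedom rather than whatever routine the original software applied.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variate with a
    positive integer number of degrees of freedom."""
    if x < 0 or df < 1:
        raise ValueError("x must be >= 0 and df must be a positive integer")
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{j=0}^{df/2 - 1} (x/2)^j / j!
        return math.exp(-half) * sum(
            half ** j / math.factorial(j) for j in range(df // 2)
        )
    # Odd df: erfc(sqrt(x/2)) + exp(-x/2) * sum_{j=1}^{(df-1)/2} (x/2)^(j-1/2) / Gamma(j+1/2)
    tail = math.erfc(math.sqrt(half))
    tail += math.exp(-half) * sum(
        half ** (j - 0.5) / math.gamma(j + 0.5)
        for j in range(1, (df - 1) // 2 + 1)
    )
    return tail


def chi2_difference_test(chi2_restricted, df_restricted, chi2_full, df_full):
    """Return (delta chi-square, delta df, p) for two nested models."""
    delta_chi2 = chi2_restricted - chi2_full
    delta_df = df_restricted - df_full
    return delta_chi2, delta_df, chi2_sf(delta_chi2, delta_df)


# Study 3, Time 1: one-factor model vs. wording-direction two-factor model.
delta, ddf, p_direction = chi2_difference_test(40.03, 14, 27.95, 13)
print(f"Wording direction: delta chi2 = {delta:.2f}, delta df = {ddf}, p = {p_direction:.4f}")

# Study 3, Time 1: one-factor model vs. AC-joy/AC-love two-factor model.
delta_c, ddf_c, p_content = chi2_difference_test(40.03, 14, 38.22, 13)
print(f"Joy/love content:  delta chi2 = {delta_c:.2f}, delta df = {ddf_c}, p = {p_content:.4f}")
```

With one degree of freedom, the wording-direction model improves fit significantly while the AC-joy/AC-love model does not, which matches the pattern described in the text.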
In order to further explore this unexpected result, a
median split was performed on the overall fatigue measure
at Time 1 (median fatigue = 1.00). If the one-factor
structure fit significantly better for individuals reporting
lower fatigue than for those reporting higher fatigue at
Time 1, this would provide support for the notion that
fatigue is related to the emergence of a two-factor
structure.
For the lower fatigue group, the unidimensional model
fit well (χ² = 24.72, df = 14, RMSEA = .09, NNFI =
.96, CFI = .97). However, this was not the case for the
higher fatigue group (χ² = 31.47, df = 14, RMSEA =
.23, NNFI = .70, CFI = .80). This indicates that fatigue
may have produced the relatively poor model fit for Time
1, in that some participants may have entered the study
sufficiently fatigued by prior events to affect the observed
factor structure of the Affective Commitment Scale.
In analyzing the Time 2 data, median splits produced
inadmissible results; therefore, all participants were combined
into a single group. Recall that manipulation checks
indicated a significant increase in mean participant fatigue
between Time 1 and Time 2. At Time 2, the one-factor
model fit poorly, with items 2, 4, and 6 producing negative
factor loadings (χ² = 168.28, df = 14, RMSEA = .30,
NNFI = .55, CFI = .70). The two-factor AC-joy/AC-love
model also fit poorly (χ² = 166.05, df = 13, RMSEA =
.31, NNFI = .52, CFI = .70), and the two factors correlated
highly at Φ = .95. However, the two-factor model
according to item direction showed significantly improved
fit (χ² = 24.38, df = 13, RMSEA = .09, NNFI = .96,
CFI = .98, Δχ² = 143.90, p < .01). Unlike in the other
studies, the two factors correlated negatively at Φ = -.29.
A visual inspection of the Time 2 inter-item correlations
for participants below the median on mental fatigue compared
to those above the median on mental fatigue revealed
that for more-fatigued participants, items with positive
wording were significantly and negatively correlated with
the negatively worded items. For less-fatigued participants,
the inter-item correlations were lower across wording
direction than within wording direction; however, the
correlations remained positive. These results provide further
evidence that the direction of item wording combined
with fatigue produces significant effects on the observed
factor structure of the Affective Commitment Scale.
Discussion
The results of Study 3 again support the assertion that the
two-factor structure sometimes observed in the Affective
Commitment Scale is the result of the combination of
cognitive fatigue and negatively worded items. In this
study, AC-joy/AC-love factors were crossed with item
direction such that half of the items composing each theoretical
factor were negatively worded. Negatively worded
items formed their own factor, suggesting that item
wording, rather than content, drove the emergence of the
two-factor model.
This study also included a manipulation check to bolster
the assertion that participant fatigue affects the Affective
Commitment Scale factor structure. The same participants
completed the Affective Commitment Scale at both Time 1
and Time 2; thus, any differences in the observed factor
structure can be attributed to the experiment itself. The
manipulation check indicated that the experiment did significantly
increase participant fatigue, providing further
evidence in favor of the argument that the two-factor
solution is methodological and is produced by fatigue
combined with negatively worded items.
Finally, I note the decrease in alpha seen at Time 2, as
well as the negative inter-item correlations for Time 2
displayed in Table 6. While the previous studies herein and
past work have demonstrated effects on negatively worded
items’ psychometric properties (compared to positively
worded versions of the same items), past results were less
extreme than the effects observed in this study (e.g.,
Benson and Hocevar 1985; Schriesheim and Eisenbach
1995). It is possible that these extreme results were produced
by the combination of a demanding vigilance task
with the modified item directions manipulated in this study;
however, future research is needed to confirm this notion.
Study 4
Because the participants in Studies 1–3 were college students
reporting on their affective commitment to their
universities, Study 4 was conducted to address the issue of
generalizability to a broader sample of employees. Like
Study 3, Study 4 used a within-subject design in which
employees took the scale twice in order to strengthen the
inference that changes in the factor structure are associated
with the process of taking the study itself. In addition, this
study examined whether the effects of fatigue on factor
structure would occur in a shorter time period than those
examined in Studies 1 and 2.
Participants and Procedure
The sample consisted of 313 full-time working adults
recruited via a market research company, http://Zoomerang.com.
This company recruits individuals from a wide
variety of occupations to complete surveys in return for
compensation at their discretion. The sample was 49.5%
male and 50.5% female. The sample was predominantly
White (86%), and also included Black (6%), Hispanic
(3%), and Asian (4%) participants. Less than 1% of participants
reported being Native American or biracial.
Their average age was 43 years, and all participants confirmed
that they worked at least 31 h per week. Participants
self-reported their job title; examples include police officer,
teacher, sales clerk, realtor, custodian, office manager, and
programmer.
Results
Descriptive statistics are displayed in Table 8. Alphas were
again acceptable for both Time 1 (α = .94) and Time 2
(α = .86), although internal consistency was somewhat
stronger at Time 1. Configural invariance tests were conducted
using the same longitudinal invariance procedure
described in Study 3.
For the unidimensional model, the goodness-of-fit statistics
provided somewhat conflicting results (χ² = 373.20,
df = 69, RMSEA = .12, NNFI = .96, CFI = .97). Thus,
while the NNFI and CFI were acceptable, the χ²/df ratio
(5.41) and the RMSEA were outside of suggested acceptable
ranges. Each time point was then analyzed
separately.
At Time 1, the unidimensional model fit well according
to the NNFI and CFI statistics, but relatively poorly
according to the χ² and RMSEA statistics (χ² = 72.45,
df = 14, RMSEA = .12, NNFI = .97, CFI = .98). Modification
to allow the uniquenesses of items 1 and 4 to covary
produced a better-fitting model (χ² = 37.40, df = 13,
RMSEA = .08, NNFI = .99, CFI = .99). The two-factor
model (also with the uniquenesses of items 1 and 4 allowed
to covary) fit equally well (χ² = 37.28, df = 12,
RMSEA = .08, NNFI = .99, CFI = .99; Δχ² = 0.12,
Δdf = 1), with a correlation between the factors of
Φ = .99.
At Time 2, the uniquenesses of items 1 and 4 were again
allowed to covary for the sake of consistency. At Time 2,
the unidimensional model fit poorly (χ² = 120.45,
df = 13, RMSEA = .16, NNFI = .88, CFI = .93), while
the two-factor model fit significantly better (χ² = 17.90,
df = 12, RMSEA = .04, NNFI = .99, CFI = 1.00;
Δχ² = 102.55, Δdf = 1), with a correlation of Φ = .66
between the factors. Thus, as in previous studies, the two-factor
model fit the data significantly better when participants
were fatigued.
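The RMSEA values for the single time points in this study can be approximated from the reported χ², df, and sample size. The sketch below assumes the common single-group point estimate, sqrt(max(χ² − df, 0) / (df × (N − 1))), with N = 313; the article does not state the exact formula used by the software, and multiple-group models typically apply an adjusted version, so this is an illustration only. The function name `rmsea` is introduced here for convenience.

```python
import math


def rmsea(chi2, df, n):
    """Single-group RMSEA point estimate (assumed formula):
    sqrt(max(chi2 - df, 0) / (df * (n - 1)))."""
    if df <= 0 or n <= 1:
        raise ValueError("df must be positive and n must exceed 1")
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


N_STUDY_4 = 313  # listwise N reported for both time points

# (label, chi-square, df) as reported for Study 4
models = [
    ("Time 1, one factor", 72.45, 14),
    ("Time 1, one factor, items 1 and 4 uniquenesses covary", 37.40, 13),
    ("Time 1, two factors", 37.28, 12),
    ("Time 2, one factor", 120.45, 13),
    ("Time 2, two factors", 17.90, 12),
]

for label, chi2, df in models:
    print(f"{label}: RMSEA = {rmsea(chi2, df, N_STUDY_4):.2f}")
```

Because the estimate depends on χ² − df relative to sample size, the large drop in χ² for the two-factor model at Time 2 translates directly into the much smaller RMSEA.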
Discussion
In Study 4, the effects of fatigue and negatively worded
items on the factor structure of the Affective Commitment
Scale were examined in a sample of full-time employees.
Again, it was found that the two-factor model fit the data
well only when participants were fatigued. Thus, the results
of Study 4 replicated those of Studies 1–3 in an employee
sample.
Furthermore, the changes in factor structure were
observed to occur in an even shorter time period—the
survey took participants only 15 min to complete. Consistent
with the results of Study 3, this again suggests that
relatively little time is required to produce enough fatigue
to adversely affect the Affective Commitment Scale’s
factor structure.
Table 8 Study 4 item means, standard deviations, and intercorrelations for Time 1 and Time 2
      Time 1       Time 2
Item  M     SD     M     SD     1      2      3      4      5      6      7
1     3.67  1.00   3.30  1.13   –      .64**  .40**  .61**  .37**  .36**  .21**
2     3.43  1.10   3.23  1.15   .76**  –      .55**  .61**  .49**  .45**  .29**
3     3.45  1.10   3.36  1.06   .76**  .85**  –      .42**  .69**  .66**  .55**
4     3.52  1.05   3.32  1.08   .80**  .74**  .76**  –      .39**  .37**  .25**
5     3.46  1.21   3.38  1.20   .69**  .70**  .74**  .61**  –      .68**  .46**
6     3.54  1.07   3.47  1.02   .69**  .67**  .71**  .66**  .69**  –      .51**
7     3.16  1.10   3.08  1.08   .61**  .63**  .67**  .61**  .61**  .59**  –
Note: Correlations for Time 1 are below the diagonal; correlations for Time 2 are above the diagonal. Listwise Ns = 313 for both conditions
** p < .01
Study 5
The previous studies provided evidence that the factor
structure of the Affective Commitment Scale suffers from
participant fatigue, especially when negatively worded
items are included. However, there may be circumstances
where it is necessary to administer the Affective Commitment
Scale to fatigued respondents. Study 2 suggested
that modifying the scale content to remove negatively
worded phrasing may preserve the intended unidimensional
structure. Nevertheless, a disadvantage of modifying item
content is that the original Affective Commitment Scale
has been validated in many previous studies. Changing
item wording may affect the scale’s relationships with
predictors, correlates, and outcomes. Therefore, it may be
useful to identify strategies to retain the Affective Commitment
Scale’s construct validity without modifying the
items themselves. The author proposes that one potential
strategy is to highlight the negatively worded items using
bold, italics, or all capitals to emphasize words such as
“not.” If fatigue affects the Affective Commitment Scale
by decreasing motivation and attention, the highlights may
combat fatigue by capturing and directing respondents’
attention to negatively worded items.
Participants and Procedure
Participants were 262 adults employed at least 15 h per
week. These participants were also enrolled in the university
and were recruited via the psychology subject pool.
Average age was 24 years, and the sample was 63% White
and 77% female. Most participants had job tenure of
2–4 years, with a maximum of 17 years, and modal annual
income was between $20,000 and $50,000 (indicated
categorically). Fifteen participants were excluded because
they either did not provide job information or because they
indicated they worked fewer than 15 h per week, and 28
participants were excluded because they took less than
15 min to complete the entire survey battery, which consisted
of nearly 200 items. In this study, participants were
instructed to respond to the Affective Commitment Scale in
terms of their commitment to their work organizations (not
their university). As in Study 2, the average participant
completed the study in approximately 30 min.
Participants completed a 200-item survey battery online
regarding their attitudes toward their place of employment,
boss, coworker, and several other attitude objects. At the
end of this survey, participants completed either the
emphasized or unemphasized version of the 7-item
Affective Commitment Scale.
Results
Descriptive statistics are displayed in Table 9. Alphas were
acceptable for the Emphasized (α = .86) and Unemphasized
(α = .82) conditions. The between-group configural
invariance analysis indicated poor overall fit for the unidimensional
model across conditions (χ² = 217.57, df =
28, RMSEA = .23, NNFI = .82, CFI = .88). Therefore,
the conditions were examined independently.
In this study, no scale uniquenesses were freed to covary
in either the Unemphasized or Emphasized condition. In
the Unemphasized condition (no special attention drawn to
the negatively worded items), the one-factor model fit
poorly (χ² = 74.42, df = 14, RMSEA = .21, NNFI =
.82, CFI = .88). As expected, the two-factor AC-love/AC-joy
model fit significantly better (χ² = 13.44, df = 13,
RMSEA = .02, NNFI = .99, CFI = 1.00, p < .01). The
correlation between the two factors was Φ = .63.
Table 9 Study 5 item means, standard deviations, and intercorrelations for the unemphasized and emphasized conditions
      Unemphasized  Emphasized
Item  M     SD      M     SD     1      2      3      4      5      6      7
1     3.39  .91     3.44  1.21   –      .68**  .52**  .70**  .32**  .50**  .29**
2     3.22  1.07    3.12  1.26   .67**  –      .58**  .64**  .36**  .42**  .43**
3     3.17  .99     3.09  1.14   .39**  .49**  –      .36**  .57**  .64**  .58**
4     3.35  1.00    3.53  1.22   .53**  .51**  .34**  –      .27**  .42**  .28**
5     2.25  1.21    2.18  1.32   .13    .19    .51**  .20    –      .52**  .53**
6     3.35  1.06    3.46  1.09   .36**  .35**  .65**  .30**  .50**  –      .50**
7     2.60  .89     2.52  1.14   .28**  .38**  .52**  .24*   .52**  .45**  –
Note: Correlations for the unemphasized condition are below the diagonal; correlations for the emphasized condition (bold, underline, capitals)
are above the diagonal. Listwise Ns = 94 (unemphasized condition) and 124 (emphasized condition)
** p < .01, * p < .05
In the Emphasized condition (bold, underline, and capital
letters used to highlight negatively worded items),
similar results were found. Fit was poor for the unidimensional
model (χ² = 143.15, df = 14, RMSEA = .25,
NNFI = .82, CFI = .88), and results were significantly
improved for the two-factor model (χ² = 24.34, df = 13,
RMSEA = .08, NNFI = .98, CFI = .99, p < .01). Thus,
contrary to expectations, emphasizing the negatively worded
items failed to remedy the effect of such items combined
with fatigue. In this condition, the correlation
between the two factors was slightly higher at Φ = .69.
Discussion
The results of Study 5 suggest that simply emphasizing the
negatively worded items by bolding, underlining, and
capitalizing the word “not” in those items was insufficient
to overcome the effects of participants’ inattention and/or
fatigue when the AC scale was administered at the end of a
long survey. The two-factor model fit significantly better
than the one factor model in both conditions. These results
imply that careless responding on these items may not be a
matter of directing cognitive resources; it may be a matter
of possessing cognitive resources. When respondents are
cognitively fatigued, they may simply lack sufficient
mental resources to correctly process negatively worded
items—even when their attention is directed to those items.
Combined Analyses
When considered separately, the results of Studies 1, 2, and
4 each suggest that the two-factor solution emerges when
participants are fatigued and when negatively worded items
are present. For an overall test of that hypothesis, multiple-group
contrast analyses were performed across these
studies to verify that the factor structures found in each
study are consistent. If they are, greater evidence is provided
that the same process occurred in each of these
studies. (Study 3 was excluded due to its use of different
item content and Study 5 was excluded due to its finding
that emphasized item wording is ineffective.)
Analysis 1 included the samples from the Early conditions
in Studies 1 and 4 as well as the Modified condition
from Study 2. For the unidimensional model, the χ² and
RMSEA indicated somewhat poor fit, while the NNFI and
CFI indicated good fit (χ² = 101.68, df = 42, RMSEA =
.11, NNFI = .97, CFI = .98). However, consistent with
the results of the individual studies, allowing the error
covariance of items 5 and 6 to be free significantly
improved the model fit (χ² = 58.86, df = 39, RMSEA =
.06, NNFI = .99, CFI = .99; Δχ² = 42.82, Δdf = 3). This
goodness-of-fit indicates that the unidimensional factor
structure (with the error covariance of items 5 and 6 freed)
fit the data well for each of these three samples.
Analysis 2 included the Late conditions from Studies 1
and 4 and the Original condition from Study 2. For consistency,
the error covariance of items 5 and 6 remained
free. The unidimensional model fit poorly according to all
fit statistics (χ² = 202.48, df = 39, RMSEA = .19,
NNFI = .84, CFI = .90). However, the two-factor model
fit well (χ² = 63.01, df = 36, RMSEA = .08, NNFI =
.96, CFI = .98), suggesting that the two-factor model
adequately fit the data from all three samples.
General Discussion
The results of these five studies point to a consistent pattern—
the second factor on the Affective Commitment
Scale emerged when negatively worded items were present
and when the situation was likely to produce careless
responding and fatigue. Similar results were found for both
between- and within-subjects designs and for both student
and employee samples. Furthermore, the items loaded
according to wording direction regardless of which items
were negatively worded.
These results largely disconfirm the notion that AC-joy
and AC-love are conceptually meaningful affective commitment
constructs. Were they to be meaningful, they
would have been expected to emerge consistently,
regardless of when the scale was administered. Furthermore,
the results of Study 3, which showed that positively
and negatively worded AC items loaded according to
wording direction rather than content, un-confounded
hypothesized content factor and wording direction. This
provides further evidence that the two-factor solution is
affected by the presence of negatively worded items, particularly
when participants are fatigued.
To the author’s knowledge, this research is the first to
use experimental manipulations of participant fatigue in
order to test the meaningfulness of negatively worded
factors. Drawing on the hypothesis that administering
scales to fatigued participants may produce negatively
worded factors due to these participants’ inability to correctly
process negatively worded items (e.g., Cordery and
Sevastos 1993; Schriesheim et al. 1991; Schriesheim and
Hill 1981), the present studies manipulated the degree to
which participants were fatigued when completing the
scale. The differential factor structures observed based on
fatigue conditions indicate that fatigue likely plays a significant
role in the emergence of negatively worded factors.
These findings support Meyer and Allen’s contention
that affective commitment should be viewed as unidimensional.
No changes to commitment theory appear to be
necessary; the primary implications of this study concern
administration of the Affective Commitment Scale in
future applications, as discussed further below. Because
affective commitment is an important organizational construct
linked to essential outcomes, use of construct-valid
measures is essential.
Implications
One major measurement implication of the studies presented
here is that researchers who administer the Affective
Commitment Scale at different times may, in essence, be
measuring different constructs. According to Vandenberg
and Lance (2000), comparing constructs with a different
number of latent factors makes little sense, and the change
in factor structure observed from the beginning of the
study to later in the study is tantamount to gamma
change—or a change in the essential nature of the construct
(Golembiewski et al. 1976; Millsap and Hartog 1988).
Thus, the coherence of the general body of past work using
this scale may be called into question, as different studies
administering the scale under different levels of respondent
fatigue may have measured different constructs and may
not be comparable.
These studies imply that users of the Affective Commitment
Scale should take steps to avoid the emergence of
the two-factor solution. Otherwise, as Woods (2006)
describes, use of the scale will require separate analyses for
each of the two factors. Dividing the scale into two separate
factors may complicate data analysis and interpretation,
particularly if the two factors exhibit differential
relationships with other variables—as can be the case with
factors defined by negatively worded items (e.g., Lai
1994). Moreover, if the second factor is produced because
participants lack cognitive resources to correctly process
and respond to those items, then this second factor is likely
to contain a large amount of error.
One suggestion is that the Affective Commitment Scale
should be administered early in the experiment or survey
battery. Doing so could ensure that respondents have sufficient
cognitive resources to correctly process the negatively
worded items. Furthermore, participants may be
more motivated to respond early, as opposed to later, in a
long survey. When respondents possess both motivation
and cognitive resources for careful responding, then negatively
worded factors may be prevented. Nevertheless, the
results of Study 3 indicate that, even at the beginning of a
study, respondents may be sufficiently fatigued to produce
alterations in the scale’s factor structure. Thus, administrators
may wish to provide time for mental rest prior to the
administration of the survey to allow respondents to
recover mental resources. Breaks throughout the course of
the study might also be provided.
Alternatively, the factor structure may be improved by
modifying the scale items to be positively worded. Several
other authors have cautioned or advised against the use of
negatively worded items (e.g., Herche and Engelland 1996;
Marsh 1996; Peterson et al. 2006; Roszkowski and Soven
2009; Schriesheim and Eisenbach 1995). In general, this
research suggests that factor structures are improved by
using positively worded items.
However, some scale administrators may wish to use
negatively worded items to reduce the effects of acquiescence
or other response sets. Although this suggestion has
been made in the past (e.g., Anastasi 1982; Nunnally
1978), there seems to be little empirical evidence suggesting
that response sets are a common problem or that
negatively worded items are the most effective solution to
that problem. For those concerned about the effects of
acquiescence bias, alternatives to the use of negatively
worded items exist.
To reduce the effects of response sets, Barnette (2000)
has suggested using directly worded item stems but varying
the direction of the response scale. Some items would have
the response scale: (1) Strongly Disagree to (5) Strongly
Agree, whereas others would have the response scale: (1)
Strongly Agree to (5) Strongly Disagree. This technique
was found to produce scores with higher internal consistency
and variability relative to either including negatively
worded items or doing nothing (Barnette 2000). Thus, the
results of Barnette’s study imply that future users of the
Affective Commitment Scale who are concerned about
response sets could modify item content such that all items
are written in the same direction and then reverse the
response scale for some items. Doing so may alleviate the
effects of acquiescence bias without producing a negatively
worded factor.
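The scoring step implied by this approach can be sketched as follows. The example assumes a 5-point response scale and uses hypothetical function names and made-up responses; it is not taken from Barnette (2000) or from the present studies. Items presented with a reversed response scale (1 = Strongly Agree to 5 = Strongly Disagree) are recoded so that higher scores always indicate stronger agreement.

```python
def score_response(raw, scale_reversed, scale_min=1, scale_max=5):
    """Recode one response so that higher always means stronger agreement.

    For items shown with a reversed response scale, the raw value is
    mirrored around the scale midpoint: (min + max) - raw.
    """
    if not scale_min <= raw <= scale_max:
        raise ValueError(f"response {raw} is outside {scale_min}-{scale_max}")
    return (scale_min + scale_max) - raw if scale_reversed else raw


def scale_mean(raw_responses, reversed_flags):
    """Mean scale score after recoding items with reversed response scales."""
    if len(raw_responses) != len(reversed_flags):
        raise ValueError("one reversal flag is needed per response")
    scored = [score_response(r, f) for r, f in zip(raw_responses, reversed_flags)]
    return sum(scored) / len(scored)


# Hypothetical respondent: every item stem is positively worded, but the
# second and fourth items were shown with the reversed response scale.
raw = [5, 1, 4, 2]
flags = [False, True, False, True]
print(scale_mean(raw, flags))  # recoded responses are 5, 5, 4, 4
```

Because every item stem remains directly worded, the recoding happens only at the scoring stage and respondents never need to process a negation.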
Another strategy was tested by Mathews and Shepherd
(2002). They found that the effects of negatively worded
items on the factor structure of another organizational
commitment measure, the British Organizational Commitment
Scale (Cook and Wall 1980), were lessened (but
not eliminated) by forewarning participants that negatively
worded items would be present. In the end, Mathews and
Shepherd concurred with others who have suggested that
even with forewarning, there may be little to gain from
retaining negatively worded items. If, as implied by Study
5, the issue is that fatigued participants lack cognitive
resources to devote to processing negatively worded items,
then forewarning cannot be expected to overcome these
effects.
Limitations and Suggestions for Future Research
One potential limitation of the studies presented here is
that, consistent with the demographics of the university
psychology subject pool, four of the five samples were
largely female, and all five were predominantly White.
Although I have no a priori reason to believe results would
differ by ethnicity and gender, it is recommended that the
results be replicated using additional samples and settings.
A second limitation of the present study is that while it
identified the effects that negatively worded items may
have on the Affective Commitment Scale’s factor structure,
the extent to which the emergence of two factors may
affect the scale’s criterion-related validity is unclear.
Because the second factor seems to result from careless or
improper processing of negatively worded items, it may
incorporate an increased amount of error. This error may
attenuate correlations with outcomes—unless the error
source also affects the measurement of the outcome. In
cases where the outcome measure is also self-report, contains
negatively worded items, and is administered when
participants are fatigued, the correlations may be inflated.
Thus, the emergence of a negative factor could increase the
chances of either Type 1 or Type 2 error, depending on the
outcome measure.
On the other hand, given that the two factors were
correlated in approximately the .60–.80 range, it is possible
that collapsing across the two factors may not significantly
affect correlations with outcomes. Consistent with this
idea, Carmines and Zeller (1979) examined the effects of
negatively worded factors on the criterion-related validity
of the Rosenberg Self Esteem Scale (Rosenberg 1965) and
found that the positively and negatively worded factors
were not differentially related to 16 criteria. However,
other researchers have found differential associations for
negatively worded factors (Bergami and Bagozzi 2000;
Carmeli et al. 2006). Therefore, the degree to which negatively
worded items affect the quality of the inferences
researchers seek to make regarding construct-level associations
remains an empirical question and an important
avenue for future research.
Furthermore, future research should explore potential
ways of preventing the effects that negatively worded items
and fatigue have on factor structures. One suggestion has
been to reword the scale items such that they are all positively
worded (Herche and Engelland 1996; Marsh 1996;
Peterson et al. 2006; Roszkowski and Soven 2009; Schriesheim
and Eisenbach 1995). However, the extent to
which rewording the items to be entirely positively worded
improves the scale’s validity is unclear. If some participants
are too fatigued to correctly interpret negatively
worded items, then perhaps they are also too fatigued to
respond accurately to positively worded items. This may
point to a larger, more general problem. Even when using
only positively worded items, the measurement of the
construct may still contain error introduced by fatigued
respondents, and this may affect observed relationships
with other variables. Future research should examine
techniques for preventing fatigue, identifying problematic
fatigued respondents, and minimizing the effects of fatigue
on both the factor structure and the validity of measures.
Conclusion
Affective commitment is a popular and important construct
in organizational research. The present studies resolved a
debate concerning one of the most widely used measures of
affective commitment. It was found that a unidimensional
factor structure is more construct-valid than the two-factor
AC-love/AC-joy factor structure, which appears to be
produced by the combination of negatively worded items
and participant fatigue. I hope that the suggestions provided
here will facilitate more accurate measurement of
essential organizational constructs. Continued development
and refinement of this and other measures is critical for the
advancement of organizational sciences and for the
improvement of our workplaces.
Acknowledgments The author would like to thank Therese Macan,
Wanyi Ma, and Tracy Mulderig for their assistance with this project.
References
Allen, N. J., & Meyer, J. P. (1990). The measurement and antecedents
of affective, continuance and normative commitment to the
organization. Journal of Occupational Psychology, 63, 1–18.
Anastasi, A. (1982). Psychological testing (5th ed.). New York:
Macmillan.
Atchley, P., & Chan, M. (2011). Potential benefits and costs of
concurrent task engagement to maintain vigilance: A driving
simulator investigation. Human Factors, 53, 3–12. doi:10.1177/0018720810391215.
Barnette, J. J. (2000). Effects of stem and Likert response option
reversals on survey internal consistency: If you feel the need,
there is a better alternative to using those negatively worded
stems. Educational and Psychological Measurement, 60,
361–370. doi:10.1177/00131640021970592.
Benson, J., & Hocevar, D. (1985). The impact of item phrasing on the
validity of attitude scales for elementary school children. Journal
of Educational Measurement, 22, 231–240.
Bentler, P. M. (1990). Comparative fit indexes in structural models.
Psychological Bulletin, 107, 238–246. doi:10.1037/0033-2909.107.2.238.
Bentler, P. M., & Bonett, D. G. (1980). Significance tests and
goodness of fit in the analysis of covariance structures.
Psychological Bulletin, 88, 588–606. doi:10.1037/0033-2909.88.3.588.
Bergami, M., & Bagozzi, R. P. (2000). Self-categorization, affective
commitment, and group self-esteem as distinct aspects of social
identity in the organization. British Journal of Social Psychology.
doi:10.1348/014466600164633.
Billiet, J. B., & McClendon, M. J. (2000). Modeling acquiescence in
measurement models for two balanced sets of items. Structural
Equation Modeling, 7, 608–628. doi:10.1207/S15328007SEM0704_5.
Browne, M. W., & Cudeck, R. (1992). Alternative ways of assessing
model fit. Sociological Methods and Research, 21, 230–258. doi:
10.1177/0049124192021002005.
Carmeli, A., Gilat, G., & Weisberg, J. (2006). Perceived external
prestige, organizational identification, and affective commitment:
A stakeholder approach. Corporate Reputation Review,
9(1), 92–104. doi:10.1057/palgrave.crr.1550014.
Carmines, E. G., & Zeller, R. A. (1979). Reliability and validity
assessment. Beverly Hills, CA: Sage.
Conrad, K. J., Wright, B. D., McKnight, P., McFall, M., Fontana, A.,
& Rosenheck, R. (2004). Comparing traditional and Rasch
analyses of the Mississippi PTSD Scale: Revealing limitations of
reverse-scored items. Journal of Applied Measurement, 5, 1–16.
Cook, J., & Wall, T. (1980). New work attitude measures of trust,
organizational commitment, and personal need non-fulfillment.
Journal of Occupational Psychology, 53, 39–52.
Cordery, J. L., & Sevastos, P. P. (1993). Responses to the original and
revised Job Diagnostic Survey: Is education a factor in responses
to negatively worded items? Journal of Applied Psychology, 78,
141–143.
Culpepper, R. A. (2000). A test of revised scales for the Meyer and
Allen (1991) three-component commitment construct. Educational
and Psychological Measurement, 60, 604–616. doi:
10.1177/00131640021970754.
Dunham, R. B., Grube, J. A., & Castaneda, M. B. (1994).
Organizational commitment: The utility of an integrative
definition. Journal of Applied Psychology, 79, 370–380. doi:
10.1037/0021-9010.79.3.370.
Gilbert, D. T., & Hixon, J. G. (1991). The trouble of thinking:
Activation and application of stereotypic beliefs. Journal of
Personality and Social Psychology, 60, 509–517. doi:
10.1037/0022-3514.60.4.509.
Golembiewski, R. T., Billingsley, K., & Yeager, S. (1976). Measuring
change and persistence in human affairs: Types of change
generated by OD designs. Journal of Applied Behavioral
Science, 12, 133–157.
Greenberger, E., Chen, C., Dmitrieva, J., & Farruggia, S. P. (2003). Item-wording
and the dimensionality of the Rosenberg Self-Esteem
Scale: Do they matter? Personality and Individual Differences, 35,
1241–1254. doi:10.1016/S0191-8869(02)00331-8.
Grier, R. A., Warm, J. S., Dember, W. N., Matthews, G., Galinsky, T.
L., Szalma, J. L., et al. (2003). The vigilance decrement reflects
limitations in effortful attention, not mindlessness. Human
Factors, 45, 349–359. doi:10.1518/hfes.45.3.349.27253.
Hackman, J. R., & Oldham, G. R. (1975). Development of the job
diagnostic survey. Journal of Applied Psychology, 60, 159–170.
doi:10.1037/h0076546.
Hart, S. G. (2006). NASA-Task Load Index (NASA-TLX); 20 years
later. Human Factors and Ergonomics Society Annual Meeting
Proceedings (pp. 904–908). Human Factors and Ergonomics
Society.
Hart, S. G., & Staveland, L. E. (1988). Development of NASA-TLX
(Task Load Index): Results of empirical and theoretical research.
In P. A. Hancock & N. Meshkati (Eds.), Human mental workload
(pp. 139–183). Amsterdam: North Holland Press.
Harvey, R. J., Billings, R. S., & Nilan, K. J. (1985). Confirmatory
factor analysis of the Job Diagnostic Survey: Good news and bad
news. Journal of Applied Psychology, 70, 461–468. doi:
10.1037/0021-9010.70.3.461.
Helton, W. S., & Russell, P. N. (2011). The effects of arousing
negative and neutral picture stimuli on target detection in a
vigilance task. Human Factors, 53, 132–141. doi:10.1177/0018720811401385.
Herche, J., & Engelland, B. (1996). Reversed-polarity items and scale
unidimensionality. Journal of the Academy of Marketing
Science, 24, 366–374. doi:10.1177/0092070396244007.
Idaszak, J. R., & Drasgow, F. (1987). A revision of the Job Diagnostic
Survey: Elimination of a measurement artifact. Journal of
Applied Psychology, 72, 69–74. doi:10.1037/0021-9010.72.1.69.
Jöreskog, K. G., & Sörbom, D. (2006). LISREL 8.8 for Windows
[Computer software]. Lincolnwood, IL: Scientific Software
International, Inc.
Lai, J. C. (1994). Differential predictive power of the positively
versus the negatively worded items of the Life Orientation Test.
Psychological Reports, 75, 1507–1515.
Magazine, S. L., Williams, L. J., & Williams, M. L. (1996). A
confirmatory factor analysis examination of reverse coding
effects in Meyer and Allen’s Affective and Continuance
Commitment Scales. Educational and Psychological Measurement,
56, 241–250. doi:10.1177/0013164496056002005.
Marsh, H. W. (1996). Positive and negative global self-esteem: A
substantively meaningful distinction or artifactors? Journal of
Personality and Social Psychology, 70, 810–819. doi:
10.1037/0022-3514.70.4.810.
Mathews, B. P., & Shepherd, J. L. (2002). Dimensionality of Cook
and Wall’s (1980) British Organizational Commitment Scale
revisited. Journal of Occupational and Organizational Psychology,
75, 369–375. doi:10.1348/096317902320369767.
Melnick, S. A., & Gable, R. K. (1990). The use of negative item
stems: A cautionary note. Educational Research Quarterly, 14,
31–36.
Merritt, S. M., & Ilgen, D. R. (2008). Not all trust is created equal:
Disposition and history-based trust in human-automation interactions.
Human Factors, 50, 194–210. doi:10.1518/001872008X288574.
Meyer, J. P., & Allen, N. J. (1991). A three-component conceptualization
of organizational commitment. Human Resource
Management Review, 1, 61–89. doi:10.1016/1053-4822(91)90011-Z.
Meyer, J. P., Stanley, D. J., Herscovitch, L., & Topolnytsky, L.
(2002). Affective, continuance, and normative commitment to
the organization: A meta-analysis of antecedents, correlates, and
consequences. Journal of Vocational Behavior, 61, 20–52. doi:
10.1006/jvbe.2001.1842.
Mikulka, P. J., Scerbo, M. W., & Freeman, F. G. (2002). Effects of a
biocybernetic system on vigilance performance. Human Factors,
44, 654–664. doi:10.1518/0018720024496944.
Millsap, R. E., & Hartog, S. B. (1988). Alpha, beta, and gamma
change in evaluation research: A structural equation approach.
Journal of Applied Psychology, 73, 574–584. doi:10.1037/0021-9010.73.3.574.
Motl, R. W., Conroy, D. E., & Horan, P. M. (2000). The Social
Physique Anxiety Scale: Example of the potential consequence
of negatively worded items in factorial validity studies. Journal
of Applied Measurement, 1, 327–345.
Nunnally, J. C. (1978). Psychometric theory (2nd ed.). New York:
McGraw-Hill.
Peterson, N. A., Speer, P. W., & Hughey, J. (2006). Measuring sense
of community: A methodological interpretation of the factor
structure debate. Journal of Community Psychology, 34,
453–469. doi:10.1002/jcop.20109.
Pilotte, W. J., & Gable, R. K. (1990). The impact of positive and
negative item stems on the validity of a computer anxiety scale.
Educational and Psychological Measurement, 50, 603–610. doi:
10.1177/0013164490503016.
Rosenberg, M. (1965). Society and the adolescent self-image.
Princeton, NJ: Princeton University Press.
Roszkowski, M. J., & Soven, J. (2009). Shifting gears: Consequences
of including two negatively worded items in the middle of a
positively worded questionnaire. Assessment & Evaluation in
Higher Education, 35, 113–130. doi:10.1080/02602930802618344.
Schmitt, N., & Stults, D. M. (1985). Factors defined by negatively
keyed items: The result of careless respondents? Applied
Psychological Measurement, 9, 367–373. doi:10.1177/014662168500900405.
Schriesheim, C. A., & Eisenbach, R. J. (1995). An exploratory and
confirmatory factor-analytic investigation of item wording
effects on the obtained factor structures of survey questionnaire
measures. Journal of Management, 21, 1177–1193. doi:10.1177/014920639502100609.
Schriesheim, C. A., Eisenbach, R. J., & Hill, K. D. (1991). The effect
of negation and polar opposite item reversals on questionnaire
reliability and validity: An experimental investigation. Educational
and Psychological Measurement, 51, 67–78. doi:
10.1177/0013164491511005.
Schriesheim, C. A., & Hill, K. D. (1981). Controlling acquiescence
response bias by item reversals: The effect on questionnaire
validity. Educational and Psychological Measurement, 41,
1101–1114. doi:10.1177/001316448104100420.
Smit, A. S., Eling, P. A. T. M., & Coenen, A. M. L. (2003). Mental
effort causes vigilance decrease due to resource depletion. Acta
Psychologica, 115, 35–42. doi:10.1016/j.actpsy.2003.11.001.
Spector, P. E., Van Katwyk, P. T., Brannick, M. T., & Chen, P. Y.
(1997). When two factors don’t reflect two constructs: How item
characteristics can produce artifactual factors. Journal of Management,
23, 659–677.
Steiger, J. H., & Lind, J. C. (1980, May). Statistically based tests for
the number of factors. Paper presented at the annual spring
meeting of the Psychometric Society, Iowa City, IA.
Temple, J. G., Warm, J. S., Dember, W. N., Jones, K. S., LaGrange,
C. M., & Matthews, G. (2000). The effects of signal salience and
caffeine on performance, workload, and stress in an abbreviated
vigilance task. Human Factors, 42, 183–194. doi:10.1518/001872000779656480.
Vandenberg, R. J., & Lance, C. E. (2000). A review and synthesis of
the measurement invariance literature: Suggestions, practices,
and recommendations for organizational research. Organizational
Research Methods, 3, 4–70. doi:10.1177/109442810031002.
Vandenberg, R. J., & Self, R. M. (1993). Assessing newcomers’
changing commitments to the organization during the first
6 months of work. Journal of Applied Psychology, 78, 557–568.
doi:10.1037/0021-9010.78.4.557.
Warm, J. S., Parasuraman, R., & Matthews, G. (2010). Vigilance
requires hard mental work and is stressful. Human Factors, 50,
433–441. doi:10.1518/001872008X312152.
Woods, C. M. (2006). Careless responding to reverse-worded items:
Implications for confirmatory factor analysis. Journal of Psychopathology
and Behavioral Assessment, 28, 186–191. doi:
10.1007/s10862-005-9004-7.
Copyright of Journal of Business & Psychology is the property of Springer Science & Business Media B.V. and
its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder’s
express written permission. However, users may print, download, or email articles for individual use.