International Journal of Psychology and Behavioral Sciences
p-ISSN: 2163-1948 e-ISSN: 2163-1956
2021; 11(4): 53-64
doi:10.5923/j.ijpbs.20211104.01
Received: Nov. 5, 2021; Accepted: Nov. 20, 2021; Published: Dec. 15, 2021

Mahnaz Shojaee1, Mehrdad Shahidi2, 3, Ying Cui1, Amin Mousavi4
1Centre for Research in Applied Measurement and Evaluation, University of Alberta, Canada
2Department of Education, Mount Saint Vincent University, Canada
3Department of Psychology, Tehran Central Branch, Islamic Azad University, Tehran, Iran
4Department of Educational Psychology and Special Education, College of Education, University of Saskatchewan, Canada
Correspondence to: Mahnaz Shojaee, Centre for Research in Applied Measurement and Evaluation, University of Alberta, Canada.
Copyright © 2021 The Author(s). Published by Scientific & Academic Publishing.
This work is licensed under the Creative Commons Attribution International License (CC BY).
http://creativecommons.org/licenses/by/4.0/

Background: Previous studies in Canada and in other Western and Asian countries that used the R-UCLA Loneliness Scale indicated that loneliness could predict mental health disorders such as clinical depression and anxiety, as well as physical problems. The self-report R-UCLA Loneliness Scale has therefore been used increasingly to screen and monitor the sense of loneliness in people with and without mental health disorders. A brief review of previous research on the R-UCLA Loneliness Scale revealed a lack of evaluation of the scale's fairness/equivalence in measuring loneliness among adolescents and young adults.
Method: Four hundred university students aged 18 to 26 years participated in this study. Their responses were analyzed using item response theory (IRT) and ordinal logistic regression (OLR). To determine the type of Differential Item Functioning (DIF), uniform or non-uniform, we examined "item true score function plots".
Results: The findings revealed that three items displayed gender-related DIF. All three items showed non-uniform DIF. The assessment of effect sizes indicated that the differences were negligible for practical use.
Limitations: The results on measurement invariance across gender should be interpreted with caution. The sample analyzed in this study consisted of individuals without comorbid symptoms or other mental disorders. Thus, we recommend that future researchers focus on clinical populations.
Conclusion: This study was the first and a major step towards presenting theoretical and practical information regarding the assessment of invariance of the R-UCLA Loneliness Scale across gender. Future studies may look at different aspects and sources of invariance, such as marital status and educational level, for different groups to strengthen conclusions concerning the R-UCLA Loneliness Scale.
Keywords: Loneliness, Differential Item Functioning (DIF), Item Response Theory (IRT), Measurement Invariance, R-UCLA Loneliness Scale, Gender
Cite this paper: Mahnaz Shojaee, Mehrdad Shahidi, Ying Cui, Amin Mousavi, Measurement Invariance of R-UCLA Loneliness Scale Across Gender Using Differential Item Functioning (DIF), International Journal of Psychology and Behavioral Sciences, Vol. 11 No. 4, 2021, pp. 53-64. doi: 10.5923/j.ijpbs.20211104.01.
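Assuming the standard three-model formulation implemented in the lordif package (Choi, et al. 2011), the nested ordinal logistic regression models compared for each item can be sketched as follows. The intercepts α_k and the coefficients β1 to β3 are notation introduced here for illustration, and the sign convention depends on the software's parameterization:

```latex
\begin{align}
\text{Model 1:}\quad \operatorname{logit}\, p(Y_i \le k) &= \alpha_k + \beta_1 \theta \\
\text{Model 2:}\quad \operatorname{logit}\, p(Y_i \le k) &= \alpha_k + \beta_1 \theta + \beta_2 g \\
\text{Model 3:}\quad \operatorname{logit}\, p(Y_i \le k) &= \alpha_k + \beta_1 \theta + \beta_2 g + \beta_3 (g \cdot \theta)
\end{align}
```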
Where p (Yi ≤ k) is the probability that the ith person responds at or below category k on an item; θ (the person's latent trait level, or ability parameter; Hambleton, Swaminathan, & Rogers, 1991 cited in Kim & Yoon 2011) represents the overall level of loneliness and is measured by the total test score; g is a grouping variable; and g*θ represents the interaction between the grouping variable and the overall level (Mousavi, et al. 2019). The difference in -2 × log-likelihood between Models 1 and 3 can be used to detect both forms of DIF, uniform and non-uniform. This difference is compared to a chi-squared distribution with two degrees of freedom, and a significant result flags the item for DIF. To determine the type of DIF, we examined "item true score function plots", which illustrate the item characteristic curves (ICCs; also called item response functions for polytomous items (ICF), which represent expected scores conditional on the trait level), to see whether the DIF is uniform or non-uniform.
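The likelihood-ratio comparison described above can be sketched in a few lines. This is a minimal illustration, not the lordif implementation: the function names, the example -2 × log-likelihood values, and the 0.01 significance level are assumptions made for the example.

```python
import math


def chi2_upper_tail(stat, df):
    """Upper-tail chi-squared probability, using closed forms for df = 1 or 2."""
    if stat < 0:
        raise ValueError("a likelihood-ratio statistic cannot be negative")
    if df == 1:
        # P(Z^2 > stat) for a standard normal Z
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        # chi-squared with 2 df is an exponential distribution with mean 2
        return math.exp(-stat / 2.0)
    raise ValueError("this sketch only supports df = 1 or df = 2")


def dif_lr_tests(neg2ll_m1, neg2ll_m2, neg2ll_m3, alpha=0.01):
    """Compare three nested models via differences in -2 x log-likelihood.

    alpha is an assumed significance level, not a value taken from the study.
    """
    # Clamp at zero to guard against tiny negative values from rounding.
    stat12 = max(0.0, neg2ll_m1 - neg2ll_m2)  # uniform DIF, 1 df
    stat23 = max(0.0, neg2ll_m2 - neg2ll_m3)  # non-uniform DIF, 1 df
    stat13 = max(0.0, neg2ll_m1 - neg2ll_m3)  # total DIF, 2 df
    p13 = chi2_upper_tail(stat13, 2)
    return {
        "p12": chi2_upper_tail(stat12, 1),
        "p23": chi2_upper_tail(stat23, 1),
        "p13": p13,
        "flagged": p13 < alpha,  # the 2-df test flags the item
    }


if __name__ == "__main__":
    # Hypothetical -2 x log-likelihood values for one item.
    print(dif_lr_tests(1000.0, 999.7, 989.5))
```

With a pattern like the hypothetical one above (small change from Model 1 to Model 2, large change from Model 2 to Model 3), the item would be flagged and the DIF would be read as primarily non-uniform.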
McFadden's pseudo R² was used as the effect size measure for flagged items (see the lordif package; Choi, et al. 2011). After three iterations, the "lordif" package stopped flagging items, and three items were found to display gender-related DIF: item 3 (How often do you feel that there is no one you can turn to?), item 15 (How often do you feel you can find companionship when you want it?), and item 16 (How often do you feel that there are people who really understand you?).

The graphs in Figure 1 show the difference between scores that ignore DIF and scores that account for DIF. In the box plot on the left, the interquartile range, representing the middle 50% of the differences (bounded by the bottom and top of the shaded box), ranges roughly from -0.03 to +0.02, with a median of approximately -0.01. In the right graph, the same difference scores are plotted against the initial scores ignoring DIF ("initial theta"; Choi, et al. 2011) for both genders. Reference lines are plotted at 0.0 (solid line), indicating no difference, and at the mean of the differences (dotted line); in this plot the two lines overlap. The positive values to the left of this graph reveal that, in some cases, accounting for DIF led to slightly lower scores (i.e., naive score ignoring DIF minus score accounting for DIF > 0, so the score accounting for DIF is less than the naive score) for those with lower levels of loneliness; notably, this appears to happen more for female individuals. The negative values to the right of this graph indicate that, for those with higher levels of loneliness, accounting for DIF led to slightly higher scores, especially for female individuals.

Figure 1. Individual-Level DIF Impact
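The individual-level impact summarized in Figure 1 amounts to a per-person difference between two score estimates. The sketch below shows one way to summarize such differences; the function name and the input lists are hypothetical, and the quartiles use Python's default "exclusive" method, which may differ slightly from the box plot conventions of other software.

```python
from statistics import mean, median, quantiles


def dif_impact_summary(initial_theta, purified_theta):
    """Summarize (score ignoring DIF) minus (score accounting for DIF)."""
    if len(initial_theta) != len(purified_theta):
        raise ValueError("both score lists must have one entry per person")
    if len(initial_theta) < 2:
        raise ValueError("at least two persons are needed for quartiles")
    # Positive difference: the DIF-adjusted score is lower than the naive score.
    diffs = [naive - adjusted for naive, adjusted in zip(initial_theta, purified_theta)]
    q1, _, q3 = quantiles(diffs, n=4)
    return {"q1": q1, "median": median(diffs), "q3": q3, "mean": mean(diffs)}


if __name__ == "__main__":
    # Hypothetical scores for five persons, for illustration only.
    naive = [-1.20, -0.40, 0.10, 0.90, 1.80]
    adjusted = [-1.22, -0.41, 0.10, 0.91, 1.83]
    print(dif_impact_summary(naive, adjusted))
```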
Figure 2. Trait Distributions – Females vs Males

Figure 3. Visualizing the Non-Uniform DIF for the Item # 3 in Terms of Gender
For item # 3, the 1-df LR χ² test for uniform DIF (comparing Models 1 and 2) was not significant (p = 0.567), while the 1-df test comparing Models 2 and 3 (p = 0.001) and the 2-df test comparing Models 1 and 3 (p = 0.006) were significant. The top right plot in Figure 3 shows the expected impact of DIF on scores as the absolute difference between the item true-score functions (Kim, et al., 2007 cited in Choi, et al., 2011). The item true-score functions differ in two regions, around θ = 2.00 and θ = 2.90, but the density-weighted impact, displayed in the bottom right plot, is negligible because few cases in this sample have those trait levels. When weighted by the focal group trait distribution, the expected impact became negligible, which is also apparent in the small McFadden's pseudo R² values, used as magnitude measures and printed on the top left plot. The bottom left plot in Figure 3 juxtaposes the item response functions for male and female individuals. The difference between the estimated slope parameters (1.08 vs. 0.5), printed in the graph, also reveals the non-uniform component of DIF.

Figure 4 illustrates the plots for item # 15, "How often do you feel you can find companionship when you want it?", which shows statistically significant non-uniform DIF. Here the difference between males and females appears at both higher and lower levels of loneliness. As the LR χ² test for uniform DIF was non-significant, this result suggests the DIF was primarily non-uniform. The item response functions suggest that the non-uniform DIF was due to the first and last category threshold values for the focal group being smaller than those for the reference group (-1.84 vs. -0.76 & 2.95 vs. 1.89).

Figure 4. Visualizing the Non-Uniform DIF for the Item # 15 in Terms of Gender
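The item true-score functions and the density-weighted impact discussed for items 3 and 15 can be sketched under a graded response model. This is an illustrative approximation, not the lordif plotting code: it assumes items scored 0 to K, approximates the density weighting by averaging over a sample of focal-group trait levels, and all function names, thresholds and trait values in the usage example are hypothetical. Only the two slopes (1.08 and 0.5) come from the text, and which group each belongs to is an assumption here.

```python
import math


def expected_item_score(theta, slope, thresholds):
    """Graded-response expected score on a 0..K scale: sum over k of P(Y >= k)."""
    return sum(1.0 / (1.0 + math.exp(-slope * (theta - b))) for b in thresholds)


def density_weighted_impact(focal_thetas, ref_params, focal_params):
    """Mean absolute gap between the two groups' true-score functions,
    averaged over observed focal-group trait levels (a sample-based weighting)."""
    if not focal_thetas:
        raise ValueError("at least one focal-group trait level is required")
    total = 0.0
    for theta in focal_thetas:
        ref_score = expected_item_score(theta, *ref_params)
        focal_score = expected_item_score(theta, *focal_params)
        total += abs(ref_score - focal_score)
    return total / len(focal_thetas)


if __name__ == "__main__":
    thresholds = [-1.0, 0.5, 2.0]          # hypothetical thresholds
    reference = (1.08, thresholds)         # slope from the text, group assumed
    focal = (0.5, thresholds)              # slope from the text, group assumed
    focal_sample = [-1.5, -0.5, 0.0, 0.4, 1.1]  # hypothetical trait levels
    print(density_weighted_impact(focal_sample, reference, focal))
```

Because the weighting follows where focal-group members actually fall on the trait scale, a large gap between the curves at a trait level that few people reach contributes little to the average.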
For item # 16, the LR χ² test for uniform DIF was non-significant. Here the differences between male and female individuals span almost the entire spectrum of loneliness measured by the test. McFadden's pseudo R² change for non-uniform DIF was 0.011, which is considered a negligible effect size (Penfield, Gattamorta, & Childs, 2009; Cohen, 1996). The item response functions (printed on the bottom left graph) show that, because of the non-uniform DIF, the last two category threshold parameters for the focal group are smaller than those for the reference group, unlike the first threshold.

Figure 5. Visualizing the Non-Uniform DIF for the Item # 16 in Terms of Gender
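The pseudo R² change reported above can be illustrated with a short calculation. This sketch assumes the usual definition of McFadden's pseudo R² (one minus the ratio of the model log-likelihood to the null log-likelihood) and takes the change as the difference between two nested models; the function names and log-likelihood values are hypothetical and chosen only to produce a change of about 0.011.

```python
def mcfadden_pseudo_r2(ll_model, ll_null):
    """McFadden's pseudo R^2: 1 - (model log-likelihood / null log-likelihood)."""
    if ll_null == 0:
        raise ValueError("the null log-likelihood must be non-zero")
    return 1.0 - ll_model / ll_null


def pseudo_r2_change(ll_reduced, ll_full, ll_null):
    """Gain in pseudo R^2 when moving from the reduced to the full model."""
    return mcfadden_pseudo_r2(ll_full, ll_null) - mcfadden_pseudo_r2(ll_reduced, ll_null)


if __name__ == "__main__":
    # Hypothetical log-likelihoods: null model, Model 2, and Model 3.
    ll_null, ll_m2, ll_m3 = -500.0, -450.0, -444.5
    print(round(pseudo_r2_change(ll_m2, ll_m3, ll_null), 3))
```

A statistically significant likelihood-ratio test can therefore coexist with a very small pseudo R² change, which is the pattern the results describe.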
Figure 6. The Impact of DIF Items on Test Characteristic Curves