Understanding the Generalized TEFI index
Hudson Golino
Associate Professor of Quantitative Methods at the Department of Psychology, University of Virginia
Do you like Network Psychometrics, Structural Equation Modeling, & Confirmatory Factor Analysis? Have you ever wondered what to do if your bifactor model fits the data better than a correlated traits model, even when it shouldn't? Introducing: Generalized TEFI. Read more below...
Golino et al. (2021) developed a family of fit indices based on information/quantum information theory. The most accurate of them all is the Total Entropy Fit Index (TEFI).
The TEFI index computes the distance between the mean Von Neumann entropy (Von Neumann, 1927) of the dimensions (factors or communities) and the total entropy of the system of variables, and adds a penalization to the number of dimensions estimated.
The TEFI equation integrates three key components: the average entropy across dimensions or factors (A), the number of dimensions (B), and the total entropy of the system (C). The total entropy fit index can be rewritten as TEFI = [P1] + [P2], in which [P1] = A and [P2] = (C − A) × √B. Golino et al. (2021) argue that [P1] is expected to decrease monotonically as the number of factors increases, whereas [P2] is expected to increase as the number of factors increases, representing the reduction in average entropy of a set of data conditional on a given dimensional (factor) structure. Finally, Golino et al. (2021) chose the square root of the number of factors in [P2] to control the expected growth trajectory of [P2] as the number of factors increases: the effect of adding a factor is conditional on the number of factors already estimated in the model, and it diminishes as that number grows. In summary, TEFI computes the distance between the mean Von Neumann entropy of the dimensions (factors or communities) and the total entropy of the system of variables, and adds a penalization to the number of dimensions estimated.
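The decomposition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the reference implementation: it takes [P1] = A and [P2] = (C − A) × √B literally as written in this article, and it assumes the density matrix of a set of variables is their correlation matrix divided by the number of variables (so that its trace is one). The function names `von_neumann_entropy` and `tefi` are hypothetical.

```python
import numpy as np


def von_neumann_entropy(corr):
    """Von Neumann entropy of a correlation matrix rescaled to unit trace."""
    corr = np.asarray(corr, dtype=float)
    # Assumption: density matrix = correlation matrix / number of variables.
    density = corr / corr.shape[0]
    eigvals = np.linalg.eigvalsh(density)
    # Drop (numerically) zero eigenvalues, treating 0 * log(0) as 0.
    eigvals = eigvals[eigvals > 1e-12]
    return float(-np.sum(eigvals * np.log(eigvals)))


def tefi(corr, membership):
    """TEFI = [P1] + [P2], with [P1] = A and [P2] = (C - A) * sqrt(B)."""
    corr = np.asarray(corr, dtype=float)
    membership = np.asarray(membership)
    labels = np.unique(membership)
    # Entropy of each dimension, computed on its own block of the matrix.
    dim_entropies = [
        von_neumann_entropy(corr[np.ix_(membership == k, membership == k)])
        for k in labels
    ]
    A = float(np.mean(dim_entropies))  # average entropy across dimensions
    B = len(labels)                    # number of dimensions
    C = von_neumann_entropy(corr)      # total entropy of the system
    return A + (C - A) * np.sqrt(B)


# Toy input: eight uncorrelated variables split into two dimensions of four.
toy_corr = np.eye(8)
toy_membership = [0, 0, 0, 0, 1, 1, 1, 1]
print(tefi(toy_corr, toy_membership))
```

In practice the correlation matrix would come from data and the membership vector from a factor model or a community detection step; the toy input here only exercises the arithmetic.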
The Generalized Total Entropy Fit Index
As equation 3 shows, TEFI can only be computed in single-level structures (e.g., those reflecting first-order factors or communities). At present, there's no way for TEFI to take into consideration more complex dimensionality structures, such as bifactor structures with multiple correlated general factors. Expanding TEFI to accommodate dimensionality structures in two levels requires adding two new components to equation 3. The first, [P3], represents the distance (or difference) between the sum of the Von Neumann entropies of the second-level dimensions (E), divided by the number of first-level dimensions (first-order factors, group factors, or first-order communities; B), and the total entropy of the system of variables (C). The second, [P4], is similar to the penalization component [P2] of TEFI, but replaces A with E (the sum of the individual Von Neumann entropies of the second-level dimensions). The simplified version of the generalized total entropy fit index is, therefore:
GenTEFI = [A + (C − A) × √B] + [E + (C − E) × √B], or
GenTEFI = [P1] + [P2] + [P3] + [P4],
where
[P3] = (E/B) − C
and
[P4] = (C − E) × √B.
The generalized total entropy fit index can be seen as an additive fit index combining a first-order TEFI (TEFI_{First-Order} = [P1] + [P2]) and a second-order TEFI (TEFI_{Second-Order} = [P3] + [P4]).
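The additive structure can be illustrated with a short Python sketch. It is a rough illustration under explicit assumptions rather than the authors' implementation: it uses the component definitions exactly as listed in this article ([P1] = A, [P2] = (C − A) × √B, [P3] = (E/B) − C, [P4] = (C − E) × √B), it assumes the density matrix is the correlation matrix divided by the number of variables, and the names `von_neumann_entropy`, `block_entropies`, and `gen_tefi` are hypothetical.

```python
import numpy as np


def von_neumann_entropy(corr):
    """Von Neumann entropy of a correlation matrix rescaled to unit trace."""
    corr = np.asarray(corr, dtype=float)
    density = corr / corr.shape[0]  # assumption: density = R / p
    eigvals = np.linalg.eigvalsh(density)
    eigvals = eigvals[eigvals > 1e-12]  # treat 0 * log(0) as 0
    return float(-np.sum(eigvals * np.log(eigvals)))


def block_entropies(corr, membership):
    """Entropy of each dimension, computed on its block of the matrix."""
    membership = np.asarray(membership)
    return [
        von_neumann_entropy(corr[np.ix_(membership == k, membership == k)])
        for k in np.unique(membership)
    ]


def gen_tefi(corr, first_level, second_level):
    """Return the first-order, second-order, and generalized TEFI values."""
    corr = np.asarray(corr, dtype=float)
    first = block_entropies(corr, first_level)
    second = block_entropies(corr, second_level)
    A = float(np.mean(first))       # average first-level entropy
    B = len(first)                  # number of first-level dimensions
    C = von_neumann_entropy(corr)   # total entropy of the system
    E = float(np.sum(second))       # sum of second-level entropies
    first_order = A + (C - A) * np.sqrt(B)              # [P1] + [P2]
    second_order = (E / B - C) + (C - E) * np.sqrt(B)   # [P3] + [P4]
    return {
        "first_order": first_order,
        "second_order": second_order,
        "gen_tefi": first_order + second_order,
    }


# Toy input: two first-level dimensions and one general second-level dimension.
toy_corr = np.eye(8)
result = gen_tefi(
    toy_corr,
    first_level=[0, 0, 0, 0, 1, 1, 1, 1],
    second_level=[0] * 8,
)
print(result)
```

Returning the two parts separately, rather than only their sum, mirrors how the first-order and second-order values are compared against each other in the rest of this article.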
The generalized TEFI can be, finally, formulated as:
Interesting Properties of The Generalized Total Entropy Fit Index
To demonstrate the applicability of the GenTEFI in comparing structures with varied organizations, such as correlated traits versus bifactor models, data were generated under two distinct conditions. In the first scenario, we generated data using a correlated traits structure with four factors and four variables per factor, factor loadings between 0.45 and 0.75, and correlations between factors ranging from 0.00 (orthogonal) to 0.70. In the second scenario, the data were generated using a bifactor model comprising four group factors and one general factor, with four variables per group factor. Here, the loadings on the group factors varied from 0.45 to 0.55, while those on the general factor ranged from 0.45 to 0.70. For each model, we generated 100 datasets, each consisting of 5,000 observations, to robustly assess the characteristics of the GenTEFI (more specifically, the lower-order or first-order TEFI and the higher-order or second-order TEFI values) across these different structural configurations.
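For readers who want to experiment with similar conditions, the two data-generating scenarios can be approximated with the Python sketch below. It is not the simulation code used for the results reported here. Several details are assumptions made for illustration: loadings are drawn uniformly within the stated ranges, all pairs of factors share the same correlation, the general and group factors of the bifactor model are orthogonal, and observations are multivariate normal. All function names are hypothetical.

```python
import numpy as np


def correlated_traits_corr(rng, n_factors=4, items_per_factor=4,
                           loading_range=(0.45, 0.75), factor_corr=0.3):
    """Population correlation matrix implied by a correlated traits model."""
    n_items = n_factors * items_per_factor
    loadings = np.zeros((n_items, n_factors))
    for f in range(n_factors):
        rows = slice(f * items_per_factor, (f + 1) * items_per_factor)
        # Assumption: loadings drawn uniformly within the stated range.
        loadings[rows, f] = rng.uniform(*loading_range, size=items_per_factor)
    # Assumption: every pair of factors has the same correlation.
    phi = np.full((n_factors, n_factors), factor_corr)
    np.fill_diagonal(phi, 1.0)
    sigma = loadings @ phi @ loadings.T
    np.fill_diagonal(sigma, 1.0)  # uniqueness = 1 - communality
    return sigma


def bifactor_corr(rng, n_groups=4, items_per_group=4,
                  group_range=(0.45, 0.55), general_range=(0.45, 0.70)):
    """Population correlation matrix implied by an orthogonal bifactor model."""
    n_items = n_groups * items_per_group
    loadings = np.zeros((n_items, n_groups + 1))
    loadings[:, 0] = rng.uniform(*general_range, size=n_items)  # general factor
    for g in range(n_groups):
        rows = slice(g * items_per_group, (g + 1) * items_per_group)
        loadings[rows, g + 1] = rng.uniform(*group_range, size=items_per_group)
    sigma = loadings @ loadings.T  # factors assumed orthogonal
    np.fill_diagonal(sigma, 1.0)
    return sigma


def simulate(rng, sigma, n_obs=5000):
    """Draw multivariate normal observations with the given correlations."""
    return rng.multivariate_normal(np.zeros(sigma.shape[0]), sigma, size=n_obs)


rng = np.random.default_rng(2024)
traits_data = simulate(rng, correlated_traits_corr(rng, factor_corr=0.70))
bifactor_data = simulate(rng, bifactor_corr(rng))
print(traits_data.shape, bifactor_data.shape)
```

Repeating the last three lines in a loop with different seeds, and varying `factor_corr` from 0.00 to 0.70, would give replicate datasets comparable in size to those described above.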
The figure above is a hexagonal binning plot showing the relationship between factor loadings and the first and second-order TEFI under varying conditions of a correlated traits model. The hexagonal cells give a clear depiction of data point density, with color gradients indicating the concentration of points: brighter yellow hues signify greater densities.
The plot's facets correspond to different levels of interfactor correlation, ranging from orthogonal (zero correlation) to highly correlated factors, enabling a comparative analysis across the spectrum of loading magnitudes. The hexagons' borders are color-coded to denote the level of TEFI: red for second-order TEFI and gray for first-order TEFI (TEFI_{First-Order} = [P1]+[P2], TEFI_{Second-Order} = [P3]+[P4], see equation 4).
The figure reveals that, for orthogonal or weakly correlated trait structures, the first-order TEFI values are consistently lower than the second-order values, meaning that the uncertainty of the correlated traits structure is lower than that of the bifactor structure.
The first-order TEFI was computed in a structure that mirrors the true four-factor model whereas the second-order TEFI is based on an assumed but incorrect higher-order factor.
Interestingly, as interfactor correlations strengthen, the gap between first and second-order TEFI narrows. At higher interfactor correlations, the second-order TEFI becomes less than the first-order, suggesting an emergent second-order structure.
These observations underscore two key insights regarding the GenTEFI: its ability to distinguish between correlated traits and bifactor or hierarchical structures, and its responsiveness to changes in the correlation between factors, which reflects different organizational complexities within the data structure. As the correlation increases, more information is shared between the factors, which makes them more mixed, that is, more disorganized and more uncertain. As in physical systems, systems of variables with mixed states show higher entropy, disorganization, or uncertainty than systems well-compartmentalized into distinct sets. Consequently, with increased correlations between factors, the overall system of variables has a higher entropy or uncertainty than the average entropy of its individual factors. This suggests that the global interactions of the variables contribute more to the system's entropy than the internal structure of its factors, so the uncertainty of a hypothetical second-order factor decreases. In other words, as the correlations between factors increase, a second-order structure emerges.
In the figure above, a hexagonal binning plot shows the relationship between factor loadings and the Total Entropy Fit Index (TEFI) under varying conditions of a bifactor model. Loadings on the general factor are represented on the x-axis, and the grid facets represent different magnitudes of the first-order factor loadings (or loadings on the group factors). The plot indicates that the second-order TEFI values are lower than the first-order ones, and that this gap grows as the general factor loadings increase. This distinction between the first and second-order TEFI values highlights the usefulness of the GenTEFI. In short, both figures illustrate that GenTEFI is an effective new tool for analyzing the dimensionality of psychological data: it helps differentiate between a correlated traits model and a bifactor model.
When the underlying structure of the data is a correlated traits model, first-order TEFI values are smaller (more negative) than second-order ones.
Interested in learning more? Check out our preprint:
Golino, H., Jiménez, M., Garrido, L. E., & Christensen, A. P. (2024, March 19). Generalized Total Entropy Fit Index: A new fit index to compare bifactor and correlated factor structures in SEM and network psychometrics. https://doi.org/10.31234/osf.io/5g3hb