The Precision of the Overall Data-Model Fit for Different Design Features in Confirmatory Factor Analysis

Factor Analysis (FA) is the study of variance within a group. Within-Subject Variance (WSV) is affected by multiple features of a study context, such as the Experimental Design (ED) or the Sampling Design (SD). The aim of this study is to provide an empirical evaluation of the influence of different aspects of ED and SD on WSV in the context of FA in terms of model precision. The results showed that the precision of the overall model fit indices TLI and CFI varied as a function of the Variable-To-Factor (VTF) ratio, the Subject-To-Variable (STV) ratio, the communality magnitude (h²), and their interaction, whereas the precision of the overall model fit indices GFI, AGFI, and RMSEA varied as a function of VTF, STV, and their interaction. Overall, when the VTF is 4:1 or 7:1, the required STV is 16:1 or above or 32:1 or above, respectively, to show precision in the factor solution.
Keywords-model-precision; factor-analysis; model-fit; model-design

I. INTRODUCTION
Factor Analysis (FA) is a useful and flexible family of analytic methods that plays a critically important role in many empirical applications. FA is the study of variance within a group, as opposed to statistical analyses that focus on partitioning variance among groups [1]. FA is extensively applied in education and behavioral science research. A recent two-year analysis (from 2003 to 2005) of the use of FA methods as indexed by the PsycINFO database revealed that more than 1,700 studies used some form of FA [2]. However, FA is a generic term for a family of statistical techniques, including Exploratory FA (EFA) and Confirmatory FA (CFA). The fundamental purpose of EFA is to identify unknown latent constructs in a relatively large set of measured indicator variables that can summarize (or reproduce) the observed covariance or correlation pattern among those variables. EFA is often used when the researcher does not have specific expectations about the number of constructs or factors underlying the observed correlational pattern, or has only emergent ideas about the dimensional structure underlying that pattern among a larger set of indicator variables [1][2][3]. When conducting an EFA, the researcher faces a multitude of methodological and technical decisions. For example, there are two different EFA statistical models to choose from: (a) the full component model, or (b) the common factor model [4][5][6]. The implications of choosing either one need to be well understood. CFA is a form of factor analysis that tests hypotheses regarding how well the measured indicator variables represent the number of constructs [7,8]. CFA is a confirmatory method that can be used to examine, evaluate, and/or test the number of hypothesized factors underlying the variance/covariance in a set of measured indicator variables.
CFA allows testing hypothetical and plausible alternative latent variable structures for the observed indicator variance/covariance [9,10]. More recently, CFA has also been used in exploratory analysis. Both EFA and CFA attempt to understand the variance of the observed indicator variables by studying the Within-Subject Variance (WSV). It is well understood that WSV is affected by many features of the conduct of a study, such as the Experimental Design (ED) and the Sampling Design (SD). Thus, anything that influences or changes variance may affect the conclusions related to FA. Previous studies have isolated one or two elements within ED and SD [8,11,12]. However, to the best of our knowledge, no study has provided a comprehensive examination of multiple WSV factors, yet this is precisely what a researcher must do when planning a WSV study.
To understand the impact of ED and SD or other influences on WSV, a systematic structure for evaluating WSV changes is necessary. One possibility for evaluating WSV systematically is to use Factorial Invariance (FIV) [13][14][15][16][17]. The FIV methods offer a structure that allows disentangling measurement elements from structural elements in the factor model. Via FIV and the evaluation of data-model fit, the impact of ED and SD on WSV can be compared among groups by examination of model precision [1,12,13,18,19]. Previous research investigated the precision of factor solutions by examining the chi-square value (χ²) and Overall Model Fit (OMF) indices such as the Goodness-of-Fit Index (GFI), Adjusted GFI (AGFI), Tucker-Lewis Index (TLI), Comparative Fit Index (CFI), and Root Mean Square Error of Approximation (RMSEA) [3,9,20,21,22].
There are three key features of design that are of paramount importance and generally overshadow all the technical decisions a researcher faces. These three features are: (a) the selection and the number of indicator variables, (b) the nature and size of the sample, and (c) the communality magnitude. Understanding the impact of the Variable-To-Factor (VTF) ratio, the sample size or Subject-To-Variable (STV) ratio, and the magnitude of the communalities (h²) in FA is relevant because these features affect the model precision and the operationalized (measured) latent variable (factor) variance, which determines the model invariance of FA findings. The benefit of FA is based on its ability to produce well-built, reliable, and understandable estimates of factor loadings [23]. Therefore, understanding how VTF, STV, and h² interact in FA and how they possibly influence or change the model precision and the operationalized (measured) latent variable (factor) variance is the basic problem investigated in this study.
Model precision in this research is operationalized along psychometric rather than statistical lines. Statistically, precision is inversely related to the standard error of the sampling distribution, so greater precision means a smaller standard error of a statistic. Psychometrically, precision can mean this, but in a reliability context it can also refer to how close the estimator is to the theoretical latent variable (e.g. the true score) [9,13]. Thus, as the standard error of measurement decreases, the observed scores converge to the true score. However, no comprehensive study has been found in the existing literature that systematically examined the incremental or combined impacts of features of ED and SD and the optimum way to estimate the model. Therefore, evaluating the impact of ED and SD effects on WSV in FA findings is the basis of the proposed Monte Carlo simulation study [24,25].

II. ED: VARIABLE-TO-FACTOR RATIO
Previous research indicates that CFA yields more precise results when each common factor is represented by multiple indicator variables in the analysis [5,21,26]. While this concern is related to model identification [1,14,18,27,28], identification is not the focus here. Specifically, given an identified model, the way the sample and the number of indicator variables inform the understanding of the latent factor is an important question that has not received great attention in the CFA literature [2,12,17,28,29,30]. Authors in [18] concluded that the VTF ratio was important for factor stability, with more indicator variables per factor yielding more stable results. However, the researchers who investigated the VTF ratio have not reached a consensus on an optimal VTF. Authors in [18,31] concluded that a VTF of 3:1 is sufficient for factorial precision, whereas authors in [11] found that 24.6% of the studies published in Journal of Personality and Social Psychology (JPSP) and 34.4% of the studies published in Journal of Applied Psychology (JAP) have a VTF ratio of 4:1 or less. However, other researchers who have investigated the effects of indicator variable sampling did not attempt to systematically manipulate conditions that could potentially affect the pattern stability of CFA findings. Variable sampling has been used extensively in conceptual development, but it has received almost no empirical evaluation in the existing literature, which has generally assumed that indicator variables are sampled at random from the universe of variables. The assumption of random sampling is useful for minimizing sampling issues and for developing generalizability, rather than as a prescription for applied research procedures.
III. UNDERLYING FACTOR STRUCTURE
One of the broad domains in social science is the Big Five Personality Traits, which is used to describe human personality and includes extraversion, agreeableness, openness, conscientiousness, and neuroticism. Most commonly used in academic psychology, this model incorporates five different factors into a conceptual model for describing personality traits [3,8,32]. It has been selected as the structural model for this study because of its wide use in social sciences. The model structure of the Big Five Personality Traits theory has received favorable attention from researchers in the psychological discipline. The author in [3] concluded that the measurement structure of the Big Five Personality Traits is an orthogonal solution and the variation on each one of the Big Five Personality Trait dimensions is commonly proposed to be independent of variation on each of the others.

IV. SD: SUBJECT-TO-VARIABLE RATIO
Authors in [4, 12, 13, 15, 18-19, 22, 26, 33-36] found a mixed range in FA sample sizes described by either the absolute size of the sample or the STV ratio. Authors in [37] reviewed 60 studies utilizing FA and found the average minimum sample size was 42, and the minimum STV ratio was 3.25:1, with most of the studies using an STV ratio less than 5:1. Similarly, authors in [11] reviewed FA studies published in JPSP and JAP (from 1991 to 1995) and found that 18.9% of the published articles utilizing FA in JPSP and 13.8% in JAP had an average minimum sample size of 100 or less. Authors in [2] examined publications utilizing FA from the PsycINFO database between 2003 and 2005. They found that 15.4% of the studies reported an STV of 10:1 to 20:1 and only 3% of the studies used an STV of 20:1 to 100:1. A high absolute sample size or STV ratio is important to predict precise outcomes, increase the generalizability of the findings, and maximize the accuracy of population estimates [2,4,7,22,29]. There are many common practice rules of sample size in the literature, most of them not empirically based [18]. Moreover, a limited number of studies have empirically investigated the effect of STV on model precision [38]. Selecting an adequate sample size is an important decision in study design. The researcher must determine how large the sample should be and what the most appropriate sampling frame is. There are numerous guidelines for estimating an adequate sample size for FA [2,4,11,35,39].
construct [7,20,40]. The communality of an observed variable is the sum of its squared factor loadings, i.e. the part of that variable's variance accounted for by all the factors [17,19]. Communality measures the proportion of variance in an observed variable explained by all the factors. The larger the communality for each variable, the more successful the factor analysis solution is, and the smaller the communality, the more questionable the solution [4,31,39,41]. Communality ranges between 0 and 1.
If the communality exceeds 1.0, there is something wrong with the data, which may reflect model specification or SD problems. Low values of communalities across the set of observed variables indicate the variables are marginally related to each other and the factors provide little explanation of the variance in the observed variables [2,20,26,39].
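The verbal definition of communality can be written compactly. The following is a sketch under stated assumptions: standardized observed variables, orthogonal factors, and the symbols \lambda_{ij} (loading of variable i on factor j), m (number of factors), and u_i^2 (uniqueness), none of which are notation taken from the original text.

```latex
% Communality of observed variable i under an orthogonal m-factor model
% (assumed notation: \lambda_{ij} = loading of variable i on factor j)
\[
  h_i^2 = \sum_{j=1}^{m} \lambda_{ij}^2 ,
  \qquad
  u_i^2 = 1 - h_i^2 ,
  \qquad
  0 \le h_i^2 \le 1 .
\]
% Worked example: a variable loading 0.8 on one factor and 0 on the others
\[
  h_i^2 = 0.8^2 = 0.64 , \qquad u_i^2 = 1 - 0.64 = 0.36 .
\]
```

Under these assumptions, an estimated communality above 1 implies a negative uniqueness, which is why such a value signals a problem with the solution.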

VI. METHODS
The current study was designed to evaluate empirically the influence of different aspects of ED and SD on WSV in terms of model precision and operationalized (measured) latent variable (factor) variance relative to a known factor structure via Monte Carlo simulations. The study manipulated: (a) the VTF ratio (4:1, 7:1, and 10:1), with indicators randomly sampled from a population of 100 indicator variables, (b) the STV ratio from 2:1 to 32:1 in powers of 2 (2:1, 4:1, 8:1, 16:1, and 32:1), and (c) the communality magnitude (high, moderate, low, and mixed). These factors were varied in a known factor structure with: (a) continuous variables (measurement scale), (b) normal distribution, (c) 5-factor solutions (common factor), and (d) orthogonal solution (factor structure). The precision of the factor solution was evaluated by CFA of an orthogonal 5-factor (common factor) model. The chi-squared value and overall model fit index criteria were used to evaluate the models for all conditions. The chi-squared value and four model fit indices were treated as Dependent Variables (DVs) in three-way Analysis of Variance (ANOVA) using the level of h², VTF ratio, and STV ratio as independent variables. Figure 1 illustrates the study design.

VII. RESULTS
The Monte Carlo simulation populated the 60 cells of a three-way between-subjects factorial design: h² (high, moderate, low, and mixed) by VTF ratio (4:1, 7:1, and 10:1) by STV ratio (2:1, 4:1, 8:1, 16:1, and 32:1), with 1000 replicate samples per cell (see Figure 1). Chi-squared values and overall model fit indices (GFI, RMSEA, TLI, and CFI) were treated as DVs in parallel three-way ANOVAs. To control for multiplicity among DVs, the type I error rate was adjusted by a Bonferroni correction: 0.05/7≈0.007.
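The crossed design can be enumerated in a few lines. The sketch below is illustrative only: it assumes that the VTF ratio applies to each of the five factors and that the sample size of a cell equals the STV ratio times the number of indicator variables, neither of which the text states explicitly. The names `design_cells`, `n_variables`, and `n_subjects` are hypothetical.

```python
from itertools import product

# Levels taken from the study design described in the text.
COMMUNALITY = ["high", "moderate", "low", "mixed"]
VTF = [4, 7, 10]          # indicator variables per factor
STV = [2, 4, 8, 16, 32]   # subjects per indicator variable
N_FACTORS = 5             # orthogonal 5-factor model
N_TESTS = 7               # divisor used in the text's Bonferroni adjustment


def design_cells():
    """Enumerate every communality x VTF x STV cell with its implied sizes."""
    cells = []
    for h2, vtf, stv in product(COMMUNALITY, VTF, STV):
        n_variables = vtf * N_FACTORS   # assumption: same VTF for every factor
        n_subjects = stv * n_variables  # assumption: N = STV x number of variables
        cells.append({"h2": h2, "vtf": vtf, "stv": stv,
                      "n_variables": n_variables, "n_subjects": n_subjects})
    return cells


cells = design_cells()
alpha_adjusted = 0.05 / N_TESTS
smallest = min(cell["n_subjects"] for cell in cells)
largest = max(cell["n_subjects"] for cell in cells)

print(len(cells))                # 60
print(round(alpha_adjusted, 3))  # 0.007
print(smallest, largest)         # 40 1600
```

Under these assumptions the implied sample sizes span 40 to 1600 subjects, which gives a sense of how wide the range of conditions is.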
The descriptive statistics for the chi-squared values among all design cells are presented in Table I. The overall ANOVA findings for χ² revealed a statistically significant model, F(59,59940)=137546, p<0.0001. As can be seen in Table II, there were statistically significant results for the main effects of VTF and STV, and a statistically significant interaction between VTF*STV. Surprisingly, no statistically significant main effect or interaction involving communality was found. Post hoc analysis of the VTF*STV interaction focused on splitting out the levels of STV and examined five one-way ANOVAs for the levels of VTF. Visual examination of the mean χ² values suggests a small but significant decrease in mean χ² values as STV increases among all levels of VTF that decreased in magnitude as VTF increased (see Figure 2). Table III presents the simple ANOVA effect on VTF for each STV level. Pairwise comparisons among the levels of VTF at each level of STV, e.g. VTF 4:1 vs. 7:1 at STV=2:1, were statistically significant with all p-values <0.0001, and the directional pattern in means confirmed the decreasing trends seen in Figure 2.
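The degrees of freedom reported with the F statistics follow directly from the design. As a check, the sketch below computes them for a fully crossed between-subjects design with equal cell sizes; the function name `anova_df` is hypothetical and the code is not the authors' analysis script.

```python
def anova_df(levels, replicates):
    """Degrees of freedom for a fully crossed between-subjects design.

    levels: number of levels of each factor, e.g. [4, 3, 5]
    replicates: number of observations per cell
    Returns (model_df, error_df, highest_order_interaction_df).
    """
    n_cells = 1
    interaction_df = 1
    for k in levels:
        n_cells *= k
        interaction_df *= k - 1
    model_df = n_cells - 1                  # all cell means minus the grand mean
    error_df = n_cells * (replicates - 1)   # within-cell variation
    return model_df, error_df, interaction_df


# Full h2 x VTF x STV design with 1000 replicates per cell.
print(anova_df([4, 3, 5], 1000))  # (59, 59940, 24)

# h2 x VTF interaction tested within a single STV level.
print(anova_df([4, 3], 1000))     # (11, 11988, 6)
```

The first result matches the F(59,59940) of the overall model, and the second matches the F(6,11988) reported for the simple two-way interactions.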

B. Goodness-of-Fit Index
The GFI statistics averaged over 1000 replications for the 60 cells in the three-way design are presented in Table IV. The conventional threshold for acceptable fit is 0.90.
The overall ANOVA findings for GFI revealed an overall statistically significant model, F(59,59940)=138368, p<0.0001. As can be seen in Table V, there were statistically significant main effects of VTF and STV, and a statistically significant interaction between VTF*STV. Once again, the communality factor was not significant.
Post hoc analysis of the VTF*STV interaction focused on splitting out the levels of STV and examined five one-way ANOVAs for the levels of VTF. Visual examination of the mean GFI values suggests a significant increase in mean GFI values as STV increases, together with decreasing mean GFI as VTF ratios increased (see Figure 3). Pairwise comparisons among the levels of VTF at each level of STV, e.g. VTF 4:1 vs. 7:1 at STV=2:1, were statistically significant with all p-values <0.0001, and the directional pattern in means confirmed the patterns seen in Figure 3.
For RMSEA, as can be seen in Table VIII, there were statistically significant main effects of VTF and STV, and a statistically significant interaction between VTF*STV. Once again, the communality factor was not significant. Post hoc analysis of the VTF*STV interaction focused on splitting out the levels of STV and examined five one-way ANOVAs for the levels of VTF. Visual examination of the mean RMSEA values suggests a significant decrease in mean RMSEA values as STV increases and an increasing mean RMSEA as VTF ratios decreased (see Figure 4). Table IX presents the simple ANOVA effect on VTF for each STV level. Pairwise comparisons among the levels of VTF at each level of STV, e.g. VTF 4:1 vs. 7:1 at STV=2:1, were statistically significant with all p-values <0.0001, and the directional pattern in means confirmed the patterns seen in Figure 4.
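For reference, a commonly used sample definition of RMSEA is given below. The text does not state which formula or software was used, so this is an assumption; some programs use N in place of N - 1. Here \chi^2 and df are the model chi-square and its degrees of freedom, and N is the sample size.

```latex
% Common point estimate of RMSEA (assumed form; not stated in the text)
\[
  \mathrm{RMSEA} \;=\; \sqrt{\frac{\max\!\left(\chi^2 - df,\; 0\right)}{df\,(N - 1)}}
\]
```

Because N appears in the denominator, sampling noise in \chi^2 is scaled down as the sample grows, which is one reason consistent with the lower mean RMSEA observed at higher STV ratios.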

D. Non-Normed-Fit Index
Non-normed-fit index (or Tucker-Lewis Index, TLI) statistics were averaged over 1000 replications for the 60 cells in the three-way design and are presented in Table X. The overall ANOVA for TLI was statistically significant, p<0.0001. As can be seen in Table XI, there were statistically significant main effects and interactions, including a statistically significant triple interaction between h²*VTF*STV. Visual examination of the mean TLI values reveals a differential increase in mean TLI within levels of STV as a function of h² magnitude. In the high h² condition, the mean TLI values show minimal gains even between the 2:1 and 4:1 STV levels. Within the mixed and moderate h² conditions, the mean TLI values show asymptotic gains after STV>4:1. However, in the low h² condition, the mean TLI values were markedly lower at the 2:1 and 4:1 STV levels, only showing asymptotic values when STV>8:1 (see Figure 5). Post hoc analysis of the 3-way interaction of h²*VTF*STV first focused on the simple effect interaction of h²*VTF after blocking on STV, specifically the test of the 2-way interaction h²*VTF at each level of STV. The results revealed statistically significant 2-way interactions at all STV levels: STV=(2:1), F(6,11988)=160.34, p<0.0001; STV=(4:1), F(6,11988)=40.07, p<0.0001; STV=(8:1), F(6,11988)=18.51, p<0.0001; STV=(16:1), F(6,11988)=3.48, p=0.0019; and STV=(32:1), F(6,11988)=0.43, p=0.0115. Further analysis of the four-by-three 2-way interactions focused on the simple-simple effects of VTF, blocking on the STV*h² interaction. Specifically, the analysis examined differences in mean TLI values among VTF levels for each STV*h² combination. Table XII presents these findings.
Fig. 5. Non-normed-fit index mean values for the interaction between STV and VTF ratios at different levels of communalities.
Visual examination of the mean CFI values reveals a differential increase in mean CFI within the levels of STV as a function of h² magnitude. In the high h² condition, the mean CFI values show minimal gains even between the 2:1 and 4:1 STV levels. Within the mixed and moderate h² conditions, the mean CFI values show asymptotic gains after STV>4:1. However, in the low h² condition, the mean CFI values were markedly lower at the 2:1 and 4:1 STV levels, only showing asymptotic values when STV>8:1 (see Figure 6).
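Both TLI and CFI compare the fitted model with a baseline (independence) model. The sketch below implements their commonly used textbook definitions; the text does not state the formulas or software used, so this is an assumption, and the chi-square values in the example are invented purely for illustration.

```python
def tli(chi2_model, df_model, chi2_base, df_base):
    """Tucker-Lewis (non-normed fit) index from model and baseline chi-squares."""
    ratio_base = chi2_base / df_base
    ratio_model = chi2_model / df_model
    return (ratio_base - ratio_model) / (ratio_base - 1.0)


def cfi(chi2_model, df_model, chi2_base, df_base):
    """Comparative fit index based on estimated noncentrality."""
    d_model = max(chi2_model - df_model, 0.0)
    d_base = max(chi2_base - df_base, d_model, 0.0)
    if d_base == 0.0:
        return 1.0  # neither model shows misfit beyond its degrees of freedom
    return 1.0 - d_model / d_base


# Hypothetical values for a 20-indicator model (not results from the study).
example_tli = tli(200.0, 170, 1000.0, 190)
example_cfi = cfi(200.0, 170, 1000.0, 190)
print(round(example_tli, 3))  # 0.959
print(round(example_cfi, 3))  # 0.963
```

Unlike CFI, TLI is not bounded by 1, which is why it is called non-normed: it can exceed 1 when the model chi-square is smaller than its degrees of freedom.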
Post hoc analysis of the 3-way interaction h²*VTF*STV first focused on the simple effect interaction h²*VTF after blocking on STV, specifically the test of the 2-way interaction h²*VTF at each level of STV. The results revealed statistically significant 2-way interactions at all STV levels: STV=(2:1),
Fig. 6. Comparative-fit index mean values for the interaction between STV and VTF ratios at different levels of communalities.

VIII. DISCUSSION
The study findings refuted some of the guidelines found in the literature, e.g. authors in [18] reported that sample size was not an important factor in determining model stability, and authors in [4] reported that the subject-to-variable ratio should be no lower than 5. The results of the current study revealed that sample size had a strong effect on both the stability and the precision of the simulated models. For instance, when the VTF ratio was 4:1, the mean values of the data-model fit indices were adequate at STV ratios >= 4:1. However, the frequency of rejections based on conventional thresholds over the 1000 replications led to a different conclusion. The percentage of stable (invariant) models ranged from 77% at 4:1 STV to 91% at 32:1 STV, clearly indicating that larger STV ratios are related to higher model stability. These findings are consistent with the authors in [42], who reported that the percentage of invariance tests varied depending on the sample size of the group. The findings of the current study also agree with those in [43], where in some models an STV of 30:1 was needed to produce a stable model and minimize the amount of misfit.
The study findings also contradicted some previous research on the effect of the VTF ratio on the precision and stability of factor solutions. Authors in [18] concluded that the VTF ratio was important for factor stability, with more indicator variables per factor yielding more stable results. In the current study, there was a trend in the findings over all RQ1 analyses suggesting that data-model fit diminished as VTF increased. Most probably this is a result of the increasing complexity of the measurement models, i.e. the number of paths. For example, in the 10:1 VTF condition there are 100 estimated paths, whereas in the 4:1 condition there are 40 estimated paths. The accumulation of many small deviations from the population correlation matrix affected the global fit statistics more negatively in the high VTF conditions than in the low VTF conditions. The overall model fit indices TLI and CFI both varied as functions of VTF, STV, h², and their interaction. The chi-square value and the overall model fit indices GFI and RMSEA only varied as functions of VTF, STV, and their interaction.
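The path counts quoted above can be reproduced under one plausible reading, which is an assumption and not something the text spells out: each indicator loads on exactly one of the five orthogonal factors, factor variances are fixed to 1, and the "estimated paths" are the factor loadings plus the unique variances. The function name `cfa_counts` is hypothetical.

```python
def cfa_counts(vtf, n_factors=5):
    """Count indicators, free parameters, and degrees of freedom.

    Assumes a simple-structure orthogonal CFA: one loading and one unique
    variance per indicator, factor variances fixed to 1, no factor covariances.
    """
    indicators = vtf * n_factors
    free_parameters = 2 * indicators                 # loadings + unique variances
    moments = indicators * (indicators + 1) // 2     # unique variances/covariances
    return {"indicators": indicators,
            "free_parameters": free_parameters,
            "df": moments - free_parameters}


for vtf in (4, 7, 10):
    print(vtf, cfa_counts(vtf))
# 4  -> 20 indicators,  40 free parameters, df = 170
# 7  -> 35 indicators,  70 free parameters, df = 560
# 10 -> 50 indicators, 100 free parameters, df = 1175
```

Under this reading the model degrees of freedom grow roughly sevenfold between the 4:1 and 10:1 conditions, so there are many more covariances whose small sampling deviations can accumulate in the global fit statistics.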

IX. CONCLUSION
This study provided an empirical evaluation of the influence of ED and SD on WSV in terms of model precision relative to a known factor structure via Monte Carlo simulations. The results revealed that the mean χ² and RMSEA values significantly decreased as STV increased and increased as the VTF ratio decreased. In contrast, the mean GFI values increased significantly as STV increased and decreased as the VTF ratio increased.
Overall, when the VTF is 4:1, an STV ratio of 16:1 or above is required to show precision in the factor solution and stability of the model, as indicated by four fit indices. When the VTF ratio is 7:1 or 10:1, an STV ratio of 32:1 or above is required to show precision in the factor solution and stability of the model. If a researcher is interested in minimizing misfit or meeting more than four overall model fit index criteria, STV and VTF ratios greater than 32:1 and 10:1, respectively, would be necessary in order to obtain a precise and stable model.