National Healthcare Quality Report, 2008
Appendix A: Statistical Methods
This section explains the statistical methods and gives formulas for the calculation of standard errors and hypothesis tests. These statistics are derived from multiple databases: the NIS, the SID, and Claritas (a vendor that compiles and adds value to U.S. Census Bureau data). For NIS estimates, the standard errors are calculated as described in the HCUP report titled "Calculating Nationwide Inpatient Sample (NIS) Variances" (Houchens et al., 2005). We refer to this report simply as the NIS Variance Report throughout this section. This method, implemented with the SAS procedure PROC SURVEYMEANS, takes into account the cluster and stratification aspects of the NIS sample design. For population counts based on Claritas data, there is no sampling error.
Even though the NIS contains discharges from a finite sample of hospitals and most of the SID databases contain nearly all discharges from nearly all hospitals in the State, we treat the samples as though they were drawn from an infinite population. We do not employ finite population correction factors in estimating standard errors. We take this approach because we view the outcomes as a result of myriad processes that go into treatment decisions rather than being the result of specific, fixed processes generating outcomes for a specific population and a specific year. We consider the NIS and SID to be samples from a "super-population" for purposes of variance estimation. Further, we assume the counts (of QI events) to be binomial.
Section 1. Area Population QIs Using Claritas Population Data
a. Standard error estimates for discharge rates per 100,000 population using the 2005 Claritas population data.
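The observed rate was calculated as follows:
\[ R = 100{,}000 \times \frac{S}{N} = 100{,}000 \times \frac{\sum_{i} w_i x_i}{N} \qquad \text{(A.1)} \]
Here N is the Claritas population count, and w_i and x_i are, respectively, the discharge weight and the variable of interest for patient i in the NIS or SID. To obtain the estimate of the weighted sum S and its standard error, SE_S, we followed instructions in the NIS Variance Report (modified for the SID, as explained above).
The population count in the denominator is a constant. Consequently, the standard error of the rate R was calculated as:
\[ \mathrm{SE}_R = 100{,}000 \times \frac{\mathrm{SE}_S}{N} \qquad \text{(A.2)} \]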
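As an illustration only (the report's own calculations were done in SAS), the scaling described above can be sketched in Python. The function name and the input values are hypothetical; the weighted sum S and its standard error SE_S are assumed to have already been produced by design-based survey software.

```python
def rate_per_100k(weighted_sum, se_weighted_sum, population):
    """Return (rate, standard error) per 100,000 population.

    weighted_sum     -- S, the discharge-weighted count of events
    se_weighted_sum  -- SE_S, the design-based standard error of S
    population       -- N, a population count treated as a constant
    """
    rate = 100_000 * weighted_sum / population
    # N has no sampling error, so the SE is scaled by the same constant.
    se_rate = 100_000 * se_weighted_sum / population
    return rate, se_rate


# Hypothetical inputs, not taken from the report.
rate, se = rate_per_100k(weighted_sum=12_500.0,
                         se_weighted_sum=400.0,
                         population=5_000_000)
print(rate, se)
```

Because the denominator is a constant, no ratio-estimator variance formula is needed at this step; all sampling variability enters through SE_S.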
b. Standard error estimates for age/sex adjusted inpatient rates per 100,000 population using the 2005 Claritas data.
We adjusted rates for age and sex using the method of direct standardization (Fleiss, 1973). We estimated the observed rates for each of 36 age/sex categories (described in Appendix C, Age Groupings for Risk Adjustment). We then calculated a weighted average of those 36 rates using weights proportional to the percentage of a standard population in each cell. Therefore, the adjusted rate represents the rate that would be expected for the observed study population if it had the same age and sex distribution as the standard population.
For the standard population, we used the age and sex distribution of the United States as a whole in the year 2000. Because all rates were calculated with this common age and sex distribution, differences among adjusted rates should not, in theory, be attributable to differences in the age and sex distributions of the comparison groups.
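The adjusted rate was calculated as follows (and subsequently multiplied by 100,000):
\[ A = \frac{\sum_{g=1}^{36} N_{g,\mathrm{std}} \left( \dfrac{\sum_{i=1}^{n(g)} w_{g,i}\, x_{g,i}}{N_{g,\mathrm{obs}}} \right)}{\sum_{g=1}^{36} N_{g,\mathrm{std}}} = \frac{S^{*}}{N_{\mathrm{std}}} \qquad \text{(A.3)} \]
g = Index for the 36 age/sex cells.
N_{g,std} = Standard population for cell g (year 2000 total U.S. population in cell g).
N_{g,obs} = Observed population for cell g (year 2005 subpopulation in cell g; e.g., females, State of California).
N_{std} = Total standard population, the sum of N_{g,std} over the 36 cells.
n(g) = Number in the sample for cell g.
x_{g,i} = Observed quality indicator for observation i in cell g (e.g., 0 or 1 indicator).
w_{g,i} = NIS or SID discharge weight for observation i in cell g.
S* = Numerator of the adjusted rate, the sum over all cells and observations of (N_{g,std} / N_{g,obs}) w_{g,i} x_{g,i}.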
The estimates for the numerator, S*, and its standard error, SE_{S*}, were calculated in similar fashion to the unadjusted estimates for the numerator S in formula A.1. The only difference was that the weight for patient i in cell g was redefined to combine the weighting for direct standardization with the discharge weight:
\[ w^{*}_{g,i} = \frac{N_{g,\mathrm{std}}}{N_{g,\mathrm{obs}}} \times w_{g,i} \qquad \text{(A.4)} \]
Following instructions in the NIS Variance Report (modified for the SID, as explained above), we used PROC SURVEYMEANS to obtain the estimate of S* (A.3), the weighted sum in the numerator using the revised weights (A.4), and the estimate SE_{S*}, the standard error of the weighted sum S*. The denominator of the rate, the total standard population N_{std} (the sum of N_{g,std} over all cells), is a constant. Therefore, the standard error of the adjusted rate, A, was calculated as
\[ \mathrm{SE}_A = \frac{\mathrm{SE}_{S^{*}}}{N_{\mathrm{std}}} \]
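The direct standardization described above can be sketched in Python as a check on the arithmetic. This is a minimal illustration under stated assumptions: the data layout, the function names, and the two-cell example (rather than 36 cells) are hypothetical. The standard error SE_{S*} is not computed here, because it requires design-based survey software such as PROC SURVEYMEANS.

```python
def adjusted_rate_revised_weights(cells):
    """Adjusted rate via revised weights w* = (N_std / N_obs) * w.

    cells -- list of dicts with keys:
        "n_std"   standard population in the cell
        "n_obs"   observed population in the cell
        "records" list of (discharge_weight, indicator) pairs
    """
    n_std_total = sum(c["n_std"] for c in cells)
    s_star = 0.0
    for c in cells:
        ratio = c["n_std"] / c["n_obs"]
        s_star += sum(ratio * w * x for w, x in c["records"])
    return s_star / n_std_total


def adjusted_rate_weighted_average(cells):
    """Same quantity, written as a weighted average of cell rates."""
    n_std_total = sum(c["n_std"] for c in cells)
    total = 0.0
    for c in cells:
        cell_rate = sum(w * x for w, x in c["records"]) / c["n_obs"]
        total += (c["n_std"] / n_std_total) * cell_rate
    return total


# Hypothetical two-cell example.
example_cells = [
    {"n_std": 600, "n_obs": 300,
     "records": [(10, 1), (10, 0), (10, 1)]},
    {"n_std": 400, "n_obs": 100,
     "records": [(20, 1), (20, 0)]},
]
a_weights = adjusted_rate_revised_weights(example_cells)
a_average = adjusted_rate_weighted_average(example_cells)
print(a_weights, a_average, 100_000 * a_weights)
```

The two functions agree, which is why a single weighted sum with revised weights is enough to obtain both the adjusted rate and, through the survey procedure, its standard error.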
Section 2. Provider-Based QIs Using Weighted Discharge Data (SID and NIS)
a. Standard error estimates for inpatient rates per 1,000 discharges using discharge counts in both the numerator and the denominator.
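We calculated the observed rate as follows:
\[ R = 1{,}000 \times \frac{S}{N} = 1{,}000 \times \frac{\sum_{i} w_i x_i}{\sum_{i} w_i} \]
Here w_i and x_i are the discharge weight and the quality indicator for discharge i, so that S is the weighted number of events and N is the weighted number of discharges. Following instructions in the HCUP NIS Variance Report (modified for the SID, as explained above), we used PROC SURVEYMEANS to obtain estimates of the discharge-weighted mean, S/N, and the standard error of that weighted mean, SE_{S/N}. We multiplied this standard error by 1,000.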
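For illustration, the point estimate of the discharge-weighted rate can be sketched in Python as follows. The function name and the inputs are hypothetical, and the standard error is deliberately omitted: both the numerator and the denominator are estimated here, so the SE of the weighted mean has to come from design-based survey software.

```python
def observed_rate_per_1000(weights, indicators):
    """Discharge-weighted mean of a 0/1 indicator, per 1,000 discharges."""
    weighted_events = sum(w * x for w, x in zip(weights, indicators))  # S
    weighted_discharges = sum(weights)                                 # N
    return 1_000 * weighted_events / weighted_discharges


# Hypothetical sample of three discharges.
example_rate = observed_rate_per_1000([5.0, 5.0, 10.0], [1, 0, 0])
print(example_rate)
```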
b. Standard error estimates for age/sex adjusted inpatient rates per 1,000 discharges using inpatient counts in both the numerator and the denominator.
We used the full NIS estimates for the standard inpatient population age-sex distribution. For each of the 36 age-sex categories, we estimated the number of U.S. inpatient discharges, \hat{N}_{g,std}, in category g. We calculated the directly adjusted rate:
\[ A = \frac{\sum_{g=1}^{36} \hat{N}_{g,\mathrm{std}} \left( \dfrac{\sum_{i=1}^{n(g)} w_{g,i}\, x_{g,i}}{\sum_{i=1}^{n(g)} w_{g,i}} \right)}{\sum_{g=1}^{36} \hat{N}_{g,\mathrm{std}}} \]
g = Index for the 36 age/sex cells.
\hat{N}_{g,std} = Standard inpatient population for cell g (NIS estimate of the total U.S. inpatient population for cell g).
n(g) = Number in the sample for cell g.
x_{g,i} = Observed quality indicator for observation i in cell g.
w_{g,i} = NIS or SID discharge weight for observation i in cell g.
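Note that \hat{N}_{g,std} / \sum_{h} \hat{N}_{h,std} is the proportion of the standard inpatient population in cell g. Consequently, the adjusted rate is a weighted average of the cell-specific rates with cell weights equal to these proportions. These cell weights are merely a convenient, reasonable standard inpatient population distribution for the direct standardization. Therefore, we treat these cell weights as constants in the variance calculations:
\[ \mathrm{Var}(A) = \sum_{g=1}^{36} \left( \frac{\hat{N}_{g,\mathrm{std}}}{\sum_{h=1}^{36} \hat{N}_{h,\mathrm{std}}} \right)^{2} \mathrm{Var}\!\left( \frac{\sum_{i=1}^{n(g)} w_{g,i}\, x_{g,i}}{\sum_{i=1}^{n(g)} w_{g,i}} \right) \]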
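The variance of the ratio enclosed in parentheses, the cell-specific rate R_g, was estimated separately for each cell g by squaring the SE calculated using the method for unadjusted rates given earlier in this section (Section 2.a):
\[ \widehat{\mathrm{Var}}(R_g) = \mathrm{SE}_{R_g}^{2}, \qquad R_g = \frac{\sum_{i=1}^{n(g)} w_{g,i}\, x_{g,i}}{\sum_{i=1}^{n(g)} w_{g,i}} \]
Following instructions in the HCUP NIS Variance Report (modified for the SID, as explained above), we used PROC SURVEYMEANS to obtain estimates of the weighted means, R_g, and their standard errors.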
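The combination step described above can be sketched in Python. This is an illustration under our reading of the text, namely that the cell weights are constants and that the variance of the adjusted rate is the sum of the squared cell weights times the cell variances, with no covariance terms between cells. The function name and the two-cell inputs are hypothetical; the cell rates R_g and their standard errors are assumed to come from survey software.

```python
import math


def adjusted_rate_and_se(n_std, cell_rates, cell_ses):
    """Combine cell-level rates and SEs into an adjusted rate and its SE.

    n_std      -- standard inpatient population for each cell
    cell_rates -- weighted mean R_g for each cell
    cell_ses   -- design-based standard error of R_g for each cell
    """
    total = sum(n_std)
    shares = [n / total for n in n_std]  # constant cell weights
    rate = sum(p * r for p, r in zip(shares, cell_rates))
    # Constants come out of the variance squared; cell variances are summed.
    variance = sum((p * se) ** 2 for p, se in zip(shares, cell_ses))
    return rate, math.sqrt(variance)


# Hypothetical two-cell example (the report uses 36 cells).
adj_rate, adj_se = adjusted_rate_and_se(
    n_std=[600, 400],
    cell_rates=[0.05, 0.10],
    cell_ses=[0.01, 0.02],
)
print(1_000 * adj_rate, 1_000 * adj_se)
```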
Section 3. Significance Tests
Let R_1 and R_2 be either observed or adjusted rates calculated for comparison groups 1 and 2, respectively. Let SE_1 and SE_2 be the corresponding standard errors for the two rates. We calculated the test statistic and (two-sided) p-value:
\[ t = \frac{R_1 - R_2}{\sqrt{\mathrm{SE}_1^{2} + \mathrm{SE}_2^{2}}}, \qquad p = 2 \times \Pr\left( Z > |t| \right) \]
where Z is a standard normal variate.
Note: the following functions calculate p in SAS and EXCEL:
SAS: p = 2 * (1 - PROBNORM(ABS(t)));
EXCEL: = 2*(1- NORMDIST(ABS(t),0,1,TRUE))
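An equivalent calculation can be sketched in Python using only the standard library. The sketch assumes the test statistic is the difference between the two rates divided by the square root of the sum of their squared standard errors, the usual form for two independent estimates. The function name and the input values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_test(r1, se1, r2, se2):
    """Return (t, p) for comparing two rates with known standard errors."""
    t = (r1 - r2) / sqrt(se1 ** 2 + se2 ** 2)
    # Same expression as the SAS and Excel formulas: 2 * (1 - Phi(|t|)).
    p = 2 * (1 - NormalDist().cdf(abs(t)))
    return t, p


# Hypothetical rates per 100,000 and their standard errors.
t_stat, p_value = two_sided_test(r1=250.0, se1=8.0, r2=230.0, se2=6.0)
print(t_stat, round(p_value, 4))
```

The rates and standard errors must be on the same scale (for example, both per 100,000) before they are passed to the function.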