.4863924 -4.67 0.000 -3.225139 -1.31656
------------------------------------------------------------------------------

2. Explanation of Constructed Variables

logearnings = log(earnings). The ‘l’ prefix on an independent variable indicates its natural log. For example, ls = log(s), ltenure = log(tenure), etc.

3. Interpretation

Conditional on the other characteristics that influence earnings, ‘catgov’, the indicator variable for whether an individual is employed in the government sector, has a negative but statistically insignificant coefficient. There is therefore no evidence that mean hourly earnings differ systematically between government-sector workers and everyone else. ...
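Because the dependent variable is log(earnings), the coefficients discussed in this section can be read as percent differences in earnings. The following is a minimal Python sketch of that conversion, not part of the original Stata analysis. The function names and the coefficient values are hypothetical, chosen only for illustration; they are not estimates from the regression table.

```python
import math


def dummy_percent_effect(coef):
    """Percent difference in earnings implied by the coefficient on an
    indicator variable when the dependent variable is log(earnings)."""
    return 100.0 * (math.exp(coef) - 1.0)


def log_regressor_percent_effect(coef, pct_change):
    """Percent change in earnings implied by a pct_change percent rise in a
    regressor that enters in logs (coef is then an elasticity)."""
    return 100.0 * ((1.0 + pct_change / 100.0) ** coef - 1.0)


# Hypothetical coefficients, NOT taken from the report's output.
example_dummy_coef = -0.10  # e.g. an indicator such as 'catgov'
example_log_coef = 0.50     # e.g. a logged regressor such as 'ls'

print(round(dummy_percent_effect(example_dummy_coef), 2))
print(round(log_regressor_percent_effect(example_log_coef, 10.0), 2))
```

For small coefficients the exact figure is close to 100 times the coefficient, which is why indicator coefficients in a log-earnings model are often read directly as approximate percent differences.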

Weight in 2002, however, has a negative impact on hourly earnings. Similarly, being female leads to a systematically lower mean wage, as does being employed in a major US region.

The final model was chosen on the following grounds. First, the R-squared and adjusted R-squared were highest under this specification. Note from the table above that the R-squared is approximately 0.36, implying that 36 percent of the variability in the natural logarithm of earnings is explained by the included independent variables. For a cross-sectional model, this is reasonably good performance. Second, the included variables are the only ones that had statistically significant effects. Finally, hours and tenure are likely to depend on hourly earnings: higher hourly earnings are likely to motivate an individual to work longer hours and to accumulate longer tenure. Because of this possible endogeneity, these variables are not included as regressors.

4. Diagnostic tests for normality and heteroskedasticity

The variable ‘earnings’ in levels is much more asymmetric, and thus farther from a normal distribution, than earnings in natural logarithms. This is shown in Figures 1 and 2 below. Figure 1 presents the histogram of earnings in levels, and Figure 2 presents the histogram of earnings in natural logs. Evidently the second figure is a much closer approximation to a normal distribution.

Figure 1: Distribution of earnings in levels

Figure 2: Histogram of earnings in natural logs

However, even the logarithm of earnings is not exactly normally distributed. A test based on skewness and kurtosis (‘sktest’ in Stata) rejects the null hypothesis of a normal distribution. A