SOME COUNTRIES PRODUCE SO MUCH MORE OUTPUT PER WORKER THAN OTHERS: Instruments 3

Our data on languages come from two sources: Hunter (1992) and, to a lesser extent, Gunnemark (1991).17 We use two language variables: the fraction of a country’s population speaking one of the five primary Western European languages (including English) as a mother tongue, and the fraction speaking English as a mother tongue. We therefore allow English and the other languages to have separate effects.
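As a rough illustration of how the two language variables relate, the sketch below computes both fractions from a table of mother-tongue shares for a single country. The function name, the example shares, and the specific list of five languages are assumptions made for illustration; they are not taken from Hunter (1992) or Gunnemark (1991).

```python
# Hypothetical illustration only. The language list below is an assumption
# about which five Western European languages are meant; the shares are
# invented and do not describe any real country.
WESTERN_EUROPEAN = {"English", "French", "German", "Portuguese", "Spanish"}


def language_variables(mother_tongue_shares):
    """Return (european_fraction, english_fraction) for one country.

    mother_tongue_shares maps a language name to the fraction of the
    population speaking it as a mother tongue.
    """
    european_fraction = sum(
        share
        for language, share in mother_tongue_shares.items()
        if language in WESTERN_EUROPEAN
    )
    english_fraction = mother_tongue_shares.get("English", 0.0)
    return european_fraction, english_fraction


# Invented example: 60% English, 25% French, 15% other languages.
example = {"English": 0.60, "French": 0.25, "Other": 0.15}
eur_frac, eng_frac = language_variables(example)
```

Because the English fraction is contained in the broader fraction, entering both variables lets the English coefficient capture any effect of English beyond that of the other four languages.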

Finally, we also use as an instrument the variable constructed by Frankel and Romer (1996): the (log) predicted trade share of an economy, based on a gravity model of international trade that uses only a country’s population and geographical features.
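To make the idea of a predicted trade share concrete, here is a minimal sketch on synthetic data. It assumes a simplified gravity equation in which the log bilateral trade share depends only on log distance and the log populations of the two countries; this is not Frankel and Romer's actual specification, and every number, coefficient, and variable name below is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6  # hypothetical number of countries

# Synthetic population and locations (illustrative only, not real data).
log_pop = rng.normal(2.0, 1.0, size=n)
coords = rng.uniform(0.0, 10.0, size=(n, 2))

# One row per ordered country pair (i, j) with i != j.
pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
i_idx = np.array([p[0] for p in pairs])
j_idx = np.array([p[1] for p in pairs])
log_dist = np.log(np.linalg.norm(coords[i_idx] - coords[j_idx], axis=1))

# Regressors: constant, log distance, log own and log partner population.
X = np.column_stack(
    [np.ones(len(pairs)), log_dist, log_pop[i_idx], log_pop[j_idx]]
)

# Synthetic "observed" log bilateral trade shares, generated from assumed
# coefficients plus noise.
assumed_beta = np.array([-3.0, -0.8, -0.2, 0.6])
log_bilateral_share = X @ assumed_beta + rng.normal(0.0, 0.3, size=len(pairs))

# Step 1: estimate the simplified gravity equation by least squares.
beta_hat, *_ = np.linalg.lstsq(X, log_bilateral_share, rcond=None)

# Step 2: sum each country's fitted bilateral shares over its partners,
# then take logs to obtain the log predicted trade share.
fitted_bilateral = np.exp(X @ beta_hat)
predicted_share = np.bincount(i_idx, weights=fitted_bilateral, minlength=n)
log_predicted_trade_share = np.log(predicted_share)
```

The point of the construction is that the fitted values depend only on population and geography, so the resulting variable does not use any information on a country's actual trade or income.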

Our data set includes 127 countries for which we were able to construct measures of the physical capital stock using the Summers and Heston data set. For these 127 countries, we were also able to obtain data on the primary languages spoken, geographic information, and the Frankel-Romer predicted trade share. However, missing data was a problem for four variables: 16 countries in our sample were missing data on the openness variable, 17 were missing data on the GADP variable, 27 were missing data on educational attainment, and 15 were missing data on the mining share of GDP. We imputed values for these missing data using the 79 countries for which we