# IV - Wabash College

AN ECONOMETRIC ANALYSIS OF THE INCOME DISPARITY BETWEEN EAST ASIAN AND SOUTH AMERICAN IMMIGRANT GROUPS IN THE UNITED STATES

Deepak Shrivastava

SUBMITTED TO PROF. BARRETO AND PROF. HOWLAND

IN PARTIAL COMPLETION FOR THE REQUIREMENTS FOR ECO 31

APRIL 17, 2000

ABSTRACT

This paper examines the income disparity between East Asian immigrants and South American immigrants in the United States. Using data from the March 1999 Current Population Survey, it shows that East Asian immigrants have, on average, a higher personal income than South American immigrants. However, this income differential is not due to American society being more receptive to one immigrant group than to another. Though immigrants may be discriminated against in the job market when competing against natives, there is no evidence of any favoritism or discrimination toward one immigrant group over another. The notion of the Asian “model minority” is shown to have no effect on the personal income of East Asian immigrants. Instead, the income differential is explained by exploring the disparity in educational attainment between East Asian and South American immigrant groups.

Table of Contents

I. Introduction

II. Literature Review

III. Theoretical Analysis

IV. Empirical Results

A. Data

B. Presentation and Interpretation of Empirical Analyses

V. Conclusions

Works Cited

I. Introduction

Numerous studies have measured the economic performance of various immigrant groups in the United States. Economic performance is usually measured by the total personal income of an immigrant. Several factors influence this level of personal income, such as the level of educational attainment, whether or not the immigrant has received citizenship, and the number of years spent in the United States, among others. Many studies show that the earnings of immigrants are significantly lower than those of natives. This income inequality between natives and immigrants can be explained by reasons such as discrimination, racism, and overall resentment of foreigners “stealing” jobs from the American-born population. It is therefore interesting to compare the economic performance of two immigrant groups from two vast and different regions of the world. The main focus of this paper is whether there is any income disparity between East Asian immigrants and South American immigrants and, if so, why this inequality exists.

One possible reason for the supposed inequality is that American society is more receptive to one immigrant group than to another. Are East Asian immigrant groups more “welcome” in the United States, so to speak, than South American immigrants? If so, many factors can play into this receptivity. Stereotypes in American society alleging that East Asians are keen in mathematics, or have a stronger work ethic, could very well give East Asian immigrants an edge over other immigrant groups in the US job market. I will attempt to determine whether such discrimination exists.

Certainly other factors, besides discrimination, contribute to the income disparity. The education of the immigrant groups plays a large role in determining average personal income. Years spent in the United States, citizenship status, age, marital status, and sex will all be controlled for in order to determine whether or not East Asians are more “welcomed” by American society than South American immigrants, as measured by their economic success.

II. Literature Review

Immigrant income has been the subject of writing for many economists. However, most of the papers I came across compared immigrant income to native income. It was difficult to find studies that focused solely on comparing immigrant groups to each other without regard to the native population. Nevertheless, the methodology these papers used in their empirical analyses aided me greatly in conducting my own analysis.

Since my paper focuses on East Asian immigrants and their economic performance versus South American immigrants, I found it very helpful to read “Education, Occupational Prestige, and Income of Asian Americans” by Herbert Barringer, David Takeuchi, and Peter Xenos. Starting from the notion of Asian Americans being labeled the “model minority”, they attribute the success of Asian Americans (measured by income) over other ethnic groups to the high levels of education in the Asian American community (which includes immigrants). If Asians are in fact looked upon as the “model minority”, this stereotype should work in favor of East Asian immigrants: US employers could favor East Asians over other immigrant groups. This paper also helped me to incorporate the “Assimilation Factor” in order to account for the time spent in the United States: the longer an immigrant stays in the US, the higher his or her income.

“Minority Concentration and Earnings Inequality: Blacks, Hispanics, and Asians Compared” by Marta Tienda and Ding-Tzann Lii focused on ethnic groups, but not on immigrants. This paper did, however, compare the incomes of the groups, and concluded that the inequality was due to the varying educational levels between the groups. This gave me a good idea of the importance of education in the income disparity between the immigrant groups in my study.

From papers that concentrated on earnings in general (not necessarily those of immigrants), I took the idea of using Age and Age^2 as variables in my regression analysis. Age has a positive effect on personal income. However, at higher ages the effect diminishes (thus the Age^2 component).

Following the same logic, I assumed that Years in US^2 would be a necessary variable in my analysis. The effect of an additional year in the US decreases once one has stayed for many years. For example, there might be a difference between an immigrant who has been in the US for 2 years and an immigrant who has been in the US for 8 years. However, there is not much difference (in the effect on personal income) between an immigrant who has been here for 32 years and one who has been here for 38 years. The effect of Years in the US decreases as one spends more and more time in the US.
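The diminishing effect described above can be made concrete with a short derivation. As a sketch, write the part of income that depends on years in the US as a quadratic, where $a$ and $b$ are hypothetical coefficients introduced here only for illustration:

```latex
\begin{aligned}
\text{Income} &= \dots + a\,\text{Years} + b\,\text{Years}^2 + \dots \\
\frac{\partial\,\text{Income}}{\partial\,\text{Years}} &= a + 2b\,\text{Years}
\end{aligned}
```

If $a > 0$ and $b < 0$, each additional year adds less income than the one before it, and the marginal effect reaches zero at $\text{Years} = -a/(2b)$. The same reasoning applies to Age and Age^2.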

My paper will concentrate on the income differential between East Asian immigrants and South American immigrants. If the claims above are true, there should be a real statistical difference in not only the incomes, but also the educational levels between the two groups. Furthermore, the “model minority” notion will be analyzed to see whether this special status exists, benefiting the East Asian immigrant community, and partially accounting for the income differential.

III. Theoretical Analysis

In measuring the economic performance of an immigrant group, numerous variables come into consideration. Economic performance, measured by the average personal income of an immigrant group, depends on several factors. The ability of an immigrant to attain a job depends on the amount of education attained, years in the United States, citizenship status, sex, marital status, age, and whether he or she resides in a metropolitan area.

After reading several papers on immigrant economic performance and earnings (mostly comparing them to natives’ earnings), I settled on the above factors as the independent variables in my regression equation. In predicting income, age, marital status, and being male are all positively related to personal income. When further analyzing immigrant income, citizenship status, years in the US, and urban residency are considered to be main contributing factors (all positively related) to personal income.

The main focus of this paper is to observe the disparity in income between two different immigrant groups. Thus, I included a dummy variable, East Asian. Those with a 1 for the variable were immigrants from China, Japan, South/North Korea, Taiwan, and Hong Kong. Those with a 0 for the variable were immigrants from Argentina, Bolivia, Brazil, Chile, Colombia, Ecuador, Guyana, Peru, Uruguay, and Venezuela.

Aside from the East Asian variable, education will have a significant impact on personal income. It is possible that the difference in the average amount of education attained by the two immigrant groups is statistically real, and not due to chance. This could play an important role in explaining why an income disparity between the two immigrant groups exists.

In order to determine whether being East Asian or South American affects the personal income of immigrants, it is vital to perform a regression analysis with the general formula given below:

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \dots + \beta_n X_n + \varepsilon$$

where Y is the dependent variable, in this case personal income, and X1 through Xn are the independent variables. The Betas are the true values of the coefficients on the independent variables, while the epsilon represents the error term (taking into account omitted variables, measurement error, and the chance process from my sample).

In order to model the error term to some precision, I utilized the Standard Econometric Gaussian Error Box Model. Certain assumptions concerning the box need to be stated:

1 – the average of the box is zero

2 – the error terms are identically distributed

3 – the errors are independent of each other

4 – the errors are not correlated with any of the independent variables

Provided these hold true, the Ordinary Least Squares (OLS) estimator I use for my regression analysis will be the best linear unbiased estimator. I use JMP to perform all of the regression analysis, and the other statistical tools provided by JMP will further aid me in analyzing the income disparity between East Asian and South American immigrants to the United States.
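To illustrate what "unbiased" means under the box model, the following is a minimal simulation sketch, not part of the original analysis. It assumes a single regressor, made-up true coefficients (`beta0`, `beta1`), and errors drawn from a Gaussian box with mean zero; all names are hypothetical. Across many repeated samples, the OLS slope estimates should average out close to the true slope.

```python
import random
import statistics


def ols_simple(x, y):
    """Return (intercept, slope) of the OLS line for a single regressor."""
    x_bar = statistics.fmean(x)
    y_bar = statistics.fmean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def simulate_slopes(n_samples=500, n_obs=200, beta0=5.0, beta1=2.0,
                    sd=10.0, seed=0):
    """Draw repeated samples from the box model and collect OLS slopes."""
    rng = random.Random(seed)
    x = [rng.uniform(0, 20) for _ in range(n_obs)]  # regressor held fixed
    slopes = []
    for _ in range(n_samples):
        # Errors: independent, identically distributed, mean zero,
        # and unrelated to x (the four box assumptions).
        y = [beta0 + beta1 * xi + rng.gauss(0.0, sd) for xi in x]
        slopes.append(ols_simple(x, y)[1])
    return slopes


slopes = simulate_slopes()
mean_slope = statistics.fmean(slopes)
print("average estimated slope:", round(mean_slope, 3))
```

Any single sample gives a slope that misses the true value by some chance error; it is the average over the sampling distribution that sits on the true value.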

IV. Empirical Results

A. Data

All data for my empirical results are from the March 1999 Supplement of the Current Population Survey (CPS). The CPS is a national survey administered monthly to 50,000 households across the United States. Conducted by the Bureau of the Census in conjunction with the Bureau of Labor Statistics, this survey has been carried out for the past 50 years. The sample is scientifically selected to represent the United States civilian non-institutional population and is regarded as the primary source of labor force characteristics. Respondents are interviewed, and data concerning the employment status of each member of the household over 15 years of age are obtained. It is important to note that this is the best data available, and the closest to a simple random sample.

The dependent variable in my analysis is Personal Income. This comprises a person’s total income for the previous year (1998), before taxes or any other deductions. Note, however, that my whole sample comprises only East Asian and South American immigrants to the United States. In my original sample there was a significant number of observations with zero (0) personal income. Because many immigrants might have immigrated very recently, it is possible that they have not had sufficient time to attain a job, but are still qualified for one and will most likely get one within a short period of time. As a result, out of my original sample of 1791 observations, I discarded the 263 whose personal income was “0”, leaving 1528 observations.

The independent variables included in my analysis are East Asian, Age, United States Citizen, Education, Ever Married, Female, Years in the United States, and Immigrated to Urban Area.

A table describing all variables is given below:

| Variable | Description |
|---|---|
| Personal Total Income | This is the dependent variable, showing the dollar amount of an observation’s total income for the year 1998, before taxes or any other deductions. |
| East Asian | A dummy variable signifying whether one is an East Asian immigrant or a South American immigrant. 1 = immigrated from China, Japan, South/North Korea, Taiwan, or Hong Kong; 0 = immigrated from Argentina, Bolivia, Brazil, Chile, Colombia, Ecuador, Guyana, Peru, Uruguay, or Venezuela. |
| Age | Age at the time the survey was taken (1999). Ages range from 15 to 90. |
| United States Citizen | Another dummy variable, signifying whether the immigrant is a naturalized citizen or still retains citizenship from the country of origin. 1 = naturalized foreign-born citizen; 0 = not a citizen. |
| Education | Years of educational attainment, re-coded from the original CPS codes (see JMP file). |
| Ever Married | Another dummy variable representing whether or not one has been married before. 1 = has been married before, including those who are separated, divorced, etc.; 0 = never been married. |
| Female | Another dummy variable representing sex. 1 = female; 0 = male. |
| Years in the United States | Years spent in the United States since immigration. |
| Immigrated to Urban Area | Another dummy variable. Represents whether a respondent immigrated to an urban area. 1 = urban; 0 = other. |

Description of Variables

The summary statistics of all the variables obtained from the MS Excel spreadsheet are shown below:

| Variable | Mean | SD | Min | Max | n |
|---|---|---|---|---|---|
| AGE | 42.8560209 | 15.0584177 | 15 | 90 | 1528 |
| EDUCATION | 13.4751309 | 3.51077068 | 1 | 21 | 1528 |
| YRS IN US | 17.8334424 | 11.114112 | 3 | 50 | 1528 |
| CITIZEN | 0.43 | 0.50 | 0 | 1 | 1528 |
| MARRIED | 0.79 | 0.41 | 0 | 1 | 1528 |
| FEMALE | 0.52 | 0.50 | 0 | 1 | 1528 |
| EAST ASIAN | 0.47 | 0.50 | 0 | 1 | 1528 |
| URBAN | 0.96 | 0.19 | 0 | 1 | 1528 |
| PERS INCOME | $ 27,222.49 | $ 36,584.26 | 1 | 434730 | 1528 |

B. Presentation and Interpretation of Empirical Analyses

In order to answer the question of whether there is a difference between the incomes of East Asian and South American immigrants, we cannot rely on our single Gaussian error box model. We must extend the single-box model to a two-box model. Instead of testing a hypothesis about a single population, we are interested in a comparison of two populations.

In this case, the two populations will be East Asians, with the dummy variable East Asian equal to one (1), and South Americans, with the dummy variable East Asian equal to zero (0). The statistics on Personal Income, broken down by East Asian and South American, are shown below:

| East Asian | Statistic | Personal Income |
|---|---|---|
| 0 | Average | $ 25,518.26 |
| 0 | StdDev | $ 34,361.83 |
| 0 | StdError | $ 879.05 |
| 0 | Max | $ 434,730.00 |
| 0 | Min | $ 1.00 |
| 1 | Average | $ 29,145.08 |
| 1 | StdDev | $ 38,874.11 |
| 1 | StdError | $ 994.49 |
| 1 | Max | $ 371,156.00 |
| 1 | Min | $ 1.00 |
| Total | Average | $ 27,222.49 |
| Total | StdDev | $ 36,584.26 |
| Total | Max | $ 434,730.00 |
| Total | Min | $ 1.00 |

The average personal income for East Asian immigrants in my sample was $29,145.08, give or take $994.49. The average personal income for South American immigrants in my sample was $25,518.26, give or take $879.05. Though the means of personal income differ by approximately $3,600 between the East Asian and South American immigrant groups in my sample, this alone is not conclusive evidence that the income disparity exists for the whole population. Given the large standard deviations (SDs) and the possibility of chance influencing my sample results, I cannot yet claim that East Asian immigrants, on average, have a higher personal income than South American immigrants do.

Here is where the two-box model is utilized. It is first necessary to set up a hypothesis test with both a null and an alternative:

Do East Asian immigrants to the US earn more than do South American immigrants to the US?

Null Hypothesis: The average personal income of East Asian immigrants is equal to the average personal income of South American immigrants.

Alternative Hypothesis: The average personal income of East Asian immigrants is higher than that of South American immigrants.

In order to test this hypothesis, a Standard Error (SE) of the difference is needed to obtain a z-statistic. A p-value is then obtained to determine whether the difference is real or due to chance.

By bootstrapping the sample SD for both our populations we can calculate the SE of each population’s sample average income. The formula for the SE of the sample difference is a function of the SE’s of both the populations’ sample average, shown below as:

$$\text{SE of the sample difference} = \sqrt{SE_{\text{East Asian}}^2 + SE_{\text{South American}}^2}$$

The SE of the sample difference attained was 1327.30. Next, we need to calculate the z-statistic using the following formula:

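$$z = \frac{\text{observed difference} - \text{expected difference under the null}}{\text{SE of the sample difference}}$$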

With the observed average income differential in our sample being $3626.82 and the SE of the difference being 1327.30, the z-statistic obtained is 2.7324. Using a normal distribution, the probability value of attaining such results is 0.345%. This means that, assuming there is no income differential between East Asian and South American immigrants, the probability of obtaining from our sample a mean income differential of $3626.82 or more is 0.345%. This p-value is extremely low, allowing us to reject the Null Hypothesis.
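The calculation above can be retraced from the group statistics reported earlier. The sketch below assumes that the SE of the difference is the square root of the sum of the two squared group SEs (an assumption that reproduces the reported 1327.30) and that the expected difference under the null is zero. Variable names are hypothetical. The one-sided normal tail computed here may differ slightly from the reported p-value, depending on the rounding and software used.

```python
import math
from statistics import NormalDist

# Group averages and standard errors as reported in the table of
# Personal Income by East Asian (1) and South American (0).
mean_ea, se_ea = 29145.08, 994.49
mean_sa, se_sa = 25518.26, 879.05

diff = mean_ea - mean_sa                      # observed difference in means
se_diff = math.sqrt(se_ea ** 2 + se_sa ** 2)  # SE of the difference (assumed form)
z = (diff - 0.0) / se_diff                    # null: expected difference is 0

# One-sided p-value: chance of a difference this large or larger under the null.
p_one_sided = 1.0 - NormalDist().cdf(z)

print(f"difference = {diff:.2f}")
print(f"SE of difference = {se_diff:.2f}")
print(f"z = {z:.4f}")
print(f"one-sided p = {p_one_sided:.4%}")
```

The test is one-sided because the alternative hypothesis states that East Asian income is higher, not merely different.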

Having rejected the Null Hypothesis, we accept the Alternative Hypothesis. The difference between the average personal income of East Asian immigrants to the United States and the average personal income of South American immigrants to the United States is real. East Asian immigrants to the United States have a higher average personal income than South American immigrants to the United States do.

The second part of my analysis explores whether being an East Asian immigrant or South American immigrant influences one’s personal income. We have already seen above that there exists a real difference in the average personal incomes of East Asian and South American immigrants. However, we need to control for confounding variables in order to determine whether being East Asian or South American has a significant impact on the amount of earnings an immigrant makes in the United States.

The general form of my regression equation to analyze the effect my independent variables have on my dependent variable of personal income is given below:

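$$\text{Personal Income} = \beta_0 + \beta_1 \text{Age} + \beta_2 \text{Citizen} + \beta_3 \text{EastAsian} + \beta_4 \text{Education} + \beta_5 \text{EverMarr} + \beta_6 \text{Female} + \beta_7 \text{YearsInUS} + \beta_8 \text{YearsInUS}^2 + \beta_9 \text{Urban} + \beta_{10} \text{Age}^2 + \beta_{11} (\text{Female} \times \text{EverMarr}) + \varepsilon$$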

Providing that:

βi = the coefficients on the independent variables

ε = the error term, taking into account omitted variables, measurement error, and the chance process
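To show how the squared and interaction terms in the regression equation are built from the raw variables, here is a small sketch. The function name `design_row` and the example respondent are hypothetical; the order of the regressors follows the equation above, and the dummy variables are coded 0 or 1 as in the variable table.

```python
def design_row(age, citizen, east_asian, education, ever_married,
               female, years_in_us, urban):
    """Return one row of regressors in the order of the regression equation.

    The leading 1.0 multiplies the intercept; dummy variables are 0 or 1.
    """
    return [
        1.0,                    # intercept (beta_0)
        age,                    # beta_1
        citizen,                # beta_2
        east_asian,             # beta_3
        education,              # beta_4
        ever_married,           # beta_5
        female,                 # beta_6
        years_in_us,            # beta_7
        years_in_us ** 2,       # beta_8: diminishing effect of years in US
        urban,                  # beta_9
        age ** 2,               # beta_10: diminishing effect of age
        female * ever_married,  # beta_11: 1 only for ever-married women
    ]


# Hypothetical respondent: a 40-year-old ever-married East Asian woman,
# naturalized, 16 years of education, 10 years in the US, urban resident.
row = design_row(age=40, citizen=1, east_asian=1, education=16,
                 ever_married=1, female=1, years_in_us=10, urban=1)
print(row)
```

The interaction term lets the effect of marriage on income differ between men and women, rather than forcing a single marriage effect on both.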

It is again necessary to state that the Standard Econometric Gaussian Error Box Model is being applied. For the model to be valid, I am working under certain assumptions. The following must be true in order for the Gaussian Error Box Model to hold and for my results to be valid:

1 – the average of the box is zero

2 – the error terms are identically distributed

3 – the errors are independent of each other

4 – the errors are not correlated with any of the independent variables

Provided that the above assumptions are all true, I will estimate the regression by Ordinary Least Squares (OLS). Under the Gaussian Error Box Model, the OLS estimator is the Best Linear Unbiased Estimator (BLUE). This means that the estimator will have its sampling distribution centered on the true value (β). In accordance with the Gauss-Markov Theorem, OLS will also have the smallest Standard Error when compared to all other linear unbiased estimators.

However, because I have obtained all of my data from the March 1999 Current Population Survey, I have theoretical reasons to believe that I will encounter heteroscedasticity in my cross-sectional data. Heteroscedasticity is present when the error terms are not identically distributed across the values of an independent variable, a violation of the 2nd assumption of the Gaussian Error Box Model (as explained above). If heteroscedasticity is indeed present, the plot of the residuals versus the independent variable should have a definite shape (a wedge, football, etc.) instead of the formless cloud expected under homoscedasticity.

When looking at Personal Income as a function of Education, and then plotting the residual, we have the following graph:

Eyeballing the plot, we see that the residuals (which are good estimations of the errors) seem to be unevenly distributed. As the amount of education increases, the spread of errors increases.

Thus we have reason to believe that there is heteroscedasticity present with our independent variable Education. In order to accurately detect heteroscedasticity (as opposed to simply eyeballing), we need to refer to the Goldfeld-Quandt test. This test-statistic will allow us to determine how likely it is that the difference in the size of the residuals of the two groups (in this case, immigrants with “high” educational attainment and immigrants with “low” educational attainment) is due to chance. In my sample, I organized the lower third of the sample (sorted by educational attainment) as the low-dispersion group. The high-dispersion group was the upper third of my sample with higher educational attainment. Applying my regression model to both groups, we can obtain the Goldfeld-Quandt test statistic with the following formula:

G-Q test Statistic = [pic]

Where:

RSS = Residual Sum of Squares from Group 1 (the low-dispersion group) and Group 2 (the high-dispersion group).

[pic] = the number of observations in the supposedly low dispersion group

[pic] = the number of observations in the supposedly high-dispersion group

Under the Null hypothesis, that there is no heteroscedasticity present, the G-Q test statistic should be a little more than 1, under an F-distribution. Regarding Education as the independent variable in question, the GQ statistic I obtained was 8.1479 with a p-value virtually 0 (obtained from JMP). Hence we can reject the null that there is no heteroscedasticity present, and proceed to take steps to correct the unevenly distributed error terms.
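The mechanics of the test can be sketched on synthetic data. This is a hedged illustration, not the computation run in JMP: it assumes the common textbook form of the statistic, the ratio of RSS/(n − k) in the high-dispersion group to the same quantity in the low-dispersion group, where k is the number of estimated coefficients. It also uses a single regressor rather than the full model, and all names and data are hypothetical.

```python
import random
import statistics


def rss_simple(x, y):
    """Residual sum of squares from an OLS fit of y on one regressor x."""
    x_bar, y_bar = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    return sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))


def goldfeld_quandt(x, y, k=2):
    """G-Q statistic comparing the upper and lower thirds of the sample.

    k is the number of coefficients estimated in each group
    (here intercept and slope).
    """
    pairs = sorted(zip(x, y))  # sort observations by the regressor
    third = len(pairs) // 3
    low, high = pairs[:third], pairs[-third:]
    rss_low = rss_simple([p[0] for p in low], [p[1] for p in low])
    rss_high = rss_simple([p[0] for p in high], [p[1] for p in high])
    return (rss_high / (len(high) - k)) / (rss_low / (len(low) - k))


rng = random.Random(42)
x = [rng.uniform(1, 21) for _ in range(600)]  # stand-in for years of education
# In the first series the error SD grows with x; in the second it is constant.
y_hetero = [5 + 2 * xi + rng.gauss(0, 0.5 * xi) for xi in x]
y_homo = [5 + 2 * xi + rng.gauss(0, 5.0) for xi in x]

gq_hetero = goldfeld_quandt(x, y_hetero)
gq_homo = goldfeld_quandt(x, y_homo)
print("G-Q with growing error spread:", round(gq_hetero, 2))
print("G-Q with constant error spread:", round(gq_homo, 2))
```

A statistic far above 1 signals that the residual spread in the upper third is much larger than in the lower third. A p-value would come from the F-distribution with (n − k) degrees of freedom for each group, which this sketch leaves out.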

Since heteroscedasticity is present, inference based on the plain OLS regression is not reliable. The estimates of the Beta-coefficients themselves will be unbiased, but the reported standard errors will be wrong. In order to return to the Gaussian Error Box Model, we need a weighting term by which to multiply the whole regression model to account for the unevenly distributed errors. Using trial and error, I finally came up with a weighting term of:

[pic], used as the weighting variable when running the regression in JMP.

The GQ-statistic obtained after using the weight was 1.0798 with a p-value of 0.1959. We cannot reject the null hypothesis here because of our favorable, relatively large (over 5%) p-value. Hence, we can conclude that we have transformed the regression equation successfully to take into account the heteroscedasticity present in education.
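The idea behind the weighting can be sketched as follows. The actual weight used in this paper is not reproduced here; the sketch assumes a hypothetical weight of 1/x on synthetic data whose error SD is proportional to x, and a single regressor. Multiplying every term of the model by the weight gives transformed errors with roughly constant spread, which a rough upper-third versus lower-third comparison of squared residuals should confirm. This comparison is a simplified check, not the full G-Q test.

```python
import random


def wls_line(x, y, w):
    """Fit y = b0 + b1*x after multiplying every term by the weight w_i.

    The transformed model is (w*y) = b0*w + b1*(w*x) + (w*error). It has
    no separate intercept, so the 2x2 normal equations are solved directly.
    """
    z1 = [wi * xi for wi, xi in zip(w, x)]  # transformed regressor
    yt = [wi * yi for wi, yi in zip(w, y)]  # transformed dependent variable
    a = sum(wi * wi for wi in w)
    b = sum(wi * zi for wi, zi in zip(w, z1))
    d = sum(zi * zi for zi in z1)
    p = sum(wi * yi for wi, yi in zip(w, yt))
    q = sum(zi * yi for zi, yi in zip(z1, yt))
    det = a * d - b * b
    return (p * d - b * q) / det, (a * q - b * p) / det


def spread_ratio(x, y, w):
    """Mean squared weighted residual in the top third over the bottom third."""
    b0, b1 = wls_line(x, y, w)
    resid = sorted((xi, wi * (yi - b0 - b1 * xi))
                   for xi, yi, wi in zip(x, y, w))
    third = len(resid) // 3
    low = sum(r * r for _, r in resid[:third]) / third
    high = sum(r * r for _, r in resid[-third:]) / third
    return high / low


rng = random.Random(1)
x = [rng.uniform(1, 21) for _ in range(600)]
y = [5 + 2 * xi + rng.gauss(0, 0.5 * xi) for xi in x]  # error SD grows with x

ratio_unweighted = spread_ratio(x, y, [1.0] * len(x))      # plain OLS
ratio_weighted = spread_ratio(x, y, [1.0 / xi for xi in x])  # hypothetical weight
print("spread ratio, unweighted:", round(ratio_unweighted, 2))
print("spread ratio, weighted:", round(ratio_weighted, 2))
```

Note that statistical packages differ in how a weight variable enters the fit (some weight the squared residuals rather than the residuals themselves), so a weight found by trial and error in one package may need to be squared or square-rooted in another.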

Again, it is important to note that because of the nature of cross-sectional data, heteroscedasticity will violate the assumptions of the Gaussian Error Box Model. Education was not the only variable in which heteroscedasticity was present.

By eyeballing, the residual plot of Age seems to be unevenly distributed:

Looking at the residual plot of Years in the United States, it seems as though heteroscedasticity might be present here as well:

I applied the same technique to Age and Years in the United States, calculating the GQ-test statistic in order to determine whether heteroscedasticity is indeed present. After both tested positive for heteroscedasticity, I again used trial and error to obtain a weight that would give GQ-test statistics close to one (1) and p-values over 5%. The table below displays the GQ-test statistics and p-values (before and after the weight was applied to the whole regression) and the actual weight utilized:

|Variable |GQ |p-value |Weight |GQ |p-value |

| |(before weighted) |(before weighted) | |(after weighted) |(after weighted) |

|Education |8.1479 | ................

................
