A nonparametric approach to the truncated regression problem

Research output: Journal Publications and Reviews > RGC 21 - Publication in refereed journal > peer-review

40 Scopus Citations

Author(s)

  • Kwok-Leung Tsui
  • Nicholas P. Jewell
  • C. F. J. Wu

Detail(s)

Original language: English
Pages (from-to): 785-792
Journal / Publication: Journal of the American Statistical Association
Volume: 83
Issue number: 403
Publication status: Published - Sept 1988
Externally published: Yes

Abstract

A description is given of a new method of estimating the regression parameters in the linear regression model from data where the dependent variable is subject to truncation. The residual distribution is allowed to be unspecified. The method is iterative and involves estimation of the residual distribution under the truncated sampling scheme. The technique can be interpreted as an iterative bias adjustment of the observations in order to correct the regression relationship in the sampled population to match that of the model. A simulation study compares the performance of various estimators, including one suggested by Bhattacharya, Chernoff, and Yang (1983).

This truncated regression problem arises in many contexts of scientific and social research. In economics, Tobin (1958) analyzed household expenditure on durable goods using a regression model that took account of the fact that the expenditure is always nonnegative. A more general situation was studied by Hausman and Wise (1976, 1977) in connection with negative income-tax experiments. Another example, concerning the schooling and earnings of low achievers, was studied by Hansen, Weisbrod, and Scanlon (1970). There is also a controversy in astronomy involving Hubble’s law and Segal’s chronometric theory (Nicoll and Segal 1982; Turner 1979). Both theories predict a straight line relating the negative log of luminosity and the log of velocity, as measured by red shift, for celestial objects. The problem is complicated by the fact that objects of low luminosity are not visible, and hence all data relating to them are unobserved. Holgate (1965) described a biological example.

A truncated linear regression model is defined as y = xᵀβ + e, where x is a vector of covariates, β is the vector of parameters of interest, and e is independent of x with mean 0 and cumulative distribution F. The datum (x, y) is observed only if y ≤ y₀. The truncation point y₀ is known.
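The sampling scheme in this model can be illustrated with a short simulation. The following is a minimal sketch, not the authors' procedure: the function name `sample_truncated`, the uniform covariate design, and the default standard normal errors are all assumptions made here for illustration. The model itself leaves F unspecified, so any mean-zero error sampler can be passed in.

```python
import numpy as np

def sample_truncated(n, beta, y0, rng, draw_error=None):
    """Return n observations (X, y) from y = X @ beta + e, kept only when y <= y0.

    Hypothetical helper: the first column of X is an intercept, the remaining
    covariates are Uniform(-1, 1). Errors default to standard normal, but any
    mean-zero sampler can be supplied through draw_error(size).
    Note: the loop does not terminate if y0 is so low that nothing is retained.
    """
    beta = np.asarray(beta, dtype=float)
    if draw_error is None:
        draw_error = lambda size: rng.standard_normal(size)

    kept_x, kept_y, kept = [], [], 0
    while kept < n:
        # Draw a batch from the untruncated population.
        covariates = rng.uniform(-1.0, 1.0, size=(n, beta.size - 1))
        X = np.column_stack([np.ones(n), covariates])
        y = X @ beta + draw_error(n)
        # Truncation: pairs with y > y0 are never seen, covariates included.
        keep = y <= y0
        kept_x.append(X[keep])
        kept_y.append(y[keep])
        kept += int(keep.sum())

    return np.vstack(kept_x)[:n], np.concatenate(kept_y)[:n]

rng = np.random.default_rng(0)
X_obs, y_obs = sample_truncated(500, beta=[1.0, 2.0], y0=1.5, rng=rng)
```

Every retained response satisfies y ≤ y₀, and the analyst has no record of how many draws were discarded, which is the feature that separates truncation from censoring.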
Based on n independent observations (xᵢ, yᵢ) with yᵢ ≤ y₀, it is desired to estimate β and F. Note that this differs from the censored regression model, where data (x, y) with y > y₀ is observed but with the y value set to y₀. The procedures described in the article are easily extended to truncation from below and to the situation where the truncation points vary across observations.

It is straightforward to see that the ordinary least squares estimate of β is inconsistent. A common method of dealing with this problem is to assume that the error distribution F is Gaussian and proceed with standard parametric methods. In many applications this assumption may not be reasonable. Hence there is interest in developing nonparametric methods of estimation that do not rely on assumptions about F.

In this article a new approach for estimating β is introduced. The method allows the error distribution F to be arbitrary and is general enough to handle multiple linear regression. The rank-based method of Bhattacharya et al. (1983), which was designed for simple linear regression, is compared with the proposed method using a simulation study. The new approach appears to give estimators with good bias and efficiency properties in a wide variety of situations. © 1976 Taylor & Francis Group, LLC.
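The inconsistency of ordinary least squares under truncation, and the contrast with censoring, can be checked numerically. The sketch below is only an illustration under assumed settings (a single Uniform(-1, 1) covariate, standard normal errors, intercept 1, slope 2, truncation point 1.5); none of these values come from the article, and the code does not implement the proposed estimator.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed illustrative settings, not taken from the article.
beta_true = np.array([1.0, 2.0])   # intercept, slope
y0 = 1.5                           # known truncation point
n_pop = 200_000

x = rng.uniform(-1.0, 1.0, n_pop)
e = rng.standard_normal(n_pop)     # mean-zero errors
y = beta_true[0] + beta_true[1] * x + e

def ols(x, y):
    """Ordinary least squares fit of y on an intercept and x."""
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Full population: OLS recovers beta up to sampling noise.
coef_full = ols(x, y)

# Truncated sample: pairs with y > y0 are dropped entirely.
keep = y <= y0
coef_truncated = ols(x[keep], y[keep])

# Censored sample: every x is kept, but y is recorded as min(y, y0).
coef_censored = ols(x, np.minimum(y, y0))

print("full     :", coef_full)
print("truncated:", coef_truncated)
print("censored :", coef_censored)
```

Under these settings the slope fitted to the truncated sample falls well below the true value, and increasing the sample size does not remove the gap, because large responses are lost more often at large x.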

Research Area(s)

  • Distribution function estimator, Least squares, Linear regression, Truncation, Weighted median

Citation Format(s)

A nonparametric approach to the truncated regression problem. / Tsui, Kwok-Leung; Jewell, Nicholas P.; Wu, C. F. J.
In: Journal of the American Statistical Association, Vol. 83, No. 403, 09.1988, p. 785-792.
