
REGRESSION MODELS

2.0 ESTIMATION OF PARAMETERS USING THE LSE METHOD

We use the simple regression model of the form Yi = β0 + β1Xi + εi, where β0 and β1 are the parameters to be estimated and εi is the error term.

2.1 BASIC ASSUMPTIONS OF THE MODEL

These are referred to as the basic classical assumptions and include:

· Normality of the error term.

· The error term has zero mean, i.e. E(εi) = 0.

· Constant variance (homoscedasticity), i.e. E(εi²) = σ².

· Non-autocorrelation: the error terms are uncorrelated, i.e. Cov(εi, εj) = 0 for i ≠ j.

· The regression model is linear in parameters.

· Zero covariance between the error term and the explanatory variable, i.e. E(εiXi) = 0.

· Non-stochastic explanatory variable. The values of x are fixed in repeated samples.

· The number of observations n must be greater than the number of parameters to be estimated. Alternatively, the number of observations n must be greater than the number of explanatory variables.

· Variability in x values. The x values in a given sample must not all be the same.

· The regression model is correctly specified. Alternatively, there is no specification bias or error in the model used in empirical analysis.

· There is no perfect multicollinearity, that is, there is no perfect linear relationship among the explanatory variables.
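As a quick illustration, the short Python simulation below generates data that satisfy these assumptions. It is a minimal sketch: the sample size, parameter values, and error variance are illustrative choices, not values from these notes.

```python
import numpy as np

# Minimal sketch: simulate data satisfying the classical assumptions.
# All numbers here (n, beta0, beta1, sigma) are illustrative choices.
rng = np.random.default_rng(0)

n = 50                      # n exceeds the number of parameters (2)
x = np.linspace(1, 10, n)   # fixed (non-stochastic) x values with variability
beta0, beta1 = 2.0, 0.5     # model is linear in the parameters
sigma = 1.0                 # constant error variance (homoscedasticity)

# Errors are normal with zero mean and constant variance, drawn
# independently of one another (non-autocorrelation) and independently
# of x (zero covariance with the regressor).
e = rng.normal(0.0, sigma, size=n)

y = beta0 + beta1 * x + e   # correctly specified simple regression model
```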

2.2 THE SIGNIFICANCE OF THE STOCHASTIC DISTURBANCE TERM

 

The disturbance term is a surrogate for all those variables that are omitted from the model but that collectively affect Y. The reasons are many.

 

1. Vagueness of theory: The theory, if any, determining the behavior of Y may be, and often is, incomplete.

 

2. Unavailability of data: Even if we know what some of the excluded variables are and therefore consider a multiple regression rather than a simple regression, we may not have quantitative information about these variables.

 

3. Core variables versus peripheral variables: Assume in a consumption-income example that besides income X1, the number of children per family X2, sex X3, religion X4, education X5, and geographical region X6 also affect consumption expenditure. But it is quite possible that the joint influence of all or some of these variables may be so small, and at best nonsystematic or random, that as a practical matter and for cost considerations it does not pay to introduce them into the model explicitly. One hopes that their combined effect can be treated as a random variable.

 

4. Intrinsic randomness in human behavior: Even if we succeed in introducing all the relevant variables into the model, there is bound to be some "intrinsic" randomness in individual Y's that cannot be explained no matter how hard we try. The disturbances may very well reflect this intrinsic randomness.

 

5. Poor proxy variables: Although the classical regression model assumes that the variables Y and X are measured accurately, in practice the data may be plagued by errors of measurement.

 

6. Principle of parsimony: We would like to keep our regression model as simple as possible. If we can explain the behavior of Y "substantially" with two or three explanatory variables, and if our theory is not strong enough to suggest what other variables might be included, why introduce more variables? Let the error term represent all other variables. Of course, we should not exclude relevant and important variables just to keep the regression model simple.

 

7. Wrong functional form: Even if we have theoretically correct variables explaining a phenomenon, and even if we can obtain data on these variables, very often we do not know the form of the functional relationship between the regressand and the regressors. Is consumption expenditure a linear function of income or a nonlinear one? If it is the former, Yi = β0 + β1Xi + εi is the proper functional relationship between Y and X, but if it is the latter, Yi = β0 + β1Xi + β2Xi² + εi may be the correct functional form. In two-variable models the functional form of the relationship can often be judged from the scattergram. But in a multiple regression model it is not easy to determine the appropriate functional form, for graphically we cannot visualize scattergrams in multiple dimensions.
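As a rough illustration of the functional-form problem, the Python sketch below fits both a linear and a quadratic model to the same simulated data and compares their residual sums of squares. The data-generating values are illustrative assumptions, not taken from these notes.

```python
import numpy as np

# Minimal sketch: compare a linear and a quadratic fit to the same data.
# The true relationship here is deliberately quadratic (illustrative values).
rng = np.random.default_rng(1)
x = np.linspace(0, 5, 40)
y = 1.0 + 0.8 * x + 0.3 * x**2 + rng.normal(0, 0.5, x.size)

# np.polyfit performs a least-squares polynomial fit of the given degree.
linear = np.polyfit(x, y, deg=1)     # Yi = b0 + b1*Xi
quadratic = np.polyfit(x, y, deg=2)  # Yi = b0 + b1*Xi + b2*Xi^2

rss_lin = np.sum((y - np.polyval(linear, x)) ** 2)
rss_quad = np.sum((y - np.polyval(quadratic, x)) ** 2)
print(f"RSS linear:    {rss_lin:.2f}")
print(f"RSS quadratic: {rss_quad:.2f}")  # noticeably smaller for these data
```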

 

Using the LSE method:

· From the model Yi = β0 + β1Xi + εi, make the error term the subject: εi = Yi − β0 − β1Xi.

· Form the sum of squares: Σεi² = Σ(Yi − β0 − β1Xi)².

· Minimize the sum of squares: differentiate it with respect to each parameter and equate the results to zero.

· Form the normal regression equations, ΣYi = nβ0 + β1ΣXi and ΣXiYi = β0ΣXi + β1ΣXi², and solve for the parameters using any method, e.g. the matrix method, as sketched below.
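The following Python sketch works through these steps for the simple model, solving the normal equations in matrix form, (X'X)b = X'y. The data values are illustrative assumptions, not from these notes.

```python
import numpy as np

# Minimal sketch of the LSE steps for Yi = b0 + b1*Xi + ei.
# The (x, y) values below are illustrative, not from the notes.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 2.9, 3.8, 5.2, 5.9])

X = np.column_stack([np.ones_like(x), x])  # design matrix [1, x]

# Solve the normal equations (X'X) b = X'y for b = (b0, b1).
b = np.linalg.solve(X.T @ X, X.T @ y)

residuals = y - X @ b        # ei = Yi - b0 - b1*Xi
rss = residuals @ residuals  # the minimized sum of squares

print(f"b0 = {b[0]:.3f}, b1 = {b[1]:.3f}, RSS = {rss:.3f}")
```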

 
