-
v1.0
可编辑可修改
Chapter 1
1. Econometrics: the social science in which the tools of economic theory, mathematics, and statistical inference are applied to the analysis of economic phenomena. A second definition: econometrics, the result of a certain outlook on the role of economics, consists of the application of mathematical statistics to economic data to lend empirical support to the models constructed by mathematical economics and to obtain numerical results.
2. Econometric analysis proceeds along the following lines:
1) Creating a statement of theory or hypothesis.
2) Collecting data.
3) Specifying the mathematical model of theory.
4) Specifying the statistical, or econometric, model of theory.
5) Estimating the parameters of the chosen econometric model.
6) Checking for model adequacy: model specification testing.
7) Testing the hypothesis derived from the model.
8) Using the model for prediction or forecasting.
Step 2: Collecting data
Three types of data are available for analysis:
1) Time series data: collected over a period of time, at regular intervals.
2) Cross-sectional data: collected on one or more variables at a single point in time.
3) Pooled data: a combination of time series and cross-sectional data.
Step 3: Specifying the mathematical model
1. Plot a scatter diagram, or scattergram.
2. Write the mathematical model.
Step 4: Specifying the statistical, or econometric, model
CLFPR is the dependent variable. CUNR is the independent, or explanatory, variable. We add a catchall variable u to stand for all the neglected factors.
In linear regression analysis our primary objective is to explain the behavior of the dependent variable in relation to the behavior of one or more other variables, allowing for the fact that the relationship between them is inexact.
Step 5: Estimating the parameters of the econometric model
In short, the estimated regression line gives the relationship between average CLFPR and CUNR, that is, how the dependent variable responds, on average, to a unit change in the independent variable.
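The estimation step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the notes' own procedure: the function name `ols_two_variable` and the five data points are invented for the example, and the formulas are the standard least-squares expressions for a two-variable model.

```python
from statistics import mean

def ols_two_variable(x, y):
    """Return (b1, b2) for the fitted line y_hat = b1 + b2 * x."""
    x_bar, y_bar = mean(x), mean(y)
    # Sum of cross-products and sum of squares in deviation form
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b2 = sxy / sxx              # slope: average response of y to a unit change in x
    b1 = y_bar - b2 * x_bar     # intercept: the line passes through (x_bar, y_bar)
    return b1, b2

# Hypothetical data, invented for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [5.1, 6.9, 9.2, 10.8, 13.0]
b1, b2 = ols_two_variable(x, y)
print(f"b1 = {b1:.2f}, b2 = {b2:.2f}")
```

The slope b2 is the number the notes describe: how much the dependent variable changes, on average, per unit change in the independent variable.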
Step 6: Checking model adequacy: model specification testing
The purpose of developing an econometric model is not to capture total reality, but just its salient features.
Step 7: Testing the hypothesis derived from the model
Why do we perform hypothesis testing? We want to find out whether the estimated model makes economic sense and whether the results obtained conform with the underlying economic theory.
Chapter 2
1. The meaning of regression: regression analysis is concerned with the study of the relationship between one variable, called the dependent or explained variable, and one or more other variables, called independent or explanatory variables.
2. Objectives of regression
1) Estimate the mean, or average, value of the dependent variable given the values of the independent variables.
2) Test hypotheses about the nature of the dependence, that is, hypotheses suggested by the underlying economic theory.
3) Predict or forecast the mean value of the dependent variable given the values of the independent variables.
4) One or more of the preceding objectives combined.
3. Population regression line (PRL): in short, the PRL tells us how the mean, or average, value of Y is related to each value of X in the whole population.
4. The dependence of Y on X is technically called the regression of Y on X.
5. How do we explain it? A student's score, say that of the ith individual, corresponding to a specific family income can be expressed as the sum of two components:
1) The mean score at that income level, which can be called the systematic, or deterministic, component.
2) The error term u, which may be called the nonsystematic, or random, component.
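The two-component decomposition can be written out in symbols. This is a sketch that assumes the linear PRF with parameters B1 and B2 used elsewhere in these notes; the underbrace labels are added here for illustration.

```latex
Y_i = \underbrace{B_1 + B_2 X_i}_{\text{systematic: } E(Y \mid X_i)}
      + \underbrace{u_i}_{\text{random}}
```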
6. What is the nature of the u (stochastic error) term?
1) The error term may represent the influence of those variables that are not explicitly included in the model.
2) Some intrinsic randomness in the math score is bound to occur that cannot be explained even if we include all relevant variables.
3) u may also represent errors of measurement.
4) The principle of Ockham's razor, that descriptions be kept as simple as possible until proved inadequate, suggests that we keep our regression model as simple as possible.
7. How do we estimate the PRF (population regression function)? Unfortunately, in practice we rarely have the entire population at our disposal; often we have only a sample from this population.
8. Granted that the SRF (sample regression function) is only an approximation of the PRF, can we find a method or a procedure that will make this approximation as close as possible?
9. Special meaning of "linear"
1) Linearity in the variables: the conditional mean value of the dependent variable is a linear function of the independent variables.
2) Linearity in the parameters: the conditional mean of the dependent variable is a linear function of the parameters, the B's; it may or may not be linear in the variables.
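The distinction is easier to see with concrete functional forms. The three specifications below are illustrative assumptions, not taken from the notes.

```latex
% Linear in the parameters and in the variable X
E(Y \mid X_i) = B_1 + B_2 X_i
% Linear in the parameters, nonlinear in the variable X
E(Y \mid X_i) = B_1 + B_2 X_i^2
% Linear in the variable X, nonlinear in the parameter B_2
E(Y \mid X_i) = B_1 + B_2^2 X_i
```

Only the first two count as linear regression models in the sense used here, because linearity in the parameters is what matters.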
Chapter 3
1. Unless we are willing to assume how the stochastic u terms are generated, we will not be able to tell how good an SRF is as an estimate of the true PRF.
2. Classical Linear Regression Model (CLRM)
1) Assumption 1: The regression model is linear in the parameters. It may or may not be linear in the variables.
2) Assumption 2: The explanatory variable X is uncorrelated with the disturbance term u. The X's are nonstochastic; u is stochastic.
3) Assumption 3: Given the value of Xi, the expected, or mean, value of the disturbance term u is zero. The disturbance u represents all those factors that are not specifically introduced in the model.
4) Assumption 4: The variance of each ui is constant, or homoscedastic.
Homoscedasticity:
a. This assumption simply means that the conditional distribution of each Y population corresponding to the given value of X has the same variance.
b. The individual Y values are spread around their mean values with the same variance.
5) Assumption 5: There is no correlation between two error terms. This is the assumption of no autocorrelation.
6) Assumption 6: The regression model is correctly specified. There is no specification bias or specification error in the model. This assumption can be explained informally as follows: an econometric investigation begins with the specification of the econometric model underlying the phenomenon of interest.
3. Variances and standard errors of OLS estimators: one immediate result of the assumptions introduced is that they enable us to estimate the variances and standard errors of the OLS estimators.
4. We should know: the variances of the estimators, the standard errors of the estimators, and the value of σ². The homoscedastic σ² is estimated from the sample by a formula based on the residuals.
5. Standard error of the regression (SER): simply the standard deviation of the Y values about the estimated regression line.
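The formulas themselves did not survive in these notes. The sketch below therefore uses the standard textbook expressions for the two-variable model as an assumption: σ̂² = Σe²/(n − 2), var(b2) = σ̂²/Σx², and var(b1) = σ̂²ΣX²/(nΣx²), where lowercase x denotes deviations from the mean. The function name and data are hypothetical.

```python
import math
from statistics import mean

def ols_with_standard_errors(x, y):
    """Two-variable OLS with the usual variance estimates (textbook formulas)."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b2 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b1 = y_bar - b2 * x_bar
    residuals = [yi - (b1 + b2 * xi) for xi, yi in zip(x, y)]
    # Estimator of the homoscedastic error variance, with n - 2 degrees of freedom
    sigma2_hat = sum(e ** 2 for e in residuals) / (n - 2)
    return {
        "b1": b1,
        "b2": b2,
        "sigma2_hat": sigma2_hat,
        "ser": math.sqrt(sigma2_hat),  # standard error of the regression
        "se_b1": math.sqrt(sigma2_hat * sum(xi ** 2 for xi in x) / (n * sxx)),
        "se_b2": math.sqrt(sigma2_hat / sxx),
    }

# Hypothetical data, invented for illustration only
result = ols_with_standard_errors([1.0, 2.0, 3.0, 4.0, 5.0],
                                  [5.1, 6.9, 9.2, 10.8, 13.0])
print({k: round(v, 4) for k, v in result.items()})
```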
6. Summary of the math score function
1) Interpretation: the standard deviation, or standard error, of b2 is a measure of the variability of b2 from sample to sample. If we can say that our computed b2 lies within a certain number of standard deviation units from the true B2, we can state with some confidence how good the computed SRF is as an estimator of the true PRF.
2) Sampling distribution: once we determine the sampling distribution of our two estimators, the task of hypothesis testing becomes straightforward.
7. Why do we use OLS? The properties of OLS estimators: the method of OLS is popular not only because it is easy to use but also because it has some strong theoretical properties.
8. Gauss-Markov theorem: given the assumptions of the classical linear regression model (CLRM), the OLS estimators have minimum variance in the class of linear unbiased estimators; that is, the OLS estimators are BLUE (best linear unbiased estimators).
9. BLUE property
1) b1 and b2 are linear estimators.
2) They are unbiased, that is, E(b1) = B1 and E(b2) = B2.
3) The OLS estimator of the error variance is unbiased.
4) b1 and b2 are efficient: Var(b1) is less than the variance of any other linear unbiased estimator of B1, and Var(b2) is less than the variance of any other linear unbiased estimator of B2.
10. Monte Carlo simulation
Do the experiment in the lab by drawing normally distributed random errors:
In Excel: =NORMINV(RAND(),0,2)
In MATLAB: norminv(rand(),MU,SIGMA)
In Stata: invnorm(uniform())
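The same experiment can be sketched in Python. The true parameter values, the sample size of 10, the 2000 replications, and the seed are all assumptions chosen for illustration; the error standard deviation of 2 mirrors the Excel call. If OLS is unbiased, the average of the estimated slopes should land close to the true slope.

```python
import random
from statistics import mean

def simulate_slopes(n_reps=2000, b1_true=1.5, b2_true=2.0, sigma=2.0, seed=12345):
    """Re-estimate the slope on many samples that differ only in their random errors."""
    rng = random.Random(seed)
    x = [float(i) for i in range(1, 11)]   # X values held fixed (nonstochastic)
    x_bar = mean(x)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slopes = []
    for _ in range(n_reps):
        # Draw fresh N(0, sigma^2) errors and build a new sample of Y values
        y = [b1_true + b2_true * xi + rng.gauss(0.0, sigma) for xi in x]
        y_bar = mean(y)
        b2 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
        slopes.append(b2)
    return slopes

slopes = simulate_slopes()
avg_slope = mean(slopes)
print(f"average estimated slope over {len(slopes)} samples: {avg_slope:.3f}")
```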
11. Central limit theorem (CLT): if there is a large number of independent and identically distributed (iid) random variables, then, with a few exceptions, the distribution of their sum tends to be a normal distribution as the number of such variables increases indefinitely.
12. u, the error term, represents the influence of all those forces that affect Y but are not specifically included in the regression model because there are so many of them and the individual effect of any one such force on Y may be too minor. If all these forces are random, and if we let u represent the sum of all these forces, then by invoking the CLT we can assume that the error term u follows the normal distribution.
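A small simulation shows the central limit theorem at work. This is a sketch in which each of the many minor "forces" is assumed, purely for illustration, to be an independent Uniform(0, 1) draw; the number of terms, number of replications, and seed are arbitrary choices. If the standardized sums are approximately normal, about 95 percent of them should fall within ±1.96.

```python
import random
from statistics import mean, pstdev

def standardized_sums(n_terms=48, n_reps=5000, seed=2024):
    """Standardize sums of n_terms iid Uniform(0, 1) draws."""
    rng = random.Random(seed)
    mu, var = 0.5, 1.0 / 12.0            # mean and variance of Uniform(0, 1)
    scale = (n_terms * var) ** 0.5       # standard deviation of the sum
    return [
        (sum(rng.random() for _ in range(n_terms)) - n_terms * mu) / scale
        for _ in range(n_reps)
    ]

z = standardized_sums()
share_within = sum(1 for v in z if abs(v) <= 1.96) / len(z)
print(f"mean = {mean(z):.3f}, sd = {pstdev(z):.3f}, share within 1.96 = {share_within:.3f}")
```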
13. Another property of the normal distribution: any linear function of a normally distributed variable is itself normally distributed.
14. Hypothesis testing: knowing the distribution of the OLS estimators b1 and b2, we can proceed to the topic of hypothesis testing.
15. Null hypothesis
16. The "zero" null hypothesis is deliberately chosen to find out whether Y is related to X at all; it is also called the straw man hypothesis.
17. We need some formal testing procedure to reject or not reject the null hypothesis and to silence the skeptics.
18. If our null hypothesis is B2 = 0 and the computed b2 = 0.0013, we can find out the probability of obtaining such a value from Z, the standard normal distribution. If the probability is very small, we can reject the null hypothesis. If the probability is larger, say, greater than 10 percent, we may not reject the null hypothesis.
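19. We don't know σ². To use the normal distribution we must know the true σ², but we can estimate it by using its estimator $\hat{\sigma}^2$.
20. What will happen if we replace σ by its estimator $\hat{\sigma}$?
$$ t = \frac{b_2 - B_2}{\hat{\sigma} / \sqrt{\sum x_i^2}} \sim t_{n-2} $$
or, more generally,
$$ t = \frac{b_2 - B_2}{se(b_2)} \sim t_{n-2} $$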
21. Let us assume that α, the level of significance or the probability of committing a type I error, is fixed at 5 percent.
[Figure: density f(t) of the t distribution for a two-sided test, with central area (1 − α) between −t_c and t_c and a rejection region of area α/2 in each tail.]
a. This is a 95% confidence interval for B2.
b. In repeated applications, 95 out of 100 such intervals will include the true B2.
c. Such a confidence interval is known as the region of acceptance (of H0), and the area outside the confidence interval is known as the rejection region (of H0).
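A 95% confidence interval of this kind can be computed directly. The sketch below assumes the usual form b2 ± t_c · se(b2) and uses the slope estimate (0.0013), its standard error (0.000245), and the 8 degrees of freedom reported for the math score regression in these notes. The critical value 2.306 comes from a standard t table for 8 d.f. at the 5 percent two-sided level; it is not given in the notes.

```python
def confidence_interval(estimate, se, t_crit):
    """Return (lower, upper) for estimate +/- t_crit * se."""
    half_width = t_crit * se
    return estimate - half_width, estimate + half_width

b2, se_b2 = 0.0013, 0.000245   # reported slope and its standard error
T_CRIT_5PCT_8DF = 2.306        # two-sided 5% critical t value for 8 d.f. (t table)

lower, upper = confidence_interval(b2, se_b2, T_CRIT_5PCT_8DF)
contains_zero = lower <= 0.0 <= upper
print(f"95% CI for B2: ({lower:.6f}, {upper:.6f}); contains 0: {contains_zero}")
```

Because the interval excludes zero, the null hypothesis B2 = 0 falls in the rejection region.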
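24. Hypothesis testing of regression coefficients
Purpose: in simple linear regression, to test whether X really has a significant effect on Y.
Review of basic concepts: critical values and probabilities; high-probability events and low-probability events.
For the significance level α, the critical value is $t_{\alpha}$ (one-sided) or $t_{\alpha/2}$ (two-sided). The computed statistic is $t^*$.
[Figure: t distribution with the central area 1 − α (high-probability event) between $-t_{\alpha/2}$ and $t_{\alpha/2}$, the two tails (low-probability events) beyond them, and the computed statistic $t^*$ marked on the horizontal axis.]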
25. Conclusions: since this interval does not include the null-hypothesized value of 0, we can reject the null hypothesis that annual family income is not related to math scores. Put positively, income does have a relationship to math scores.
26. A cautionary note: although the statement given is true, we cannot say that the probability is 95 percent that this particular interval includes B2, for this interval is not a random interval; it is fixed. Therefore, the probability is either 1 or 0 that the interval includes B2. We can only say that if we construct 100 intervals like this one, 95 out of 100 such intervals will include the true B2. We cannot guarantee that this particular interval will necessarily include B2.
27. The test of significance approach to hypothesis testing: this approach rests on two key concepts, a test statistic and the sampling distribution of the test statistic under the null hypothesis, H0. The decision to accept or reject H0 is made on the basis of the value of the test statistic obtained from the sample data.
28. t test: we can use the t value computed here as the test statistic, which follows the t distribution with (n − 2) d.f.
29. Instead of arbitrarily choosing the α value, we can find the p value (the exact level of significance) and reject the null hypothesis if the computed p value is sufficiently low.
30. Conclusions: in the case of the two-sided t test, if the computed |t|, the absolute value of t, exceeds the critical t value at the chosen level of significance, we can reject the null hypothesis.
31. p value: the p value of the t statistic of 5.4354 is about 0.0006. The smaller the p value, the more confident we are when we reject the null hypothesis. Thus if we were to reject the null hypothesis that the true slope coefficient is zero at this p value, we would be wrong in six out of ten thousand occasions.
32. How can we compute t? We first compute the t value under the null hypothesis that B2 = 0:
$$ t = \frac{0.0013 - 0}{0.000245} = 5.4354 $$
(5.4354 is the reported value; the rounded figures shown here give about 5.31.)
Since this value exceeds any of the critical values shown in the preceding table, following the rules laid down, we can reject the hypothesis that annual family income has no relationship to math scores.
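The decision rule can be sketched as follows. The estimate, standard error, and 8 degrees of freedom come from the regression reported in these notes. The two-sided critical values for 8 d.f. are taken from a standard t table, since the notes refer to such a table without reproducing it. With the rounded inputs the computed t is about 5.31 rather than the reported 5.4354, presumably because the notes round b2.

```python
def t_statistic(estimate, hypothesized, se):
    """t = (estimate - hypothesized value) / standard error."""
    return (estimate - hypothesized) / se

# Two-sided critical t values for 8 d.f. (standard t table; an assumption here)
CRITICAL_T_8DF = {0.10: 1.860, 0.05: 2.306, 0.01: 3.355}

t_value = t_statistic(0.0013, 0.0, 0.000245)
# Reject H0: B2 = 0 at level alpha when |t| exceeds the critical value
decisions = {alpha: abs(t_value) > crit for alpha, crit in CRITICAL_T_8DF.items()}
print(f"t = {t_value:.4f}")
for alpha, reject in decisions.items():
    print(f"alpha = {alpha:.2f}: {'reject' if reject else 'do not reject'} H0")
```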
33. How good is the fitted regression line: the coefficient of determination r². On the basis of the t test, both the estimated intercept and slope coefficients are statistically significant (i.e., significantly different from zero), which suggests that the SRF seems to "fit" the data "reasonably" well.
34. Coefficient of determination: can we develop an overall measure of "goodness of fit" that will tell us how well the estimated regression line fits the actual Y values? Such a measure has been developed and is known as the coefficient of determination.
$$ Y_i = \hat{Y}_i + e_i $$
35. Rearrange it:
$$ Y_i - \bar{Y} = \hat{Y}_i - \bar{Y} + e_i $$
$$ (Y_i - \bar{Y}) = (\hat{Y}_i - \bar{Y}) + (Y_i - \hat{Y}_i) $$
36. Decomposition
1. $(Y_i - \bar{Y})$: variation in $Y_i$ from its mean value.
2. $(\hat{Y}_i - \bar{Y})$: variation in $Y_i$ explained by X, that is, the variation of $\hat{Y}_i$ around its mean value (note: the mean of $\hat{Y}_i$ equals $\bar{Y}$).
3. $(Y_i - \hat{Y}_i)$: unexplained, or residual, variation.
37. In deviation form:
$$ (Y_i - \bar{Y}) = (\hat{Y}_i - \bar{Y}) + (Y_i - \hat{Y}_i) $$
$$ y_i = \hat{y}_i + e_i $$
1. $y_i = \hat{y}_i + e_i = (\hat{Y}_i - \bar{Y}) + e_i = (b_1 + b_2 X_i) - (b_1 + b_2 \bar{X}) + e_i = b_2 (X_i - \bar{X}) + e_i$
2. $y_i = b_2 x_i + e_i$
38. Square both sides and sum:
$$ \sum y_i^2 = \sum \hat{y}_i^2 + \sum e_i^2 $$
$$ \sum y_i^2 = b_2^2 \sum x_i^2 + \sum e_i^2 $$
39. $\sum y_i^2$ = the total variation of the actual Y values about their sample mean $\bar{Y}$, which may be called the total sum of squares (TSS).
$\sum \hat{y}_i^2$ = the total variation of the estimated Y values about their mean value, which may appropriately be called the sum of squares due to regression (i.e., due to the explanatory variables), or simply the explained sum of squares (ESS).
$\sum e_i^2$ = the unexplained variation, the residual sum of squares (RSS).
40. Put simply:
$$ TSS = ESS + RSS $$
The total variation in the observed Y values about their mean value can be partitioned into two parts, one attributable to the regression line and the other to random forces, because not all actual Y observations lie on the fitted line.
41. ESS vs RSS
a. If the chosen SRF fits the data quite well, ESS should be much larger than RSS.
b. If the SRF fits the data poorly, RSS will be much larger than ESS.
42. Let us define:
$$ r^2 = \frac{ESS}{TSS} $$
43. r²: the sample coefficient of determination. r² measures the proportion or percentage of the total variation in Y explained by the regression model. It is the most commonly used measure of the goodness of fit of a regression line.
44. Properties of r²
a. It is a non-negative quantity.
b. Its limits are 0 ≤ r² ≤ 1, since a part (ESS) cannot be greater than the whole (TSS). An r² of 1 means a "perfect fit", for the entire variation in Y is explained by the regression. An r² of zero means no relationship between Y and X whatsoever.
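The decomposition TSS = ESS + RSS and the resulting r² can be checked numerically. This is a minimal sketch; the function name and the five data points are invented for illustration.

```python
from statistics import mean

def sums_of_squares(x, y):
    """Return (TSS, ESS, RSS, r2) for a two-variable OLS fit."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b2 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b1 = y_bar - b2 * x_bar
    y_hat = [b1 + b2 * xi for xi in x]
    tss = sum((yi - y_bar) ** 2 for yi in y)               # total variation
    ess = sum((yh - y_bar) ** 2 for yh in y_hat)           # explained variation
    rss = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # residual variation
    return tss, ess, rss, ess / tss

# Hypothetical data, invented for illustration only
tss, ess, rss, r2 = sums_of_squares([1.0, 2.0, 3.0, 4.0, 5.0],
                                    [5.1, 6.9, 9.2, 10.8, 13.0])
print(f"TSS = {tss:.3f}, ESS = {ess:.3f}, RSS = {rss:.3f}, r2 = {r2:.4f}")
```

Here ESS is far larger than RSS, so r² is close to 1, which matches the "fits quite well" case described in the notes.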
45. Reporting the results
$$ \hat{Y}_i = 432.4138 + 0.0013 X_i $$
se = (16.9061) (0.000245)
t = (25.5774) (5.4354)
p value = (5.85 × 10⁻⁹) (0.0006)
r² = 0.7849, d.f. = 8
46. Interpretation
a. The figures in the first set of parentheses are the estimated standard errors (se) of the estimated regression coefficients.
b. Those in the second set of parentheses are the estimated t values computed under the null hypothesis that the population value of each regression coefficient individually is zero; that is, the t values are simply the ratios of the estimated coefficients to their standard errors.
c. Those in the third set of parentheses are the p values of the computed t values.
47. As a matter of convention: from now on, if we do not specify a specific null hypothesis, we will assume that it is the zero null hypothesis, that is, that the population parameter is zero.
48. p value: by quoting the p values we can determine the exact level of significance of the estimated t value. The lower the p value, the greater the evidence against the null hypothesis.
49. A warning: when deciding whether to reject or not reject a null hypothesis, determine beforehand what level of the p value you are willing to accept (the critical p value) and then compare the computed p value with it. If the computed p value is smaller than the critical p value, the null hypothesis can be rejected. If it is greater than the critical p value, the null hypothesis may not be rejected.
50. Error term: normality test. Our statistical testing procedure is based on the assumption that the error term ui is normally distributed.
51. Normality test: the Jarque-Bera (JB) test
$$ JB = \frac{n}{6} \left[ S^2 + \frac{(K - 3)^2}{4} \right] $$
S represents skewness and K represents kurtosis. Under the null hypothesis of normality, the JB statistic asymptotically follows the chi-square distribution with 2 d.f. If the computed chi-square value exceeds the critical chi-square value for 2 d.f. at the chosen level of significance, we reject the null hypothesis of normal distribution. If it does not exceed the critical chi-square value, we may not reject the null hypothesis.
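The JB statistic is straightforward to compute from a set of residuals. The sketch below assumes the usual moment-based definitions of skewness and kurtosis (dividing by n). The five "residuals" are invented for illustration, and the 5 percent critical chi-square value for 2 d.f. (5.991) comes from a standard table rather than from these notes.

```python
def jarque_bera(residuals):
    """Return (skewness S, kurtosis K, JB statistic) using moment estimators."""
    n = len(residuals)
    m = sum(residuals) / n
    m2 = sum((e - m) ** 2 for e in residuals) / n   # second central moment
    m3 = sum((e - m) ** 3 for e in residuals) / n   # third central moment
    m4 = sum((e - m) ** 4 for e in residuals) / n   # fourth central moment
    s = m3 / m2 ** 1.5
    k = m4 / m2 ** 2
    jb = n / 6.0 * (s ** 2 + (k - 3.0) ** 2 / 4.0)
    return s, k, jb

CHI2_CRIT_5PCT_2DF = 5.991   # from a chi-square table (assumption, not in the notes)

# Hypothetical residuals, invented for illustration only
s, k, jb = jarque_bera([-2.0, -1.0, 0.0, 1.0, 2.0])
reject_normality = jb > CHI2_CRIT_5PCT_2DF
print(f"S = {s:.3f}, K = {k:.3f}, JB = {jb:.4f}, reject normality: {reject_normality}")
```

The test is asymptotic, so a sample of five is far too small for real use; it only keeps the arithmetic easy to follow.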
Chapter 4
1. Why should we introduce the multiple regression model? Because multiple influences (i.e., variables) may affect the dependent variable.
2. The three-variable regression model
① The three-variable PRF in its nonstochastic form:
$$ E(Y_t) = B_1 + B_2 X_{2t} + B_3 X_{3t} $$
$E(Y_t)$ is the conditional mean value of $Y_t$, conditional upon the given or fixed values of the variables X2 and X3. We obtain the average, or mean, value of Y for the fixed values of the X variables.
② The three-variable PRF in its stochastic form:
$$ Y_t = B_1 + B_2 X_{2t} + B_3 X_{3t} + u_t = E(Y_t) + u_t $$
Any individual Y value can be expressed as the sum of two components:
a. A systematic, or deterministic, component $(B_1 + B_2 X_{2t} + B_3 X_{3t})$, which is simply its mean value $E(Y_t)$.
b. $u_t$, which is the nonsystematic, or random, component determined by factors other than X2 and X3.
3. The meaning of partial regression coefficient
The regression coefficients B2 and B3 are known as partial regression, or partial slope, coefficients.
① B2 measures the change in the mean value of Y, E(Y), per unit change in X2, holding the value of X3 constant.
② Likewise, B3 measures the change in the mean value of Y per unit change in X3, holding the value of X2 constant.
Uniqueness
:特殊性质
In the multiple regression
model
在多元回归模型中
we
want to find out what part of the change in the
average value of Y
18
v1.0
可编辑可修改
can be
directly attributable to X2 and what part to X3.
p>
我们想要知道的
是
Y
均值的变动有多大比例“直接”来源于
X2
,多大比例“直接
”来源于
X3
。
A example
:
E
(
Y
t
)
< br>?
15
?
1.2
X
2
t
?
0.8
X
3
t
The meaning
of B2
B2= indicates that the
mean value of
Y decrease by
per
unit increase in
X2 when
X3 is held constant, in this example it is held
constant at the
value of 10.
B2
是斜率,表示当
X3
为常数时,
X2
每增加
1<
/p>
个单位,
Y
的均值将减少个单位—
—本例中,
X3
为常数
10
The meaning
of B3
Here the slope
coefficient B3= means that the mean value of Y
increase
by per unit increase in X3
when X2 is held constant. Here it is held
constant at the value of 5.
斜率
B3=
,表示
X2
为常量时,
X3
每增加
1
个单位,
Y
的平均
值增加个单位,
(这
里假设
X2
等于
5
)
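The "holding constant" reading of a partial slope can be verified by direct evaluation. This sketch assumes the example function reads E(Y) = 15 − 1.2·X2 + 0.8·X3, with the signs inferred from the words "decrease" and "increase" in the notes. The starting values 5 and 10 are the ones the notes mention; the function name `mean_y` is invented.

```python
def mean_y(x2, x3):
    """Conditional mean of Y in the example: E(Y) = 15 - 1.2*X2 + 0.8*X3."""
    return 15.0 - 1.2 * x2 + 0.8 * x3

# Raise X2 by one unit while X3 stays at 10
effect_x2 = mean_y(6.0, 10.0) - mean_y(5.0, 10.0)
# Raise X3 by one unit while X2 stays at 5
effect_x3 = mean_y(5.0, 11.0) - mean_y(5.0, 10.0)

print(f"effect of X2 (X3 fixed at 10): {effect_x2:+.1f}")
print(f"effect of X3 (X2 fixed at 5):  {effect_x3:+.1f}")
```

Because the model is linear, the same effects result whatever constant values are chosen for the other variable.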
4. In short, a partial regression coefficient reflects the (partial) effect of one explanatory variable on the mean value of the dependent variable when the values of the other explanatory variables included in the model are held constant.
5. Uniqueness: this unique feature of multiple regression enables us not only to include more than one explanatory variable in the model but also to "isolate" or "disentangle" the effect of each X variable on Y from the other X variables included in the model.
6. Assumptions of the multiple linear regression model
In order to estimate the regression coefficients of the multiple regression model, we will continue to operate within the framework of the classical linear regression model (CLRM) and use ordinary least squares (OLS) to estimate the coefficients.
1) The regression model is linear in the parameters and is correctly specified.
2) X2 and X3 are uncorrelated with the disturbance term u. If X2 and X3 are nonstochastic, this assumption is automatically fulfilled.
3) The error term u has a zero mean value: $E(u_i) = 0$.
4) Homoscedasticity: the variance of u is constant, $Var(u_i) = \sigma^2$.
5) No autocorrelation exists between the error terms $u_i$ and $u_j$: $Cov(u_i, u_j) = 0$, $i \neq j$.
6) No exact collinearity exists between X2 and X3: there is no exact linear relationship between the two explanatory variables.
7) The error term u follows the normal distribution with mean zero and variance σ²: $u_i \sim N(0, \sigma^2)$.
7. Why do we make assumptions? We make these assumptions to facilitate the development of the subject and to ensure that the OLS method can be used to estimate the model's parameters.
8. No multicollinearity: there is no exact linear relationship between the explanatory variables X2 and X3. This is the assumption of no collinearity, or no multicollinearity. No perfect collinearity means that a variable, say X2, cannot be expressed as an exact linear function of another variable.
9. Troublesome: this is one equation with two unknowns, and we need two (independent) equations to obtain unique estimates of B2 and B3 (we have only one A, but two B's to solve for). Even if we can obtain an estimate of A, there is no way to get individual estimates of B2 and B3 from the estimated A. We cannot assess the individual effects of X2 and X3 on Y. This is hardly surprising, for we really do not have two independent variables in the model.
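The equation that defines A did not survive in these notes. A hypothetical case of exact collinearity, assumed here purely for illustration, shows where a single combined coefficient A comes from and why B2 and B3 cannot be separated.

```latex
% Assumption for illustration: X_{3t} = 2 X_{2t} (exact collinearity)
\begin{align*}
Y_t &= B_1 + B_2 X_{2t} + B_3 X_{3t} + u_t \\
    &= B_1 + (B_2 + 2 B_3) X_{2t} + u_t \\
    &= B_1 + A X_{2t} + u_t, \qquad A = B_2 + 2 B_3
\end{align*}
```

Regressing Y on X2 gives an estimate of A, but infinitely many pairs (B2, B3) satisfy A = B2 + 2·B3.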
10. OLS principle: the OLS principle chooses the values of the unknown parameters in such a way that the residual sum of squares (RSS), $\sum e_t^2$, is as small as possible.
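For the three-variable model, minimizing the RSS leads to closed-form expressions in deviation form. The sketch below uses the standard textbook formulas as an assumption, since the notes do not show them. The data are invented and generated without error from E(Y) = 15 − 1.2·X2 + 0.8·X3, so the estimates should recover those coefficients. The function also refuses to proceed under exact collinearity, where the common denominator is zero.

```python
from statistics import mean

def ols_three_variable(y, x2, x3):
    """Return (b1, b2, b3) minimizing the RSS of y = b1 + b2*x2 + b3*x3 + e."""
    y_bar, x2_bar, x3_bar = mean(y), mean(x2), mean(x3)
    dy = [v - y_bar for v in y]
    d2 = [v - x2_bar for v in x2]
    d3 = [v - x3_bar for v in x3]
    s22 = sum(a * a for a in d2)
    s33 = sum(a * a for a in d3)
    s23 = sum(a * b for a, b in zip(d2, d3))
    sy2 = sum(a * b for a, b in zip(dy, d2))
    sy3 = sum(a * b for a, b in zip(dy, d3))
    den = s22 * s33 - s23 ** 2
    if abs(den) <= 1e-12 * s22 * s33:
        # Exact collinearity: the normal equations have no unique solution
        raise ValueError("X2 and X3 are exactly collinear")
    b2 = (sy2 * s33 - sy3 * s23) / den
    b3 = (sy3 * s22 - sy2 * s23) / den
    b1 = y_bar - b2 * x2_bar - b3 * x3_bar
    return b1, b2, b3

# Hypothetical data, generated without error from E(Y) = 15 - 1.2*X2 + 0.8*X3
x2 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x3 = [2.0, 1.0, 4.0, 3.0, 6.0, 8.0]
y = [15.0 - 1.2 * a + 0.8 * b for a, b in zip(x2, x3)]
b1, b2, b3 = ols_three_variable(y, x2, x3)
print(f"b1 = {b1:.3f}, b2 = {b2:.3f}, b3 = {b3:.3f}")
```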
11. BLUE: under the assumed conditions the OLS estimators are best linear unbiased estimators.
Each regression coefficient estimated by OLS is linear and unbiased: on average it coincides with the true value.
Among all such linear unbiased estimators, the OLS estimators have the least possible variance, so that the true parameter can be estimated more accurately than by competing linear unbiased estimators.
In short, the OLS estimators are efficient.
、
In two-
variable case we saw that r^2 measures the
goodness of fit of the
fitted sample
regression line (SRL) r^2
度量了样本回归直线(
SRL
)的拟合优度
13
、
In three-
variable case
,
We would like
to know the proportion of the total
variation in Y
(
e
?
yt2) explained
by X2 and X3 jointly.
在三变量模型中,我们
< br>t
用多元判定系数度量
X2
和<
/p>
X3
对应变量
Y
变动的联合解释比例
14
、
In multiple
regression model, R can be interpreted as the
degree of linear
association between Y
and all the X variables jointly.
15. Antique clock auction revisited (EViews)
Let Y = auction price, X2 = age of clock, X3 = number of bidders.
$$ \hat{Y}_i = -1336.049 + 12.7413 X_{2i} + 85.764 X_{3i} $$
se = (175.2725) (0.9123) (8.8019)
t = (−7.6226) (13.9653) (9.7437)
p = (0.0000)* (0.0000)* (0.0000)*
R² = 0.8906, F = 118.0585
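The reported t values can be cross-checked as the ratios of the coefficients to their standard errors. This sketch uses only the numbers printed above; small discrepancies are to be expected because the published figures are rounded.

```python
# Coefficients and standard errors as reported for the antique clock regression
coefs = [-1336.049, 12.7413, 85.764]
ses = [175.2725, 0.9123, 8.8019]
reported_t = [-7.6226, 13.9653, 9.7437]

# Under the zero null hypothesis, t = coefficient / standard error
t_ratios = [b / s for b, s in zip(coefs, ses)]

for name, t_calc, t_rep in zip(["intercept", "age", "bidders"], t_ratios, reported_t):
    print(f"{name:9s} computed t = {t_calc:8.4f}   reported t = {t_rep:8.4f}")
```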
16. Interpretation of the results: the slope coefficient of about 12.74 (b2) means that, holding other variables constant, if the age of the clock goes up by a year, the average price of the clock will go up by about $12.74.
17. The test of significance approach