Page 2: Date variable
If you have a format like ‘date1’ type
-STATA 10.x/11.x:
gen datevar = date(date1,"DMY", 2012)
format datevar %td /*For daily data*/
-STATA 9.x:
gen datevar = date(date1,"dmy", 2012)
format datevar %td /*For daily data*/
If you have a format like ‘date2’ type
-STATA 10.x/11.x:
gen datevar = date(date2,"MDY", 2012)
format datevar %td /*For daily data*/
-STATA 9.x:
gen datevar = date(date2,"mdy", 2012)
format datevar %td /*For daily data*/
If you have a format like ‘date3’ type
destring year month day, replace
gen datevar1 = mdy(month,day,year)
format datevar1 %td /*For daily data*/
If you have a format like ‘date4’ type
Page 3: Date variable (cont.)
If the original date variable is a string:
gen week= weekly(stringvar,"wy")
gen month= monthly(stringvar,"my")
gen quarter= quarterly(stringvar,"qy")
gen half = halfyearly(stringvar,"hy")
gen year= yearly(stringvar,"y")
If the components of the original date are in different numeric variables:
gen daily = mdy(month,day,year)
gen week = yw(year, week)
gen month = ym(year,month)
gen quarter = yq(year,quarter)
gen half = yh(year,half)
To extract days of the week (Monday, Tuesday, etc.) use the function dow():
gen dayofweek= dow(date)
Replace “date” with the date variable in your dataset. This will create the variable ‘dayofweek’, where 0 is ‘Sunday’, 1 is ‘Monday’, etc. (type help dow for more details).
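If you want readable day names rather than the 0–6 codes, one option (a sketch, not part of the original handout; the label name ‘dowlbl’ is arbitrary) is to attach value labels matching dow()’s coding:

```stata
* Value labels matching dow()'s coding (0 = Sunday ... 6 = Saturday)
label define dowlbl 0 "Sunday" 1 "Monday" 2 "Tuesday" 3 "Wednesday" 4 "Thursday" 5 "Friday" 6 "Saturday"
label values dayofweek dowlbl
```

After this, list and tabulate output will show the day names instead of the numeric codes.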
To specify a range of dates (or integers in general) you can use the tin() and twithin() functions. tin() includes the first and last date; twithin() does not. Use the format of the date variable in your dataset.
/* Make sure to set your data as time series before using tin/twithin */
NOTE: Remember to format the date variable accordingly. After creating it, type:
format datevar %t? /*Change ‘datevar’ with your date variable*/
Replace “?” with the correct format: w (weekly), m (monthly), q (quarterly), h (half-yearly), y (yearly).
Page 4: Date variable (example)
Time series data are data collected over time for a single variable or a group of variables. For this kind of data the first thing to do is to check the variable that contains the time or date range and make sure it is the one you need: yearly, monthly, quarterly, daily, etc.
The next step is to verify it is in the correct format. In the example below the time variable is stored in “date”, but it is a string variable, not a date variable. In Stata you need to convert this string variable to a date variable.*
A closer inspection of the variable shows that for the years 2000 onwards the format changes, so we need to create a new variable with a uniform format. Type the following:
use http://dss.princeton.edu/training/tsdata.dta
gen date1=substr(date,1,7)
gen datevar=quarterly(date1,"yq")
format datevar %tq
browse date date1 datevar
For more details type
help date
*Data source: Stock & Watson’s companion materials
Page 5: From daily/monthly date variable to quarterly
use "http://dss.princeton.edu/training/date.dta", clear
*Quarterly date from daily date
gen datevar=date(date2,"MDY", 2012) /*Date2 is a string date variable*/
format datevar %td
gen quarterly = qofd(datevar)
format quarterly %tq
*Quarterly date from monthly date
gen monthly = mofd(datevar) /*Monthly date from the daily date*/
format monthly %tm
gen quarterly2 = qofd(dofm(monthly))
format quarterly2 %tq
Page 6: From daily to weekly and getting yearly
use "http://dss.princeton.edu/training/date.dta", clear
gen datevar = date(date2, "MDY", 2012)
format datevar %td
* From daily to weekly
gen weekly = wofd(datevar)
format weekly %tw
* From daily to yearly
gen year1 = year(datevar)
* From quarterly to yearly (quarterly created with qofd() as on the previous page)
gen quarterly = qofd(datevar)
gen year2 = yofd(dofq(quarterly))
* From weekly to yearly
gen year3 = yofd(dofw(weekly))
Page 7: Setting the data as time series (PU/DSS/OTR)
Once you have the date variable in a ‘date format’ you need to declare your data as time series in order to use the time series operators. In Stata type:
tsset datevar /*Change ‘datevar’ with your date variable*/
If you have gaps in your time series (for example, there may not be data available for weekends), this complicates the analysis when using lags for those missing dates. In this case you may want to create a continuous time trend as follows:
gen time = _n
Then use it to set the time series:
tsset time
In the case of cross-sectional time series type:
sort panel date
by panel: gen time = _n
xtset panel time
Page 8: Filling gaps in time variables
Use the command tsfill to fill in the gaps in the time series. You need to tsset or xtset the data before using tsfill. In the example below:
tsset quarters
tsfill
Type help tsfill for more details.
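With panel (cross-sectional time series) data, tsfill can also balance the panel. A sketch, reusing the panel and time variables from the xtset example above:

```stata
* Fill gaps within each panel; the full option also makes the panel balanced
xtset panel time
tsfill, full
```

Without the full option, tsfill only fills gaps between each panel’s first and last observation.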
Page 9: Subsetting with tin/twithin
With tsset (time series set) you can use two time series functions: tin (‘times in’, from a to b, inclusive) and twithin (‘times within’, strictly between a and b, excluding both a and b). If you have yearly data just include the years.

list datevar unemp if tin(2000q1,2000q4)

        datevar      unemp
173.     2000q1   4.033333
174.     2000q2   3.933333
175.     2000q3          4
176.     2000q4        3.9

list datevar unemp if twithin(2000q1,2000q4)

        datevar      unemp
174.     2000q2   3.933333
175.     2000q3          4
/* Make sure to set your data as time series before using tin/twithin */
tsset date
regress y x1 x2 if tin(01jan1995,01jun1995)
regress y x1 x2 if twithin(01jan2000,01jan2001)
Page 10: Merge/Append
See http://dss.princeton.edu/training/Merge101.pdf
Page 11: Lag operators (lag)
Another set of time series operators are the lags, leads, differences and seasonal operators. It is common to analyze the impact of previous values on current ones.
To generate lagged values use the “L” operator:
generate unempL1=L1.unemp
generate unempL2=L2.unemp
list datevar unemp unempL1 unempL2 in 1/5
In a regression you could type:
regress y x L1.x L2.x
or regress y x L(1/5).x
Page 12: Lag operators (forward)
To generate forward or lead values use the “F” operator:
generate unempF1=F1.unemp
generate unempF2=F2.unemp
In a regression you could type:
regress y x F1.x F2.x
or regress y x F(1/5).x
Page 13: Lag operators (difference)
To generate the difference between current and previous values use the “D” operator:
generate unempD1=D1.unemp /* D1 = y(t) - y(t-1) */
generate unempD2=D2.unemp /* D2 = (y(t) - y(t-1)) - (y(t-1) - y(t-2)) */
list datevar unemp unempD1 unempD2 in 1/5
In a regression you could type:
regress y x D1.x D2.x
Page 14: Lag operators (seasonal)
To generate seasonal differences use the “S” operator:
generate unempS1=S1.unemp /* S1 = y(t) - y(t-1) */
generate unempS2=S2.unemp /* S2 = y(t) - y(t-2) */
list datevar unemp unempS1 unempS2 in 1/5
In a regression you could type:
regress y x S1.x S2.x
Page 15: Correlograms: autocorrelation
To explore autocorrelation, which is the correlation between a variable and its previous values, use the command corrgram. The number of lags depends on theory, an AIC/BIC procedure or experience. The output includes the autocorrelation coefficients and the partial autocorrelation coefficients used to specify an ARIMA model.

corrgram unemp, lags(12)

LAG       AC       PAC        Q    Prob>Q
  1   0.9641    0.9650    182.2    0.0000
  2   0.8921   -0.6305    339.02   0.0000
  3   0.8045    0.1091    467.21   0.0000
  4   0.7184    0.0424    569.99   0.0000
  5   0.6473    0.0836    653.86   0.0000
  6   0.5892   -0.0989    723.72   0.0000
  7   0.5356   -0.0384    781.77   0.0000
  8   0.4827    0.0744    829.17   0.0000
  9   0.4385    0.1879    868.5    0.0000
 10   0.3984   -0.1832    901.14   0.0000
 11   0.3594   -0.1396    927.85   0.0000
 12   0.3219    0.0745    949.4    0.0000
(The text-based [Autocorrelation] and [Partial Autocor] bar columns of the output are omitted here.)
AC shows that the correlation between the current value of unemp and its value three quarters ago is 0.8045. AC can be used to define the q in MA(q), but only in stationary series.
PAC shows that the correlation between the current value of unemp and its value three quarters ago is 0.1091, without the effect of the two previous lags. PAC can be used to define the p in AR(p), but only in stationary series.
The Box-Pierce Q statistic tests the null hypothesis that all correlations up to lag k are equal to 0. This series shows significant autocorrelation: the Prob>Q values at every k are less than 0.05, rejecting the null that all lag correlations are zero.
A graph of the AC shows a slow decay in the trend, suggesting non-stationarity. See also the ac command.
A graph of the PAC does not show spikes after the second lag, which suggests that all other lags are mirrors of the second lag. See the pac command.
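The AC and PAC graphs described above can be drawn with the ac and pac commands (a sketch; the lag count simply mirrors the corrgram example):

```stata
ac unemp, lags(12)   /* autocorrelation graph with confidence bands */
pac unemp, lags(12)  /* partial autocorrelation graph */
```

Spikes outside the shaded confidence bands indicate statistically significant (partial) autocorrelations.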
Page 16: Correlograms: cross correlation
To explore the relationship between two time series use the command xcorr. The graph below shows the correlation between the GDP quarterly growth rate and unemployment. When using xcorr, list the independent variable first and the dependent variable second:

xcorr gdp unemp, lags(10) xlabel(-10(1)10,grid)

[Cross-correlogram graph: cross-correlations of gdp and unemp at lags -10 to 10]
xcorr gdp unemp, lags(10) table

 LAG     CORR
 -10  -0.1080
  -9  -0.1052
  -8  -0.1075
  -7  -0.1144
  -6  -0.1283
  -5  -0.1412
  -4  -0.1501
  -3  -0.1578
  -2  -0.1425
  -1  -0.1437
   0  -0.1853
   1  -0.1828
   2  -0.1685
   3  -0.1177
   4  -0.0716
   5  -0.0325
   6  -0.0111
   7  -0.0038
   8   0.0168
   9   0.0393
  10   0.0419
At lag 0 there is a negative immediate correlation between the GDP growth rate and unemployment. This means that a drop in GDP causes an immediate increase in unemployment.
Page 17: Correlograms: cross correlation (cont.)

xcorr interest unemp, lags(10) xlabel(-10(1)10,grid)

[Cross-correlogram graph: cross-correlations of interest and unemp at lags -10 to 10]
xcorr interest unemp, lags(10) table

 LAG    CORR
 -10  0.3297
  -9  0.3150
  -8  0.2997
  -7  0.2846
  -6  0.2685
  -5  0.2585
  -4  0.2496
  -3  0.2349
  -2  0.2323
  -1  0.2373
   0  0.2575
   1  0.3095
   2  0.3845
   3  0.4576
   4  0.5273
   5  0.5850
   6  0.6278
   7  0.6548
   8  0.6663
   9  0.6522
  10  0.6237
Interest rates have a positive effect on the future level of unemployment, reaching the highest point at lag 8 (eight quarters, or two years). In this case, interest rates are positively correlated with unemployment rates eight quarters later.
Page 18: Lag selection
Too many lags could increase the error in the forecasts; too few could leave out relevant information.*
Experience, knowledge and theory are usually the best way to determine the number of lags needed. There are, however, information criterion procedures to help come up with a proper number. Three commonly used criteria are Schwarz's Bayesian information criterion (SBIC), the Akaike information criterion (AIC), and the Hannan-Quinn information criterion (HQIC). All three are reported by the command varsoc in Stata:

varsoc gdp cpi, maxlag(10)
When all three agree, the selection is clear, but what happens when you get conflicting results? A paper from the CEPR suggests, in the context of VAR models, that AIC tends to be more accurate with monthly data, HQIC works better for quarterly data on samples over 120, and SBIC works fine with any sample size for quarterly data (on VEC models).** In our example above we have quarterly data with 182 observations, so HQIC suggests a lag of 4 (which is also suggested by AIC).
* See Stock & Watson for more details and on how to estimate BIC and SIC.
** Ivanov, V. and Kilian, L. 2001. 'A Practitioner's Guide to Lag-Order Selection for Vector Autoregressions'. CEPR Discussion Paper no. 2685. London: Centre for Economic Policy Research. http://www.cepr.org/pubs/dps/DP2685.asp
Page 19: Unit root
Having a unit root in a series means that there is more than one trend in the series. As an illustration, run the same regression over two subsamples and compare the coefficients:

regress unemp gdp if tin(1982q1,2000q4)
regress unemp gdp if tin(1965q1,1981q4)
(regression output omitted)
Page 21: Unit root test
The Dickey-Fuller test is one of the most commonly used tests for stationarity. The null hypothesis is that the series has a unit root. The test statistic shows that the unemployment series has a unit root: it lies within the acceptance region.
One way to deal with stochastic trends (unit roots) is to take the first difference of the variable (second test below).
dfuller unemp, lag(5)

Augmented Dickey-Fuller test for unit root        Number of obs = 187
                 ---------- Interpolated Dickey-Fuller ----------
          Test      1% Critical    5% Critical    10% Critical
        Statistic      Value          Value           Value
Z(t)     -2.597        -3.481         -2.884          -2.574
MacKinnon approximate p-value for Z(t) = 0.0936   --> Unit root

dfuller unempD1, lag(5)

Augmented Dickey-Fuller test for unit root        Number of obs = 186
                 ---------- Interpolated Dickey-Fuller ----------
          Test      1% Critical    5% Critical    10% Critical
        Statistic      Value          Value           Value
Z(t)     -5.303        -3.481         -2.884          -2.574
MacKinnon approximate p-value for Z(t) = 0.0000   --> No unit root
Page 22: Cointegration
Cointegration refers to the fact that two or more series share a stochastic trend (Stock & Watson). Engle and Granger (1987) suggested a two-step process to test for cointegration (an OLS regression and a unit root test), the EG-ADF test.
Run a unit root test on the residuals:

Augmented Dickey-Fuller test for unit root        Number of obs = 181
                 ---------- Interpolated Dickey-Fuller ----------
          Test      1% Critical    5% Critical    10% Critical
        Statistic      Value          Value           Value
Z(t)     -2.535        -3.483         -2.885          -2.575
MacKinnon approximate p-value for Z(t) = 0.1071

The residuals have a unit root, so the two variables are not cointegrated.
See Stock & Watson for a table of critical values for the unit root test and the theory behind it.
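The two steps of the EG-ADF test can be sketched in Stata as follows (the variables unemp and gdp are taken from the earlier examples; the residual variable name is arbitrary):

```stata
* Step 1: OLS regression of one series on the other
regress unemp gdp
* Step 2: unit root test on the residuals
predict resid1, residuals
dfuller resid1, lag(5)
```

If the residuals are stationary (no unit root), the two series are cointegrated.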
Page 23: Granger causality: using OLS
If you regress ‘y’ on lagged values of ‘y’ and ‘x’ and the coefficients of the lags of ‘x’ are statistically significantly different from 0, then you can argue that ‘x’ Granger-causes ‘y’; that is, ‘x’ can be used to predict ‘y’ (see Stock & Watson 2007, Greene 2008).

regress unemp L(1/4).unemp L(1/4).gdp
(regression output omitted)

You cannot reject the null hypothesis that all coefficients of the lags of ‘gdp’ are equal to 0; therefore ‘gdp’ does not Granger-cause ‘unemp’.
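The joint hypothesis above can be checked with the test command right after the regression (a sketch; the lag range simply matches the regression):

```stata
regress unemp L(1/4).unemp L(1/4).gdp
test L1.gdp L2.gdp L3.gdp L4.gdp  /* H0: all gdp lag coefficients are 0 */
```

A large p-value for the F test means you cannot reject the null, i.e. no evidence that gdp Granger-causes unemp.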