Wednesday, December 30, 2015

The Econometric Game, 2016

I like to think of The Econometric Game as the World Championship of Econometrics.

There have been 17 annual Econometric Games to date, and some of these have been featured previously in this blog. For instance, in 2015 there were several posts, such as this one. You'll find links in that post to earlier posts for other years.

I also discussed the cases that formed the basis for the 2015 competition here.

In 2016, the 18th Econometric Game will be held at the University of Amsterdam between 6 and 8 April.

The competing teams will be representing the following universities:

Requests I Ignore

About six months ago I wrote a post titled, "Readers' Forum Page".

Part of my explanation for the creation of the page was as follows:

Tuesday, December 29, 2015

Job Market for Economics Ph.D.'s

In a post in today's Inside Higher Ed, Scott Jaschik discusses the latest annual jobs report from the American Economic Association.

He notes:
"A new report by the American Economic Association found that its listings for jobs for economics Ph.D.s increased by 8.5 percent in 2015, to 3,309. Academic jobs increased to 2,458, from 2,290. Non-academic jobs increased to 846 from 761." 
(That's an 11.1% increase for non-academic jobs, and a 7.3% increase for academic positions.)

The bounce-back in demand for graduates since 2008 is impressive:
"Economics, like most disciplines, took a hit after 2008. Between then and 2010, the number of listings fell to 2,285 from 2,914. But this year's 3,309 is greater not only than the 2008 level, but of every year from 2001 on. The number of open positions also far exceeds the number of new Ph.D.s awarded in economics."
And here's the really good news for readers of this blog:
"As has been the case in recent years, the top specialization in job listings is mathematical and quantitative methods."

© 2015, David E. Giles

Monday, December 28, 2015

Correlation Isn't Necessarily Transitive

If X is correlated with Y, and Y is correlated with Z, does it follow that X and Z are correlated?

No, not necessarily. That is, the relationship of correlation isn't necessarily transitive.

In a blog post from last year the Fields Medallist, Terence Tao, discusses the question: "When is Correlation Transitive?", and provides a thorough mathematical answer.

He also provides this simple example of correlation intransitivity: 
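In R, you can generate an example of this type for yourself. (This little script is my own construction, so the details may differ from Tao's illustration.)

    # Correlation isn't transitive: X and Z are independent,
    # but Y = X + Z is correlated with both of them.
    set.seed(123)
    n <- 10000
    x <- rnorm(n)
    z <- rnorm(n)
    y <- x + z

    cor(x, y)   # about 0.71 (1/sqrt(2), in theory)
    cor(y, z)   # about 0.71
    cor(x, z)   # about 0 - X and Z are uncorrelated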


This is something for students of econometrics to keep in mind!


© 2015, David E. Giles

Sunday, December 27, 2015

Bounds for the Pearson Correlation Coefficient

The correlation measure that students typically first encounter is actually Pearson's product-moment correlation coefficient. This coefficient is simply a standardized version of the covariance between two random variables (say, X and Y):

           ρ_XY = cov.(X,Y) / [s.d.(X) s.d.(Y)] ,                                                  (1)

where "s.d." denotes "standard deviation".

In the case of sample data, this formula will be:

          ρ_XY = Σ[(X_i - X*)(Y_i - Y*)] / {[Σ(X_i - X*)^2][Σ(Y_i - Y*)^2]}^(1/2) ,                 (2)

where the summations run from 1 to n (the sample size); and X* and Y* are the sample averages of the X and Y variables.
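Incidentally, formula (2) is just what R's built-in cor() function computes. Here's a quick check, with some artificial data:

    # Verify formula (2) against R's cor() function.
    set.seed(42)
    n <- 50
    x <- rnorm(n)
    y <- 2 * x + rnorm(n)

    num   <- sum((x - mean(x)) * (y - mean(y)))
    denom <- sqrt(sum((x - mean(x))^2) * sum((y - mean(y))^2))
    num / denom   # formula (2)
    cor(x, y)     # the same value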

Scaling the covariance in this way to create the correlation coefficient ensures that (i) the latter is unitless; and (ii) it takes values in the interval [-1, +1]. The first of these two properties facilitates meaningful comparisons of correlations involving data measured in different units. The second property provides a metric that enables us to think about the "degree" of correlation in a meaningful way. (In contrast, a covariance can take any real value - there are no upper or lower bounds.)

Result (i) above is obvious. Result (ii) can be established in a variety of ways.

(a)  If you're familiar with the Cauchy-Schwarz inequality, the result that -1 ≤ ρ ≤ 1 is immediate.

(b)  If you like working with vectors, then it's easy to show that ρ is the cosine of the angle between the two vectors of observations on X and Y, measured as deviations from their means. As cos(θ) is bounded below by -1 and above by +1 for any θ, we have our result for the range of ρ right away. See this post by Pat Ballew for access to the proof.

(c)  However, what about a proof that requires even less background knowledge? Suppose that you're a student who knows how to solve for the roots of a quadratic equation, and who knows a couple of basic results relating to variances. Then, proving that  -1 ≤ ρ ≤ 1 is still straightforward:

Let Z = X + tY, for any scalar, t. Note that var.(Z) = t^2 var.(Y) + 2t cov.(X,Y) + var.(X) ≥ 0.

Or, using obvious notation, at^2 + bt + c ≥ 0.

Because this quadratic in t is non-negative for every t, it must have either one real root or no real roots, and this in turn implies that b^2 - 4ac ≤ 0.

Recalling that a = var.(Y); b = 2cov.(X,Y); and c = var.(X), the inequality b^2 - 4ac ≤ 0 becomes 4cov.(X,Y)^2 ≤ 4var.(X)var.(Y). Dividing through by 4var.(X)var.(Y) gives ρ^2 ≤ 1, and hence -1 ≤ ρ ≤ 1.

A complete version of this proof is provided by David Darmon, here.


© 2015, David E. Giles

Saturday, December 26, 2015

Gretl Update

The Gretl econometrics package is a great resource that I've blogged about from time to time. It's free to all users, yet it's of very high quality.

Recently, I heard from Riccardo (Jack) Lucchetti - one of the principals of Gretl. He wrote:
"In the past, you had some nice words on Gretl, and we are grateful for that.
Your recent post on HEGY made me realise that you may not be totally aware of the recent developments in the gretl ecosystem: we now have a reasonably rich and growing array of "addons". Of course, being a much smaller project than, say, R, you shouldn't expect anything as rich and diverse as CRAN, but we, the core team, are quite pleased of the way things have been shaping up."
The HEGY post that Jack is referring to is here, and he's quite right - I haven't been keeping up sufficiently with some of the developments at the Gretl project. 

There are now around 100 published Gretl "addons", or "function packages". You can find a list of those currently supported here. By way of example, these packages include ones as diverse as Heteroskedastic I.V. Probit; VECM for I(2) Analysis; and the Moving Blocks Bootstrap for Linear Panels.

If you go to this link you'll be able to download the Gretl Function Package Guide. This will tell you everything you want to know about using function packages in Gretl, and it also provides the information that you need if you're thinking of writing and contributing a package yourself.

Congratulations to Jack and to Allin Cottrell for their continuing excellent work in making Gretl available to all of us!


© 2015, David E. Giles

Tuesday, December 22, 2015

End-of-Year Reading

Wishing all readers a very special holiday season!

  • Agiakloglou, C., and C. Agiropoulos, 2016. The balance between size and power in testing for linear association for two stationary AR(1) processes. Applied Economics Letters, 23, 230-234.
  • Allen, D., M. McAleer, S. Peiris, and A. K. Singh, 2015. Nonlinear time series and neural-network models of exchange rates between the US dollar and major currencies. Discussion Paper No. 15-125/III, Tinbergen Institute.
  • Basu, D., 2015. Asymptotic bias of OLS in the presence of reverse causality. Working Paper 2015-18, Department of Economics, University of Massachusetts, Amherst.
  • Giles, D. E., 2005. Testing for a Santa Claus effect in growth cycles. Economics Letters, 87, 421-426.
  • Kim, J. and I. Choi, 2015. Unit roots in economic and financial time series: A re-evaluation based on enlightened judgement. MPRA Paper No. 68411.
  • Triacca, U., 2015. A pitfall in using the characterization of Granger non-causality in vector autoregressive models. Econometrics, 3, 233-239.       


© 2015, David E. Giles

Wednesday, December 9, 2015

Seasonal Unit Root Testing in EViews

When we're dealing with seasonal data - e.g., quarterly data - we need to distinguish between "deterministic seasonality" and "stochastic seasonality". The first type of seasonality is what we try to remove when we "seasonally adjust" the series. It's also what we're trying to account for when we include seasonal dummy variables in a regression model.

On the other hand, "stochastic seasonality" refers to unit roots at the seasonal frequencies. This is a whole different issue, and it's been well researched in the time-series econometrics literature.

This distinction is similar to that between a "deterministic trend" and a "stochastic trend" in annual data. The former can be removed by "de-trending" the series, but the latter refers to a unit root (at the zero frequency).
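To see what stochastic seasonality looks like, here's a small R simulation of a quarterly series with a unit root at the seasonal frequencies - that is, a seasonal random walk, y_t = y_(t-4) + ε_t:

    set.seed(2015)
    n <- 200
    e <- rnorm(n)
    y <- numeric(n)
    for (t in 5:n) y[t] <- y[t - 4] + e[t]
    y <- ts(y, frequency = 4)

    plot(y)                   # the seasonal pattern itself "wanders" over time
    plot(diff(y, lag = 4))    # fourth-differencing renders the series stationary

Seasonal dummy variables won't fix a series like this: the seasonal pattern isn't fixed, so it can't be captured by deterministic terms.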

The most widely used procedure for testing for seasonal unit roots is that proposed by Hylleberg et al. (HEGY) (1990), and extended by Ghysels et al. (1994).

In my graduate-level time-series course we always look at stochastic seasonality. Recently, Nicolas Ronderos has written a new "Add-in" for EViews to make it easy to implement the HEGY testing procedure (see here). This will certainly save some coding for EViews users.  

Of course, stochastic seasonality can also arise in the case of monthly data - this is really messy - see Beaulieu and Miron (1993). In the case of half-yearly data, the necessary theoretical framework and critical values are developed and illustrated by Feltham and Giles (2003).

And if you have unit roots at the seasonal frequencies in two or more time-series, you might also have seasonal cointegration. The seminal contribution relating to this is by Engle et al. (1993), and a short empirical application is provided by Reinhardt and Giles (2001).

I plan to illustrate the application of seasonal unit root and cointegration tests in a future blog post.

(Also, note the comment from Jack Lucchetti, below, that draws attention to a HEGY addon for Gretl, written by Ignacio Diaz Emparanza.)

References

Beaulieu, J. J., and J. A. Miron, 1993. Seasonal unit roots in aggregate U.S. data. Journal of Econometrics, 55, 305-328.

Engle, R. F., C. W. J. Granger, S. Hylleberg, and H. S. Lee, 1993. Seasonal cointegration: The Japanese consumption function. Journal of Econometrics, 55, 275-298.

Feltham, S. G. and D. E. A. Giles, 2003. Testing for unit roots in semi-annual data. In D. E. A. Giles (ed.), Computer-Aided Econometrics. Marcel Dekker, New York, 175-208. (Pre-print here.)

Ghysels, E., H. S. Lee, and J. Noh, 1994. Testing for unit roots in seasonal time series: Some theoretical extensions and a Monte Carlo investigation. Journal of Econometrics, 62, 415-442.

Hylleberg, S., R. F. Engle, C. W. J. Granger, and B. S. Yoo, 1990. Seasonal integration and cointegration. Journal of Econometrics, 44, 215-238.

Reinhardt, F. S. and D. E. A. Giles, 2001. Are cigarette bans really good economic policy? Applied Economics, 33, 1365-1368. (Pre-print here.)


© 2015, David E. Giles

Friday, December 4, 2015

Linear Regression and Treatment Effect Heterogeneity

I received an email from Tymon Słoczyński (Warsaw School of Economics), about a recent paper of his, titled, "New Evidence on Linear Regression and Treatment Effect Heterogeneity". Tymon wrote:
"I have recently written a new paper, which I believe that you might find interesting, given some of your blog posts that I have read. 
This paper is available here (as an IZA DP No. 9491): http://ftp.iza.org/dp9491.pdf; or from my website: http://akson.sgh.waw.pl/~tslocz/Sloczynski_paper_regression.pdf. 
This paper implicitly criticizes the standard approach in reduced-form applied microeconomics to use very simple linear models and estimate them using OLS (or 2SLS). I provide a new interpretation of the least squares estimand in the constant-effects linear regression model when the assumption of constant effects is violated (so there is, in fact, "treatment effect heterogeneity"). This new interpretation is very pessimistic: in particular, I prove that the weight that is being placed by OLS on the effect on each group ("treated" or "controls") is inversely related to the proportion of this group. This property might have severe consequences for applied work, and I demonstrate this via a replication of two recent papers from the American Economic Review."
Tymon's paper is, indeed, very interesting. I recommend that you read it. It should serve as a 'wake-up call' to some of our empirical micro. friends!

© 2015, David E. Giles

Sunday, November 15, 2015

November Reading

Somewhat belatedly, here is some suggested reading for this month:
  • Al-Sadoon, M. M., 2015. Testing subspace Granger causality. Barcelona GSE Working Paper Series, Working Paper nº 850.
  • Droumaguet, M., A. Warne, & T. Wozniak, 2015. Granger causality and regime influence in Bayesian Markov-switching VAR's. Department of Economics, University of Melbourne. 
  • Foroni, C., P. Guerin, & M. Marcellino, 2015. Using low frequency information for predicting high frequency variables. Working Paper 13/2015, Norges Bank.
  • Hastie, T., R. Tibshirani, & J. Friedman, 2009. The Elements of Statistical Learning: Data Mining, Inference, and Prediction (2nd. ed.). Springer, New York. (Legitimate download.) 
  • Hesterberg, T. C., 2015. What teachers should know about the bootstrap: Resampling in the undergraduate statistics curriculum. American Statistician, in press. 
  • Quineche, R. & G. Rodríguez, 2015. Data-dependent methods for the lag selection in unit root tests with structural change. Documento de Trabajo No. 404, Departamento de Economía, Pontificia Universidad Católica del Perú.


© 2015, David E. Giles

Friday, October 16, 2015

New Forecasting Blog

Allan Gregory, at Queen's University (Canada) has just started a new blog that concentrates on economic forecasting. You can find it here.

In introducing his new blog, Allan says:
"The goal is to discuss, compare and even evaluate alternative methods and tools for forecasting economic activity in Canada. I hope others involved in the business of forecasting will share their work, opinions and so on in this forum. Hopefully, we can understand the interaction of forecasting theory and practical forecasting."
This is a blog that you should follow. I'm looking forward to Allan's upcoming posts.

© 2015, David E. Giles

Tuesday, October 13, 2015

Angus Deaton, Consumer Demand, & the Nobel Prize

I was delighted by yesterday's announcement that Angus Deaton has been awarded the Nobel Prize in Economic Science this year. His contributions have been many, fundamental, and varied, and I certainly won't attempt to summarize them here. Suffice to say that the official citation says that the award is "for his contributions to consumption, poverty, and welfare".

In this earlier post I made brief mention of Deaton's path-breaking work, with John Muellbauer, that gave us the so-called "Almost Ideal Demand System". 

The AIDS model took empirical consumer demand analysis to a new level. It facilitated more sophisticated, and less restrictive, econometric analysis of consumer demand behaviour than had been possible with earlier models. The latter included the fundamentally important Linear Expenditure System (Stone, 1954), and the Rotterdam Model (Barten, 1964; Theil, 1965).

I thought that readers may be interested in an empirical exercise with the AIDS model. Let's take a look at it. 
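To fix ideas before we do, here's a rough sketch in R of what estimating a single AIDS budget-share equation involves, using the popular "linear approximate" (LA) version with a Stone price index. The data below are simulated, purely for illustration; in a serious application we'd use real data, and impose the adding-up, homogeneity, and symmetry restrictions across the full system of share equations.

    # LA-AIDS sketch, two goods, simulated data. Budget share for good 1:
    #   w1 = alpha1 + gamma11*log(p1) + gamma12*log(p2) + beta1*log(x/P),
    # with log(P) approximated by the Stone index, w1*log(p1) + w2*log(p2).
    set.seed(1)
    n   <- 200
    lp1 <- rnorm(n); lp2 <- rnorm(n)        # log prices
    lx  <- 5 + rnorm(n)                     # log total expenditure
    w1  <- 0.4 + 0.05 * lp1 - 0.03 * lp2 + 0.02 * (lx - 5) + rnorm(n, sd = 0.01)
    lP  <- w1 * lp1 + (1 - w1) * lp2        # Stone index, from observed shares
    summary(lm(w1 ~ lp1 + lp2 + I(lx - lP)))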

Sunday, October 11, 2015

Lies, Damned Lies, & Cointegration

My thanks to a colleague for bringing to my attention a recent discussion paper with the provocative title, "Why Most Published Results on Unit Root and Cointegration are False". 

As you can imagine, I couldn't resist it!

After a quick read (and a couple of deep breaths), my first reaction was to break one of my self-imposed blogging rules, and pull the paper apart at the seams. 

The trouble is, the paper is so outrageous in so many ways, that I just wasn't up to it. Instead, I'm going to assign it to students in my Time-Series Econometrics course to critique. They have more patience than I do!

The authors make sweeping claims that certain theoretical results are undermined by one poorly implemented piece of (their own) empiricism. 

They provide no serious evidence that I could find to support the bold claim made in the title of their paper.

We are left with a concluding section containing remarks such as:
"In summary, three analogies between cointegration analysis and a sandcastle may be appropriate. First, a sandcastle may be built on sand, so it falls down because the foundation is not solid. Second, a sandcastle may be badly built. Third, a sandcastle built on seashore with a bad design may stay up but will not withstand the ebb and flow of the tides. The cointegration analysis, like a sandcastle, collapses on all three counts. In several planned research publications, we will report the criticism of research outcomes (results) and the methods employed to obtain such results. Below we provide one example why a research finding using the methodology of cointegration analysis to be false." (pp.11-12)
and:
"In the name of science, cointegration analysis has become a tool to justify falsehood -- something that few people believe to be true but is false. We recommend that except for a pedagogical review of a policy failure of historical magnitude, the method of cointegration analysis not be used in any public policy analysis." (p.14)
The most positive thing I can say is: I can't wait for the promised follow-up papers!


© 2015, David E. Giles

Sunday, October 4, 2015

Cointegration & Granger Causality

Today, I had a query from a reader of this blog regarding cointegration and Granger causality. 

Essentially, the email said:
"I tested two economic time-series and found them to be cointegrated. However, when I then tested for Granger  causality, there wasn't any. Am I doing something wrong?"
First of all, the facts:

  • If two time series, X and Y, are cointegrated, there must exist Granger causality either from X to Y, or from Y to X, or in both directions.
  • The presence of Granger causality in either or both directions between X and Y does not necessarily imply that the series will be cointegrated. (The sketch below illustrates this second point.)
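Here's a small R simulation that makes the second point concrete. The regressor x is a random walk, and the changes in y respond to lagged changes in x, so x Granger-causes y; but the levels of the two series are not cointegrated:

    set.seed(99)
    n  <- 500
    dx <- rnorm(n)
    x  <- cumsum(dx)                       # x is I(1)
    dy <- 0.8 * c(0, dx[-n]) + rnorm(n)    # dy_t = 0.8*dx_(t-1) + noise
    y  <- cumsum(dy)                       # y is I(1); y - b*x is I(1) for any b

    # Granger test "by hand": do lagged changes in x help to predict dy?
    d <- data.frame(dy = dy[-1], dy1 = dy[-n], dx1 = dx[-n])
    anova(lm(dy ~ dy1, data = d), lm(dy ~ dy1 + dx1, data = d))   # strongly rejects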
Now, what about the question that was raised?

Truthfully, not enough information has been supplied for anyone to give a definitive answer.
  1. What is the sample size? Even if applied properly, tests for Granger non-causality have only asymptotic validity (unless you bootstrap the test).
  2. How confident are you that the series are both I(1), and that you should be testing for cointegration in the first place?
  3. What is the frequency of the data, and have they been seasonally adjusted? This can affect the unit root tests, cointegration test, and Granger causality test.
  4. How did you test for cointegration - the Engle-Granger 2-step approach, or via Johansen's methodology?
  5. How did you test for Granger non-causality? Did you use a modified Wald test, as in the Toda-Yamamoto approach?
  6. Are there any structural breaks in either of the time-series? These will likely affect any or all of the tests that you have performed.
  7. Are you sure that you correctly specified the VAR model used for the causality testing, and the VAR model on which Johansen's tests are based (if you used his methodology to test for cointegration)?
The answers to some or all of these questions will contain the key to why you obtained an apparently illogical result.

Theoretical results in econometrics rely on assumptions/conditions that have to be satisfied. If they're not, then don't be surprised by the empirical results that you obtain.


© 2015, David E. Giles

Friday, October 2, 2015

Illustrating Spurious Regressions

I've talked about spurious regressions in some earlier posts (here and here). I was updating an example for my time-series course the other day, and I thought that some readers might find it useful.

Let's begin by reviewing what is usually meant when we talk about a "spurious regression".

In short, it arises when we have several non-stationary time-series variables, which are not cointegrated, and we regress one of these variables on the others.

In general, the results that we get are nonsensical, and the problem only gets worse if we increase the sample size. This phenomenon was observed by Granger and Newbold (1974), among others, and Phillips (1986) developed the asymptotic theory that he then used to prove that in a spurious regression the Durbin-Watson statistic converges in probability to zero; the OLS parameter estimators and R^2 converge to non-standard limiting distributions; and the t-ratios and F-statistic diverge in distribution, as T ↑ ∞.

Let's look at some of these results associated with spurious regressions. We'll do so by means of a simple simulation experiment.
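Here's the essence of it in R. We generate two completely independent random walks, and regress one on the other:

    set.seed(1234)
    T <- 500
    y <- cumsum(rnorm(T))   # a pure random walk
    x <- cumsum(rnorm(T))   # another one, independent of the first

    fit <- lm(y ~ x)
    summary(fit)            # typically a "significant" slope and a sizeable R^2

    e <- resid(fit)
    sum(diff(e)^2) / sum(e^2)   # the Durbin-Watson statistic - close to zero

Re-run this with different seeds and larger samples: the nonsense "significance" doesn't go away; it gets worse.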

Thursday, October 1, 2015

What NOT To Do When Data Are Missing

Here's something that's very tempting, but it's not a good idea.

Suppose that we want to estimate a regression model by OLS. We have a full sample of size n for the regressors, but one of the values for our dependent variable, y, isn't available. Rather than estimate the model using just the (n - 1) available data-points, you might think that it would be preferable to use all of the available data, and impute the missing value for y.

Fair enough, but what imputation method are you going to use?

For simplicity, and without any loss of generality, suppose that the model has a single regressor,
             
                y_i = β x_i + ε_i ,                                                                       (1)

and it's the n-th value of y that's missing. We have values for x_1, x_2, ...., x_n; and for y_1, y_2, ...., y_(n-1).

Here's a great idea! OLS will give us the Best Linear Predictor of y, so why don't we just estimate (1) by OLS, using the available (n - 1) sample values for x and y; use this model (and x_n) to get a predicted value (y*_n) for y_n; and then re-estimate the model with all n data-points: x_1, x_2, ...., x_n; y_1, y_2, ...., y_(n-1), y*_n.

Unfortunately, this is actually a waste of time. Let's see why.
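Before working through the algebra, here's a quick numerical check in R, using the no-intercept model (1):

    set.seed(7)
    n <- 30
    x <- rnorm(n)
    y <- 2 * x + rnorm(n)        # pretend that y[n] is the missing value

    fit1  <- lm(y[-n] ~ x[-n] - 1)         # OLS on the (n - 1) complete pairs
    ystar <- coef(fit1) * x[n]             # impute y_n by its OLS prediction
    fit2  <- lm(c(y[-n], ystar) ~ x - 1)   # re-estimate with all n points

    coef(fit1); coef(fit2)       # identical!

The imputed observation lies exactly on the fitted line, so it adds nothing to the normal equations: the coefficient estimate is completely unchanged.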

Wednesday, September 30, 2015

Reading List for October

Some suggestions for the coming month:

© 2015, David E. Giles

Friday, September 25, 2015

Thanks, Dan!

Quite out of the blue, Dan Getz kindly sent me a nice LaTeX version of my hand-written copy of "The Solution", given in my last post.

Dan used the ShareLaTeX site - https://www.sharelatex.com/ .

So, here's a nice pdf file with The Solution.

Thanks a million, Dan - that was most thoughtful of you!

© 2015, David E. Giles

Tuesday, September 22, 2015

The Solution

You can find a solution to the problem posed in yesterday's post here.

I hope you can read my writing!

p.s.: Dan Getz kindly supplied a LaTeX version - here's the pdf file. Thanks, Dan!

© 2015, David E. Giles

Monday, September 21, 2015

Try This Problem

Here's a little exercise for you to work on:

We know from the Gauss-Markov Theorem that within the class of linear and unbiased estimators, the OLS estimator is most efficient. Because it is unbiased, it therefore has the smallest possible Mean Squared Error (MSE) within the linear and unbiased class of estimators. However, there are many linear estimators which, although biased, have a smaller MSE than the OLS estimator. You might then think of asking:
“Why don’t I try and find the linear estimator that has the smallest possible MSE?”
(a) Show that attempting to do this yields an “estimator” that can’t actually be used in practice.

(You can do this using the simple linear regression model without an intercept, although the result generalizes to the usual multiple linear regression model.)

(b) Now, for the simple regression model with no intercept, 

         y_i = β x_i + ε_i       ;     ε_i ~ i.i.d. [0 , σ^2] ,

find the linear estimator, β* , that minimizes the quantity:

h[Var.(β*) / σ^2] + (1 - h)[Bias(β*) / β]^2 , for 0 < h < 1.

Is  β* a legitimate estimator, in the sense that it can actually be applied in practice?

The answer will follow in a subsequent post.


© 2015, David E. Giles

Tuesday, September 1, 2015

September Reading List

  • Abeln, B. and J. P. A. M. Jacobs, 2015. Seasonal adjustment with and without revisions: A comparison of X-13ARIMA-SEATS and CAMPLET. CAMA Working Paper 25/2015, Crawford School of Public Policy, Australian National University.
  • Chan, J. C. C. and A. L. Grant, 2015. A Bayesian model comparison for trend-cycle decompositions of output. CAMA Working Paper 31/2015, Crawford School of Public Policy, Australian National University.
  • Chen, K. and K-S. Chan, 2015. A note on rank reduction in sparse multivariate regression. Journal of Statistical Theory and Practice, in press.
  • Fan, Y., S. Pastorello, and E. Renault, 2015. Maximization by parts in extremum estimation. Econometrics Journal, 18, 147-171.
  • Horowitz, J., 2014. Variable selection and estimation in high-dimensional models. Cemmap Working Paper CWP35/15, Institute of Fiscal Studies, Department of Economics, University College London.
  • Larson, W., 2015. Forecasting an aggregate in the presence of structural breaks in the disaggregates. RPF Working Paper No. 2015-002, Research Program on Forecasting, Center of Economic Research, George Washington University.


© 2015, David E. Giles

Wednesday, August 26, 2015

Biased Estimation of Marginal Effects

I began a recent post with the comment:
"One thing that a lot of practitioners seem to be unaware of (or they choose to ignore it) is that in many of the common situations where we use regression analysis to estimate elasticities, these estimators are biased.
And that's true even if all of the conditions needed for the coefficient estimator (e.g., OLS) to be unbiased are fully satisfied."
Exactly the same point can be made in respect of estimated marginal effects, and that's what this post is about.

Tuesday, August 25, 2015

The Distribution of a Ratio of Correlated Normals

Suppose that the random variables X_1 and X_2 are jointly distributed as bivariate Normal, with means of θ_1 and θ_2, variances of σ_1^2 and σ_2^2 respectively, and a correlation coefficient of ρ.

In this post we're going to be looking at the distribution of the ratio, W = (X_1 / X_2).

You probably know that if X_1 and X_2 are independent standard normal variables, then W follows a Cauchy distribution. This will emerge as a special case in what follows.

The more general case that we're concerned with is of interest to econometricians for several reasons.
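Before we get to that, the special case just mentioned is easy to check by simulation:

    # The ratio of two independent standard normals is standard Cauchy.
    set.seed(11)
    w <- rnorm(1e5) / rnorm(1e5)

    q <- c(-5, -1, 0, 1, 5)
    rbind(empirical = ecdf(w)(q),   # empirical CDF of the simulated ratios
          cauchy    = pcauchy(q))   # standard Cauchy CDF - they match closely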

Monday, August 24, 2015

The Bias of Certain Elasticity Estimators

In a recent post I discussed some aspects of estimating elasticities from regression models, and the interpretation of these values. That discussion should be kept in mind in reading what follows.

One thing that a lot of practitioners seem to be unaware of (or they choose to ignore it) is that in many of the common situations where we use regression analysis to estimate elasticities, these estimators are biased.

And that's true even if all of the conditions needed for the coefficient estimator (e.g., OLS) to be unbiased are fully satisfied.

Let's look at some common situations leading to the estimation of elasticities and marginal effects, and see if we can summarize what's going on.

Thursday, August 20, 2015

Econometric Society World Congress

The Econometric Society holds a World Congress every five years. Right now, the 2015 Congress is taking place in Montréal, Canada.

Here's the full program. Enjoy!


© 2015, David E. Giles

Wednesday, August 12, 2015

Classic Data Visualizations

My thanks to Veronica Johnson at Investech.com for drawing my attention to a recent piece of theirs relating to Classic Data Visualizations.

As they say:
"A single data visualization graphic can be priceless. It can save you hours of research. They’re easy to read, interpret, and, if based on the right sources, accurate, as well.  And with the highly social nature of the web, the data can be lighthearted, fun and presented in so many different ways. 
What’s most striking about data visualizations though is that they aren’t as modern a concept as we tend to think they are. 
In fact, they go back to more than 2,500 years—before computers and tools for easy visual representation of data even existed."
Here are the eleven graphics that they highlight:

Tuesday, August 11, 2015

Symmetry and Skewness

After taking your first introductory course in statistics you probably agreed wholeheartedly with the following statement:
"A statistical distribution is symmetric if and only if it is not skewed."
After all, isn't that how we define "skewness"?

In fact, that statement is incorrect. There are distributions which have a skewness coefficient of zero, but are asymmetric.
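Here's a quick numerical illustration, using a three-point distribution of my own construction:

    # Asymmetric, yet the skewness coefficient is exactly zero.
    x <- c(-2, 1, 3)
    p <- c(0.4, 0.5, 0.1)

    mu <- sum(p * x)              # mean = 0
    m3 <- sum(p * (x - mu)^3)     # third central moment = 0, so skewness = 0
    c(mean = mu, third.moment = m3)

The distribution is plainly not symmetric, but its third central moment (and hence the usual skewness coefficient) is exactly zero.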

Before considering more examples of this phenomenon, let's take a closer look at the meaning of "skewness" in the statistical context.

Friday, August 7, 2015

The H-P Filter and Unit Roots

The Hodrick-Prescott (H-P) filter is widely used for trend removal in economic time-series, and as a basis for business cycle analysis, etc. I've posted about the H-P filter before (e.g., here).

There's a widespread belief that application of the H-P filter will not only isolate the deterministic trend in a series, but it will also remove stochastic trends - i.e., unit roots. For instance, you'll often hear that if the H-P filter is applied to quarterly data, the filtered series will be stationary, even if the original series is integrated of order up to 4.

Is this really the case?

Let's take a look at two classic papers relating to this topic, and a very recent one that provides a bit of an upset.
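If you want to experiment for yourself in the meantime, the filter is easy to code directly: the trend, τ, solves (I + λD'D)τ = y, where D is the second-difference matrix. Here's a minimal R version, applied to a simulated random walk:

    set.seed(3)
    T <- 200
    y <- cumsum(rnorm(T))                 # an I(1) series

    lambda <- 1600                        # the conventional quarterly value
    D <- diff(diag(T), differences = 2)   # (T-2) x T second-difference matrix
    trend <- solve(diag(T) + lambda * crossprod(D), y)
    cycle <- y - trend

    plot.ts(cbind(y, trend, cycle))       # is the "cycle" really stationary?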

Thursday, August 6, 2015

Estimating Elasticities, All Over Again

I had some interesting email from Andrew a while back to do with computing elasticities from log-log regression models, and some related issues.

In his first email, Andrew commented:
"I am interested in the elasticity of H with respect to W, e.g., hours with respect to wages. For simplicity, assume that W is randomly assigned, and that the elasticity is identical for everyone.
Standard practice would be to regress log(H) on a constant and log(W). The coefficient on log(W) then seems to be the elasticity, as it estimates d log(H) / d log(W).
But changes in log( ) are only equal to changes in percent in the limit as the changes go to zero. In practice, one typically uses discrete data. Because the changes in W may be large, the resulting coefficient is just a first order approximation of the elasticity, and is not identical to the true elasticity."
Let's focus on the third paragraph. Keep in mind that log( ), here, refers to "natural" (base 'e') logarithms.

Andrew is quite correct, and this is something that we often overlook when teaching econometrics, or when interpreting someone's regression results. I sometimes refer students to this useful piece by Kenneth Benoit. Here's a key extract from p.4:

Tuesday, August 4, 2015

August reading

Here's my (slightly delayed) August reading list:

  • Ahelegbey, D. F., 2015. The econometrics of networks: A review. Working Paper 2015/13, Department of Economics, University of Venice.
  • Clemens, M. A., 2015. The meaning of failed replications: A review and proposal. IZA Discussion Paper No.9000.
  • Fair, R. C., 2015. Information limits of aggregate data. Discussion Paper No. 2011, Cowles Foundation, Yale University.
  • Phillips, P. C. B., 2015. Inference in near singular regression. Discussion Paper No. 2009, Cowles Foundation, Yale University.
  • Stock, J. H. and M. W. Watson, 2015. Core inflation and trend inflation. NBER Working Paper 21282.
  • Ullah, A. and X. Zhang, 2015. Grouped model averaging for finite sample size. Working paper, Department of Economics, University of California, Riverside.


© 2015, David E. Giles

Thursday, July 16, 2015

Questions About the Size and Power of a Test

Osman, a reader of this blog, sent a comment in relation to my recent post on the effects of temporal aggregation on t-tests, and the like. Rather than just bury it, with a short response, in the "Comments" section of that post, I thought I'd give it proper attention here.

The comment read as follows:
"Thank you for this illustrative example. My question is not exactly related to the subject of your post. As you illustrated, the finite sample properties of tests are studied by investigating the size and power properties. You reported size distortions to assess the size properties of the test. My first question is about the level of the size distortions. How much distortions is need to conclude that a test is useless? Is there an interval that we can construct around a nominal size value to gauge the significance of distortions? Same type of questions can also be relevant for the power properties. The “size adjusted power” is simply rejection rates obtained when the DGP satisfies an alternative hypothesis. Although, the power property is used to compare alternative tests, we can still ask question regarding to the level of the power. As your power curve shows, the level of power also depends on the parameter value assumed under the alternative hypothesis. For example, when β= 0.8 the power is around 80% which means that the false null is rejected 80 times out of 100 times. Again, the question is that what should be the level of the power to conclude that the test has good finite sample properties?"
Let's look at Osman's questions.

Thursday, July 9, 2015

'Student', on Kurtosis

W. S. Gosset (Student) provided this useful aid to help us remember the difference between platykurtic and leptokurtic distributions:

[Student's sketch: a platypus for the platykurtic case, and two leaping kangaroos for the leptokurtic one.]
('Student', 1927. Errors of routine analysis. Biometrika, 19, 151-164. See p. 160.)

Here, β_2 is the fourth standardized moment of the distribution about its mean. The Normal distribution has β_2 = 3.

The appropriate definition of "kurtosis" for uni-modal distributions has been the subject of considerable discussion in the statistical literature. Should it be based on the characteristics of the tail of the distribution; the shape of the density around its mode; or both? 

© 2015, David E. Giles

Wednesday, July 8, 2015

Parallel Computing for Data Science

Hot off the press, Norman Matloff's book, Parallel Computing for Data Science: With Examples in R, C++ and CUDA  (Chapman and Hall/ CRC Press, 2015) should appeal to a lot of the readers of this blog.

The book's coverage is clear from the following chapter titles:

1. Introduction to Parallel Processing in R
2. Performance Issues: General
3. Principles of Parallel Loop Scheduling
4. The Message Passing Paradigm
5. The Shared Memory Paradigm
6. Parallelism through Accelerator Chips
7. An Inherently Statistical Approach to Parallelization: Subset Methods
8. Distributed Computation
9. Parallel Sorting, Filtering and Prefix Scan
10. Parallel Linear Algebra
Appendix - Review of Matrix Algebra 

The Preface makes it perfectly clear what this book is intended to be, and what it is not intended to be. Consider these passages:

Tuesday, June 30, 2015

July Reading

Now that the (Northern) summer is here, you should have plenty of time for reading. Here are some recommendations:
  • Ahelegbey, D. F., 2015. The econometrics of networks: A review. Working Paper 13/WP/2015, Department of Economics, University of Venice.
  • Camba-Mendez, G., G. Kapetanios, F. Papailias, and M. R. Weale, 2015. An automatic leading indicator, variable reduction and variable selection methods using small and large datasets: Forecasting the industrial production growth for Euro area economies. Working Paper No. 1773, European Central Bank.
  • Cho, J. S., T-H. Kim, and Y. Shin, 2015. Quantile cointegration in the autoregressive distributed-lag modeling framework. Journal of Econometrics, 188, 281-300.
  • De Luca, G., J. R. Magnus, and F. Peracchi, 2015. On the ambiguous consequences of omitting variables. EIEF Working Paper 05/15.
  • Gozgor, G., 2015. Causal relation between economic growth and domestic credit in the economic globalization: Evidence from the Hatemi-J's test. Journal of International Trade and Economic Development,  24, 395-408.
  • Panhans, M. T. and J. D. Singleton, 2015. The empirical economist's toolkit: From models to methods. Working Paper 2015-03, Center for the History of Political Economy.
  • Sanderson, E. and F. Windmeijer, 2015. A weak instrument F-test in linear IV models with multiple endogenous variables. Discussion Paper 15/644, Department of Economics, University of Bristol.
© 2015, David E. Giles

Monday, June 29, 2015

The Econometrics of Temporal Aggregation - VI - Tests of Linear Restrictions

This post is one of several related posts. The previous ones can be found here, here, here, here and here. These posts are based on Giles (2014).

Many of the statistical tests that we perform routinely in econometrics can be affected by the level of aggregation of the data. Here, let's focus on time-series data and on temporal aggregation. I'm going to show you some preliminary results from work that I have in progress with Ryan Godwin. These results relate to one particular test, but the work covers a variety of testing problems.

I'm not supplying the EViews program code that was used to obtain the results below - at least, not for the moment. That's because the results that I'm reporting are based on work in progress. Sorry!

As in the earlier posts, let's suppose that the aggregation is over "m" high-frequency periods. A lower case symbol will represent a high-frequency observation on a variable of interest; and an upper-case symbol will denote the aggregated series.

So,
               Y_t = y_t + y_(t-1) + ...... + y_(t-m+1) .

If we're aggregating monthly (flow) data to quarterly data, then m = 3. In the case of aggregation from quarterly to annual data, m = 4, etc.

Now, let's investigate how such aggregation affects the performance of standard tests of linear restrictions on the coefficients of an OLS regression model. The simplest example would be a t-test of the hypothesis that one of the coefficients is zero. Another example would be the F-test of the hypothesis that all of the "slope" coefficients in such a regression model are zero.

Consider the following simple Monte Carlo experiment, based on 20,000 replications.
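The full design isn't reproduced here, but a stripped-down version of this type of experiment is easy to set up in R. In the illustrative DGP below, the null hypothesis (a zero slope coefficient) is true, both variables are AR(1) processes, and we compare the t-test's empirical rejection rates before and after aggregating over m = 3 periods. (I've used 2,000 replications, to keep the run time down.)

    set.seed(20150629)
    nrep <- 2000; n <- 120; m <- 3
    rej <- matrix(NA, nrep, 2, dimnames = list(NULL, c("high.freq", "aggregated")))
    for (r in 1:nrep) {
      x <- arima.sim(list(ar = 0.9), n)     # AR(1) regressor
      y <- arima.sim(list(ar = 0.9), n)     # AR(1) "dependent" variable; beta = 0
      X <- colSums(matrix(x, nrow = m))     # temporal aggregation
      Y <- colSums(matrix(y, nrow = m))
      rej[r, 1] <- summary(lm(y ~ x))$coefficients[2, 4] < 0.05
      rej[r, 2] <- summary(lm(Y ~ X))$coefficients[2, 4] < 0.05
    }
    colMeans(rej)    # empirical "sizes" - compare these with the nominal 5%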

Monday, June 22, 2015

"Readers' Forum" Page

I've added a new page to this blog - it's titled "Readers' Forum". You'll see the tab for it in the bar just above the top post on the page you're reading now.

Some explanation is in order.

What?
The Readers' Forum is intended to be a "clearing house" for questions and requests relating to econometrics.

As I say in the preamble to the new page,
"Please feel free to use the "Comment" facility below to provide questions and answers relating to Econometrics. I won't be able to answer all questions myself, but other readers may be able to help. The Forum will be "lightly moderated" to avoid spam and inappropriate content."
Please note the italicized passage above.

Why?
Every day, readers post (or attempt to post) lots of "comments" on the various posts on this blog. In many cases, these are not comments at all - they're questions, requests for assistance, and the like.

All comments are moderated - I get to give them the "O.K." before they actually appear. That's just fine for genuine comments. Unless they're spam or contain inappropriate content, I invariably "approve" them right away.

However, in the case of questions and requests I prefer to delay approval and post a response simultaneously. Regrettably, this has meant that, increasingly, there is often a delay in getting this out there. 

Sometimes, the requests are, quite frankly, unreasonable. I won't go into the details here, but let's just say that there's a difference between a blog and a free consulting service.

I also try to be very careful when it's clear that the request comes from a student. I certainly don't want to get into a situation where I'm "getting between" that student and their instructor/supervisor.

In short, I like to try and be helpful, but I can't keep up with the demand. I do have a job!

Hopefully, the new forum will free up some time for me to focus on substantive posts, while still providing an opportunity for discussion.


© 2015, David E. Giles

Friday, June 12, 2015

Econometrics Videos

The Royal Economic Society (publisher of The Econometrics Journal) has recently released a video of invited addresses by Alfred Galichon and Jeremy Lise, in the special session on “Econometrics of Matching” at the 2015 RES Conference.

This video joins similar ones from previous RES conferences, these being:

  • “Large Dimensional Models”,
  • “Heterogeneity”,
  • “Econometrics of Forecasting”,
  • “Nonparametric Identification”
This link will take you to all of these videos.

Happy viewing!


© 2015, David E. Giles

Specification Testing in the Ordered Probit Model

Readers of this blog will know I'm a proponent of more specification testing in the context of Logit, Probit, and related models. For instance, see this recent post, and the links within it.

I received an email from Paul Johnson, Chair of the Department of Economics at Vassar College, today. He wrote:
"I thought that, given your interest in specification tests in probit etc. models, you might find the attached paper of mine (written some years ago) to be useful as it expounds the straightforward generalization of the Bera, et al. (1984) test to the ordered probit case."
Paul's paper is, indeed, very interesting and I hadn't seen it before. It's titled, "A Test of the Normality Assumption in the Ordered Probit Model", and it appeared in the statistics journal, Metron (1996, LIV, 213-221).

Here's the abstract:
"This paper presents an easily implemented test of the assumption of a normally distributed error term for the ordered probit model, As this assumption is the central maintained hypothesis in all estimation and testing based on this model, the test ought to serve as a key specification test in applied research. A small Monte Carlo experiment suggests that the test has good size and power properties."
A year later, a closely related paper by P. Glewwe appeared in Econometric Reviews. That author doesn't mention Paul's paper, but these things happen. Glewwe's paper does take things a little further than Paul does, by allowing for censoring of the data.

© 2015, David E. Giles

Tuesday, June 9, 2015

Worrying About my Cholesterol Level

The headline, "Don't Get Wrong Idea About Cholesterol", caught my attention in the 3 May, 2015 Times-Colonist newspaper here in Victoria, B.C. In fact the article came from a syndicated column, published about a week earlier. No matter - it's always a good time for me to worry about my cholesterol!

The piece was written by a certain Dr. Gifford-Jones (AKA Dr. Ken Walker).

Here's part of what he had to say:

Saturday, June 6, 2015

Statistical Calculations & Numerical Accuracy

This post is for those readers who're getting involved with economic statistics for the first time. Basically, it serves as a warning that sometimes the formulae that you learn about have to be treated with care when it comes to the actual numerical implementation.

Sometimes (often) there's more than one way to express the formula for some statistic. While these formulae may be mathematically identical, they can yield different numerical results when you go to apply them. Yes, this sounds counter-intuitive, but it's true. And it's all to do with the numerical precision that your calculator (computer) is capable of.

The example I'll give is a really simple one. However, the lesson carries over to more interesting situations. For instance, the inversion of matrices that we encounter when applying the OLS estimator is a case in point. When you fit a regression model using several different statistics/econometrics computer packages, you sometimes get slightly different results. This is because the packages can use different numerical methods to implement the algebraic results that you're familiar with.

For me, the difference between a "good" package and a "not so good" package isn't so much the range of fancy techniques that each offers at the press of a few keys. It's more to do with what's going on "under the hood". Do the people writing the code know how to make that code (a) numerically accurate; and (b) numerically robust to different data scenarios?
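Here's a classic illustration of the sort of thing I mean, involving the sample variance. The "calculator" formula, [Σx_i^2 - n(x*)^2]/(n - 1), is algebraically identical to Σ(x_i - x*)^2/(n - 1), but when the data have a large mean relative to their spread, the first version can be wrecked by catastrophic cancellation:

    x <- c(1e8 + 4, 1e8 + 7, 1e8 + 13, 1e8 + 16)   # true variance is 30
    n <- length(x)

    (sum(x^2) - n * mean(x)^2) / (n - 1)   # "calculator" formula - unreliable here
    sum((x - mean(x))^2) / (n - 1)         # two-pass formula - returns 30
    var(x)                                 # R uses a numerically stable method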

Thursday, June 4, 2015

Logit, Probit, & Heteroskedasticity

I've blogged previously about specification testing in the context of Logit and Probit models. For instance, see here and here

Testing for homoskedasticity in these models is especially important, for reasons that are outlined in those earlier posts. I won't repeat all of the details here, but I'll just note that heteroskedasticity renders the MLE of the parameters inconsistent. (This stands in contrast to the situation in, say, the linear regression model, where heteroskedasticity leaves the MLE of the coefficients consistent, though inefficient.)

If you're an EViews user, you can find my code for implementing a range of specification tests for Logit and Probit models here. These include the LM test for homoskedasticity that was proposed by Davidson and MacKinnon (1984).

More than once, I've been asked the following question:
"When estimating a Logit or Probit model, we set the scale parameter (variance) of the error term to the value one, because it's not actually identifiable. So, in what sense can we have heteroskedasticity in such models?"
This is a good question, and I thought that a short post would be justified. Let's take a look:
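The short answer is that setting the variance to one is just a normalization. Heteroskedasticity in this setting means that the scale of the latent error varies with the covariates - e.g., s.d.(ε_i) = exp(z_iγ) - and that is identifiable, up to the usual normalization. Here's a small simulation (the DGP is purely illustrative) showing the damage done when we ignore it:

    set.seed(6)
    n <- 20000
    x <- rnorm(n)

    # Homoskedastic latent error: the standard probit MLE does fine.
    y0 <- as.numeric(x + rnorm(n) > 0)                       # true beta = 1
    # Heteroskedastic latent error: its s.d. depends on x.
    y1 <- as.numeric(x + rnorm(n, sd = exp(0.8 * x)) > 0)    # true beta = 1

    coef(glm(y0 ~ x, family = binomial(link = "probit")))["x"]   # close to 1
    coef(glm(y1 ~ x, family = binomial(link = "probit")))["x"]   # noticeably off

Increasing n doesn't help with the second fit: the MLE is converging, but not to the truth.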

Tuesday, June 2, 2015

June Reading List


  • Andrews, I. and T. B. Armstrong, 2015. Unbiased instrumental variables estimation under known first-stage sign. Cowles Foundation Discussion Paper No. 1984R, Yale University.
  • Bajari, P., D. Nekipelov, S. P. Ryan, and M. Yang, 2015. Demand estimation with machine learning and model combination. NBER Working Paper 20955.
  • Chambers, M. J., 2015. A jackknife correction to a test for cointegration rank. Econometrics, 3, 355-375.
  • Mazeu, J. H. G., E. Ruiz and H. Veiga, 2015. Model uncertainty and the forecast accuracy of ARMA models: A survey. UC3M Working Paper 15-08, Statistics and Econometrics, Universidad Carlos III de Madrid. 
  • Paldam, M., 2015. Meta-analysis in a nutshell: Techniques and general findings. Economics, 9, 2015-11.
  • Triacca, U., 2015. A pitfall in using the characterization of Granger non-causality in vector autoregressive models. Econometrics, 3, 233-239.

© 2015, David E. Giles

Thursday, May 28, 2015

The Replication Network

Some of my previous posts have dealt with the issue of replicability - e.g., here and here.

I had an email from Bob Reed today, alerting me to his involvement in the launching of The Replication Network. I knew that this was a big interest of Bob's, and it's great to see where things are going with this important venture.

Here's Bob's email:
"I would like to invite you to consider joining The Replication Network (TRN). TRN is a website/community of scholars committed to promoting the use of replications in economics.  For those of you who have seen an earlier version of the website, please note that the site has been thoroughly revamped in preparation for its “rollout” in June 2015.  I think you will like what you see.  You can check it out at :http://replicationnetwork.com/. For researchers interested in publishing replication studies, the website provides up-to-date information about (i) which journals encourage replication submissions, and (ii) a list of recently published replication studies that can serve as models of good practice.  It also lists News and Events related to replications. Membership has its privileges. :-)  Members can post their research on the website and are invited to contribute guest blogs related to replications.  Members will also receive email updates every few months alerting them to new content on the website. There is no financial cost to join TRN, you will not be bothered with advertising or other spam, and you can easily unsubscribe if you later choose to.  The main reason I encourage you to join TRN is because there is strength in numbers.  If journals can see that there is a substantial readership of academics who support replication research and are interested in seeing it promoted, they may choose to change their editorial policies to include the publishing of replication studies.  The lack of publishing outlets is the single biggest obstacle to the wider use of replication in economics. 
Thank you for giving this some thought. And feel free to forward this email to your colleagues. 
If you have any questions do not hesitate to contact me."

I hope that readers will check this out and spread the word.


© 2015, David E. Giles

Sunday, May 24, 2015

John Nash, 1928 - 2015

John & Alicia Nash
Tragically, John and Alicia Nash died as the result of a road accident in New Jersey yesterday.

Just days previously, Nash was the co-recipient of the 2015 Abel Prize for his contributions to the theory of nonlinear partial differential equations. He was the only person to be awarded both the Abel Prize and a Nobel Prize (Economics, 1994).

It would be impertinent of me to try and comment meaningfully, here, on Nash's contributions to Game Theory.

I was, however, taken by one comment on Twitter this morning:

[Embedded tweet from Ferdinando, remarking on Nash's Ph.D. dissertation.]
Ferdinando is referring to Nash's Ph.D. dissertation, "Non-Cooperative Games", completed at Princeton University in May of 1950. Yes, it was just 27 pages long. One of the only two references was to von Neumann and Morgenstern's classic 1944 book, Theory of Games and Economic Behavior. The other was to Nash's own paper, published in the Proceedings of the National Academy of Sciences in 1950. It spanned just two pages, but was actually less than one page long!

Yes, sometimes it really is the case that, "Less is more". (Ludwig Mies van der Rohe)



(Suggested reading: "How Long Should My Thesis Be?".)


© 2015, David E. Giles

Friday, May 22, 2015

Maximum Likelihood Estimation & Inequality Constraints

This post is prompted by a question raised by Irfan, one of this blog's readers, in some email correspondence with me a while back.

The question was to do with imposing inequality constraints on the parameter estimates when applying maximum likelihood estimation (MLE). This is something that I always discuss briefly in my graduate econometrics course, and I thought that it might be of interest to a wider audience.

Here's the issue.
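To anticipate just a little: one standard device is to reparameterize the problem so that the constraint is satisfied automatically, and then maximize the likelihood without constraints. A toy example in R, imposing σ > 0 in a normal likelihood by writing σ = exp(φ):

    set.seed(8)
    y <- rnorm(100, mean = 5, sd = 2)

    negll <- function(par) {              # par = (mu, phi), with sigma = exp(phi)
      mu    <- par[1]
      sigma <- exp(par[2])                # positive for any real phi
      -sum(dnorm(y, mu, sigma, log = TRUE))
    }
    opt <- optim(c(0, 0), negll)
    c(mu = opt$par[1], sigma = exp(opt$par[2]))   # sigma > 0 by construction

An interval constraint, say 0 < θ < 1, can be imposed in the same spirit with a logistic transformation.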

Wednesday, May 20, 2015

A Pleasant Surprise

I could scarcely believe my good fortune when I opened the following email earlier today:

Dear Dr Giles,

Congratulations, your paper “Being ‘in’ assessment: The ontological layer(ing) of assessment practice” published in Journal of Applied Research in Higher Education has been selected by the journal’s editorial team as the Outstanding Paper of 2014.

We aim to increase dissemination of such a high quality article as much as possible and aim to promote your paper by making it freely available for one year. I will confirm once the free access has gone live so that you will be able to let others know. This will be in the next couple of weeks.

As a winner you will receive a certificate. Where possible, we like to organise for you to be presented with your certificate in person. We will be in touch with you shortly (next few weeks) in the hope that we can organize a presentation.

Again, many congratulations on your award. We will be in touch with you regarding our plans to promote and present your award very soon. Please do not respond to this mail asking about when your chapter will be made freely available or about possible presentations. I will be in touch soon with all the details!!

Best regards,
Jim Bowden
Academic Relations Manager | Emerald Group Publishing Limited 
Tel: +44 (0)1274 785013 | Fax: +44 (0)1274 785200

Now, I must confess that I had some difficulty recalling the details of what I'd written to deserve this unexpected honour. But it must have been pretty darned good!

So, to refresh my memory I took a quick look at the 2014 volume of the journal in question. (Bookmark this link for future free access!)

Sure enough, there it was - not the lead article, but close enough (apparently):


For reasons that I simply can't fathom, our library doesn't subscribe to this journal. For reasons that I definitely can fathom, neither do I! The promised free access has not yet "gone live", so I'll have to spare you the pleasure of a replication of the full text of the article here.

However, by way of recompense, here's the abstract:

Abstract:

Purpose – Current discourses on educational assessment focus on the priority of learning. While this intent is invariably played out in classroom practice, a consideration of the ontological nature of assessment practice opens understandings which show the experiential nature of “being in assessment”. The purpose of this paper is to discuss these issues.

Design/methodology/approach – Using interpretive and hermeneutic analyses within a phenomenological inquiry, experiential accounts of the nature of assessment are worked for their emergent and ontological themes.

Findings – These stories show the ontological nature of assessment as a matter of being in assessment in an embodied and holistic way.

Originality/value – Importantly, the nature of a teacher's way-of-being matters to assessment practices. Implications exist for teacher educators and teacher education programmes in relation to the priority of experiential stories for understanding assessment practice, the need for re-balancing a concern for professional knowledge and practice with a students’ way of being in assessment, and the pedagogical implications of evoking sensitivities in assessment.



I can hardly believe I wrote that!

(I have replied to Mr. Bowden, expressing my gratitude but suggesting that a certain David Giles in the School of Education at Flinders University in Australia may be even more pleased to hear from him than I was.)


© 2015, David E. Giles