Sorry, but your number of time points is too large for xtdpdml to handle, especially with more periods than companies.

I am running analyses on a panel dataset with country-quarter data. I am thinking of using xtgee with f1.DV (the DV led one year forward, so that every predictor is measured in the previous year), with family(nbinomial) and corr(ar 1).
My question is: given that I am using corr(ar 1), is it correct to also include the lag of the DV as a control? Thank you for your prompt reply.

In trying to solve the above problem I came across your paper on xtdpdml. I get the same error even with code as simple as xtset countryid timeid followed by xtdpdml DV IV, where both variables are time-varying.

Rename it.

What about a static panel data model with a lagged independent variable? By definition this is not a dynamic panel.
For example, a previous decision will influence the present decision, and so on. Does a lagged independent variable also violate strict exogeneity? Why and how?

This is an incredibly illuminating post, with implications for a great deal of research in my field, epidemiology. The primary effect of interest is the association between the exposure and outcomes measured at the same time.
I have attempted to implement your method in R using the dpm package, which replicates your xtdpdml as a front-end to lavaan (SEM model below this comment). Given the potential of your methods in medical research, it would be great if there were a straightforward way of producing easily interpreted effects for categorical outcomes, such as RRs, marginal probabilities, etc.
Is it sensible to talk about marginal effects from regressions fitted via SEM? If the answer to 1.

However, to my knowledge, pooled OLS and random effects models require that the composite error term be uncorrelated with the vector of regressors, but this is not required for the -fe- specification, where the panel-level error is actually allowed to be correlated with the regressors.
Wooldridge explains this in his textbook on panel data. In this case I assume lagged response variables are less of an issue, since our ultimate aim is to minimize MSE regardless of bias in the coefficient estimates.

Allison, does the caution about lagged dependent variables apply to count models using the within-between approach with the menbreg command in Stata? If so, what would I use instead of menbreg?
Thank you for your help.

Yes, you should not put a lagged dependent variable in an menbreg model, even with the between-within method.

Do you have the estimation method that you describe using the gsem command written up anywhere, since I am not able to take your course at this time?
Also, in my situation (count data using within-between and menbreg), would it work to have all the independent variables lead by one year (so that each predictor is measured one year prior), keeping the dependent variable without a lead or lag?
As for your proposed method, I doubt that it would appropriately solve the problem.

Thanks to you, Enrique, and Rich for working on this and implementing it in Stata, Paul. Question: what about the two-time-point situation where the nesting factor is not the individual but the group they belong to? Most people would think of this as a type of cross-sectional multilevel model, but since they have a pre score at t1 they want to control for it. Does your advice not to include a lagged DV in the model hold for this scenario?
Yes, you need more waves.

Paul, thanks for this post. So this should make all multilevel models problematic if there are predictors at Level 1. Am I understanding you correctly?

Well, yes, u_i could be affecting many time-varying predictors. But u_i is necessarily correlated with y_it, so we know for sure that the model violates a key assumption.

I found this post really helpful for understanding why to include group means at Level 2 and to group-mean-center, whenever sensible, at Level 1.
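The group-mean-centering idea mentioned above can be sketched as follows. This is a hypothetical illustration in Python with made-up data (in practice this would be done in Stata or R): each time-varying predictor x_it is split into a person-specific mean, entered at Level 2, and a deviation from that mean, entered at Level 1.

```python
# Within-between ("hybrid") decomposition of a Level-1 predictor:
# split each x_it into a between part (the person mean, entered at
# Level 2) and a within part (the group-mean-centered deviation,
# entered at Level 1). Hypothetical data, for illustration only.
from collections import defaultdict

rows = [  # (person_id, t, x)
    (1, 1, 2.0), (1, 2, 4.0), (1, 3, 6.0),
    (2, 1, 1.0), (2, 2, 1.0), (2, 3, 4.0),
]

# Person-specific means of x (the "between" component)
sums = defaultdict(float)
counts = defaultdict(int)
for pid, t, x in rows:
    sums[pid] += x
    counts[pid] += 1
x_bar = {pid: sums[pid] / counts[pid] for pid in sums}

# Group-mean-centered deviations (the "within" component)
decomposed = [(pid, t, x - x_bar[pid], x_bar[pid]) for pid, t, x in rows]

for pid, t, x_within, x_between in decomposed:
    print(pid, t, x_within, x_between)
```

The within and between columns would then enter the model as two separate predictors, which is what allows their coefficients to differ.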
It has to do with u_i, among other things.

I have a question of my own. In the late stages of the review process, we have been in discussions directly with the editors about the model.
To cut a long story short, our previous models were rightly considered biased because of the lagged DepVar. We are leaning towards using xtdpdml, which would address the problem.
However, our DepVar is fractional, insofar as it varies between 0 and 1. Would xtdpdml still be appropriate, or what would you consider the best alternative?

Dear Paul, indeed, my bad. I should have realised that before.

Paul, thank you for the post and for your continued effort to educate.
I have been facing exactly the same issue with the use of a lagged predictor in panel data. If not, is the source code available?

Paul, thanks for your informative page.
Finding this post a while back set me on a path to modelling our longitudinal data in much more responsible and sophisticated ways. I now understand that a lack of such independence biases parameter estimates in characteristic ways, typically inflating the non-independent fixed effects while deflating others.

One other related thing, if I may. What do you think would be a suitable diagnostic test for the presence of such non-independence in an already-fitted model? Could you, say, extract the coefficients for an estimated random effect and look for remaining correlations with the fixed-effects variables also in the model?
Thank you very much for this insightful information and for the command! Would it be possible to estimate a system of seemingly unrelated regressions with FE or RE and a lagged dependent variable?
I guess that the sem command would become extremely long and computationally very difficult to solve, or is there hope? Best, Christian.

Thank you for this insightful post. I am trying to run a random effects model with xtdpdml. I have weekly data for just one year (weeks 7 to 23, with gaps). I lowered the case of all the variable names in the dataset, but that did not solve the issue.
I believe the code is trying to find week 6 data. Can you please help?

Unfortunately, xtdpdml cannot handle data with gaps in the time variable. That variable needs to take the values 1, 2, 3, 4, and so on.
So try recoding the data so that there are no gaps. But if you do have lagged variables, then some lags will apply to longer intervals between observations and others to shorter ones. It would be implausible for the lagged effects to be the same across different interval lengths. An ad hoc solution would be to add interactions between the lagged predictor and a variable capturing the length of the gap.
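A sketch of that recoding, using Python with made-up week numbers (in Stata one would typically use egen with the group() function on the time variable): map the observed, gapped values to consecutive integers, and record the length of the interval preceding each observation so it can be interacted with the lagged predictor.

```python
# Recode a gapped time variable to consecutive integers 1, 2, 3, ...
# and record the gap length preceding each observation, so that the
# lagged predictor can be interacted with the interval length.
# Hypothetical week numbers, for illustration only.
weeks = [7, 8, 10, 13, 14]  # observed, gapped time values

recode = {w: i + 1 for i, w in enumerate(sorted(set(weeks)))}
newtime = [recode[w] for w in weeks]

# Interval between each observation and the previous one
# (undefined for the first observation)
gap = [None] + [weeks[i] - weeks[i - 1] for i in range(1, len(weeks))]

print(newtime)  # [1, 2, 3, 4, 5]
print(gap)      # [None, 1, 2, 3, 1]
```

The recoded variable satisfies xtdpdml's requirement of consecutive time values, while the gap variable preserves the information about unequal spacing for use in interactions.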
To estimate a model for the wage data with xtdpdml, use:

xtset id t
xtdpdml lwage, inv(ed fem blk) errorinv

The inv() option is for time-invariant variables.
Why do simple time series models sometimes outperform regression models fitted to nonstationary data? Remember that if X and Y are nonstationary, we cannot assume that their statistical properties (such as their correlations with each other) are constant over time.
In other words, recent values of Y might be good "proxies" not only for the effect of X but also for the effects of any omitted variables.

How to get the best of both worlds, regression and time series models: stationarize the variables (by differencing, logging, deflating, or whatever) before fitting a regression model.
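A minimal sketch of this stationarize-then-regress idea, assuming a single predictor and made-up data (Python):

```python
# Fit a regression on first differences rather than levels.
# The fitted model predicts the *change* in y; to forecast the
# level of y, add the predicted change to the previous observed y.
# Hypothetical data, constructed so that dy = 2 * dx exactly.
x = [1.0, 2.0, 4.0, 7.0, 11.0]
y = [3.0, 5.0, 9.0, 15.0, 23.0]

dx = [x[i] - x[i - 1] for i in range(1, len(x))]
dy = [y[i] - y[i - 1] for i in range(1, len(y))]

# Simple OLS slope through the origin on the differenced series
beta = sum(a * b for a, b in zip(dx, dy)) / sum(a * a for a in dx)

# One-step-ahead level forecast: the prior y plus the predicted change
x_next_change = 5.0
y_forecast = y[-1] + beta * x_next_change
print(beta, y_forecast)  # 2.0 33.0
```

Note that the level forecast uses both the prior value of Y (as the base) and the change in X, which is exactly why a model fitted in differences still draws on the history of both series.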
Notice that this brings the prior values of both X and Y into the prediction.

Maximizing R-squared is rarely a good model-selection criterion. It is true that although the R-squared values differ a lot, the predicted values are actually the same whether Y or the change in Y is used.
However, given the low R-squared when using the change in Y as the DV, does it mean that the current set of IVs cannot explain the change very well and that there must be some omitted variables?
I obtained the data by digitising a plot, which meant the data were sorted. This sorting, together with the non-linear relationship, caused autocorrelation in the residuals.

Please give us more details, so we know what kind of model we are talking about.