This memo investigates whether the choice between simultaneous and sequential modeling affects prediction accuracy when estimating transition probabilities for housing choices. Specifically, it compares models for the choice of housing type and housing tenure, estimated either as a single joint (simultaneous) distribution or as sequential models that first model housing type and then housing tenure conditional on housing type (or vice versa). All models are estimated using LightGBM on individual-level data from Statistics Denmark. The comparison is based on out-of-sample log-loss and on visual assessments of the marginal distributions from the models and of the joint distributions obtained from simulations.
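The relationship between the two approaches can be illustrated with a minimal sketch. A sequential model yields a joint distribution through the chain rule, P(type, tenure) = P(type) · P(tenure | type), and the log-loss of that joint distribution equals the sum of the log-losses of the two stages. The category names, probabilities, and observations below are hypothetical and chosen only for illustration; they are not taken from the memo's data, and the fixed probabilities stand in for what would in practice be individual-specific LightGBM predictions.

```python
import math

# Hypothetical predicted probabilities (illustrative only, not from the memo).
# Stage 1: marginal distribution over housing type.
p_type = {"house": 0.6, "apartment": 0.4}
# Stage 2: distribution over housing tenure conditional on housing type.
p_tenure_given_type = {
    "house": {"owner": 0.8, "renter": 0.2},
    "apartment": {"owner": 0.25, "renter": 0.75},
}


def joint_from_sequential(p_first, p_second_given_first):
    """Chain rule: P(a, b) = P(a) * P(b | a)."""
    return {
        (a, b): p_a * p_b
        for a, p_a in p_first.items()
        for b, p_b in p_second_given_first[a].items()
    }


def log_loss(prob_of_observed):
    """Mean negative log-probability assigned to the observed outcomes."""
    return -sum(math.log(p) for p in prob_of_observed) / len(prob_of_observed)


joint = joint_from_sequential(p_type, p_tenure_given_type)

# Hypothetical observed (type, tenure) outcomes.
observed = [("house", "owner"), ("apartment", "renter"), ("house", "renter")]

# Log-loss of the implied joint distribution ...
joint_loss = log_loss([joint[o] for o in observed])
# ... and of each stage of the sequential model separately.
type_loss = log_loss([p_type[t] for t, _ in observed])
tenure_loss = log_loss([p_tenure_given_type[t][s] for t, s in observed])
```

Because the joint log-loss decomposes into `type_loss + tenure_loss`, the sequential and simultaneous models can be compared on the same out-of-sample log-loss scale.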
The results show that prediction accuracy is virtually identical across models, with log-loss in the range of 1.77–1.78 and no systematic differences in the estimated or simulated distributions. Computation time, by contrast, differs substantially: the simultaneous model is considerably slower than the sequential models in both training and prediction. For instance, the simultaneous model takes 4–5 times longer to predict housing type and housing tenure than a sequential model does. Overall, no empirical evidence is found that simultaneous modeling yields better predictions in this context, while sequential models emerge as a more computationally efficient alternative.