Tax function estimation #96

Open
jdebacker opened this issue Mar 11, 2024 · 6 comments

Comments

@jdebacker
Member

Beginning with PR #73, which updated the default calibration of OG-USA, we have observed some odd results related to the estimated tax functions. This issue will document what we've noticed in the hopes that we can address any issues with the tax function estimation routines or with the microsimulation model used to calibrate OG-USA (or both).

Things that haven't seemed quite right:

  • In PR #73 (Update calibration), I noted that DEP tax functions estimated with the most recent Tax-Calculator at the time (v3.4.1) produced tax function parameters with which OG-USA could not solve the model SS.
  • Also noted in PR #73: attempts to estimate the mono and mono2D functional forms for the tax functions failed (e.g., no minimum found), again using Tax-Calculator 3.4.1.
  • In OG-USA simulations since October 2023, we've used the GS functional form for the tax functions (with these, the model solves), but we've noticed significant garbage collection and reduced computational performance when solving the model (noted in OG-USA Discussion #83, "Analysis of Dask distributed workloads"). The time to solve the model SS has gone up from about 45 seconds to 15 minutes. Note that when using the tax function parameters in ogusa_default_parameters.json, the warnings and performance reductions pretty much disappear.
@jdebacker
Member Author

Some plots:

DEP functions estimated on Tax-Calculator 3.4.1 (each line is a different age: blues for younger, reds for older):
DEP_new

GS functions estimated on Tax-Calculator 3.5.1:
GS_new

@jdebacker
Member Author

Some key questions:

  1. Are these odd functions an artifact of the microsimulation model output or of txfunc.py (there have been changes to both)?
  2. Are these functions "correct" (i.e., do they fit the data best)?
  3. What is preventing estimation of the mono and mono2D functions?

@jdebacker
Member Author

jdebacker commented Mar 11, 2024

Re (2) above, I don't see how these could be the best fit (albeit, the scatter plot dots do not reflect sampling weights):

ETRs for 40 year olds, DEP functions (tax year 2024):
DEP_age40

MTR on labor income for 40 year olds, DEP functions (tax year 2024):
DEP_age40_mtrx
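
Since the scatter plots above do not reflect sampling weights, one way to get a fairer visual benchmark is to overlay weighted bin averages of the rates. The following is a minimal sketch, not code from txfunc.py; the function name `weighted_bin_means`, the bin edges, and the toy data are all hypothetical.

```python
import numpy as np


def weighted_bin_means(income, rates, weights, bin_edges):
    """Weighted mean tax rate within each income bin (NaN for empty bins)."""
    # np.digitize returns 1 for values in [bin_edges[0], bin_edges[1]), etc.
    bin_index = np.digitize(income, bin_edges) - 1
    n_bins = len(bin_edges) - 1
    means = np.full(n_bins, np.nan)
    for b in range(n_bins):
        in_bin = bin_index == b
        if weights[in_bin].sum() > 0:
            means[b] = np.average(rates[in_bin], weights=weights[in_bin])
    return means


# Toy data (hypothetical): four filers, two income bins.
income = np.array([10.0, 20.0, 60.0, 80.0])
rates = np.array([0.1, 0.2, 0.3, 0.5])
weights = np.array([1.0, 3.0, 1.0, 1.0])
bin_edges = np.array([0.0, 50.0, 100.0])

means = weighted_bin_means(income, rates, weights, bin_edges)
print(means)
```

Plotting these weighted means against the fitted function, rather than the raw dots, would show whether the fit looks poor only because heavily weighted observations are not visible in the scatter.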

@jdebacker
Member Author

I've started looking into the estimation of the tax functions. Some questions I have:

  1. Does the numerical optimization method used to minimize the nonlinear least squares objective matter for our estimates? In particular, as seen above, the fitted functions often leave clear room for a different parameterization that would fit better than what the optimizer returns.
  2. Are we using good starting values in our optimization? Do "better" starting values help reduce variation across age?
  3. Is it particularly difficult to estimate MTRs since they display much more variation than ETRs? And if so, is it better to infer the MTRs from the ETRs? But then, how much of the variation in the data are we missing?
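
On question (3), inferring MTRs from ETRs rests on the identity that total tax is T(x) = x · ETR(x), so MTR(x) = dT/dx = ETR(x) + x · ETR'(x). Below is a minimal sketch of that calculation. It assumes a single income argument and uses a hypothetical saturating ETR function (not the DEP or GS form); the tax functions discussed here may depend on more than one income source, in which case the derivative would be a partial derivative with respect to the relevant income.

```python
import numpy as np


def etr(income, max_rate=0.35, half_income=0.4):
    """Hypothetical ETR function for illustration (not DEP or GS)."""
    return max_rate * income / (income + half_income)


def mtr_from_etr(etr_fn, income, step=1e-6):
    """MTR(x) = d[x * ETR(x)]/dx, approximated by a central difference."""
    tax_hi = (income + step) * etr_fn(income + step)
    tax_lo = (income - step) * etr_fn(income - step)
    return (tax_hi - tax_lo) / (2.0 * step)


def mtr_analytical(income, max_rate=0.35, half_income=0.4):
    """Closed-form derivative of x * ETR(x) for the hypothetical form."""
    return (
        max_rate * income * (income + 2.0 * half_income)
        / (income + half_income) ** 2
    )


# Income in arbitrary units (hypothetical grid).
income_grid = np.linspace(0.05, 2.0, 40)
mtr_numeric = mtr_from_etr(etr, income_grid)
mtr_exact = mtr_analytical(income_grid)
max_gap = np.max(np.abs(mtr_numeric - mtr_exact))
print(max_gap)
```

An MTR inferred this way is as smooth as the fitted ETR function, so it cannot reproduce variation in MTRs that the ETR data do not carry. That is the trade-off raised in the question.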

@jdebacker
Member Author

Re the method of numerical optimization, I'm seeing significant differences across the numerical algorithms used to minimize the nonlinear least squares function. Here are the tax functions for each age estimated using a few different algorithms:

DEP functional form:

L-BFGS-B method

CPS data

CPS_BFGS_mtrx

PUF data

PUF_BFGS_mtrx

SLSQP method

CPS data

DEP_CPS_SLSQP_mtrx

PUF data

DEP_PUF_SLSQP_mtrx

Nelder-Mead method

CPS data

CPS_NM_mtrx

PUF data

PUF_NM_mtrx

GS functional form:

L-BFGS-B method

CPS data

GS_CPS_mtrx

PUF data

GS_PUF_BFGS_mtrx

SLSQP

CPS data

GS_CPS_SLSQP_mtrx

PUF data

GS_PUF_SLSQP_mtrx

Nelder-Mead

CPS data

GS_CPS_NM_mtrx

PUF data

GS_PUF_NM_mtrx

Summary:

  • Quite a bit of variation across datasets (CPS vs. PUF) and ages. Both suggest that the estimates are very sensitive to the underlying data, because there shouldn't be that much variation in tax rates across the two data files (but we can confirm this).
  • The variation across methods used to minimize the statistical objective function also suggests that the parameter estimates are very sensitive to initial values and algorithms and are therefore probably not precisely estimated.
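
One way to quantify this sensitivity is to rerun each algorithm from several starting values and look at the spread of the resulting estimates. The sketch below does this with `scipy.optimize.minimize` on synthetic data. The rate function, the data, the bounds, and the starting-value ranges are all hypothetical stand-ins (this is not the DEP or GS form, and not the objective in txfunc.py).

```python
import numpy as np
from scipy.optimize import minimize


def rate_fn(params, income):
    """Hypothetical two-parameter rate function (not DEP or GS)."""
    max_rate, half_income = params
    return max_rate * income / (income + half_income)


def weighted_sse(params, income, rates, weights):
    """Weighted sum of squared errors between fitted and observed rates."""
    resid = rate_fn(params, income) - rates
    return np.sum(weights * resid**2)


# Synthetic data (assumption): income in arbitrary units, noisy rates.
rng = np.random.default_rng(0)
income = rng.uniform(0.01, 2.0, size=500)
weights = rng.uniform(0.5, 2.0, size=500)
true_params = np.array([0.35, 0.4])
rates = rate_fn(true_params, income) + rng.normal(0.0, 0.02, size=500)

# Ten random starting values drawn inside the bounds.
starts = rng.uniform([0.05, 0.05], [0.9, 2.0], size=(10, 2))
bounds = [(1e-4, 1.0), (1e-4, 10.0)]

spread = {}
for method in ("L-BFGS-B", "SLSQP", "Nelder-Mead"):
    # Bounds are passed only to the two methods that have long supported them.
    kwargs = {} if method == "Nelder-Mead" else {"bounds": bounds}
    estimates = np.array([
        minimize(weighted_sse, x0, args=(income, rates, weights),
                 method=method, **kwargs).x
        for x0 in starts
    ])
    # Standard deviation of each parameter across starting values.
    spread[method] = estimates.std(axis=0)
    print(method, estimates.mean(axis=0), spread[method])
```

A large spread for a given method would point to a flat or multi-modal objective. Applying the same check to the actual tax function objective would show whether the variation across ages comes from the data or from the optimizer.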

@jdebacker
Member Author

ETR function estimation

The above plots are of MTRs on labor income. ETRs seem to be more consistently estimated:

DEP

CPS

DEP_CPS_BFGS_ETR

PUF

DEP_PUF_BFGS_ETR

GS

CPS

GS_CPS_BFGS_etr

PUF

GS_PUF_BFGS_etr

GS, Nelder-Mead, PUF

GS_PUF_NM_etr
