Demonstrate parameter and constraint recovery #70
Comments
FWIW, this is what I had put together previously, @jeff-dotson. However, all of these rely on the use of an ensemble rather than a single model that knows the simulated pathologies exactly. |
A single model to establish the upper bounds confirms what's demonstrated above (at least for homogeneous pathologies; not sure how to get our present setup to manage heterogeneous pathologies with a single model), @jeff-dotson. We should review the big three:
|
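The following updated results use averaged respondent-level Betas rather than the mean of the heterogeneity distribution Gammas when predicting out-of-sample hit rates and hit probability. This is a fairer comparison between the constrained-parameters method we're employing for the conjoint ensemble and the standard HMNL since we are imposing the constraints on the Betas rather than the Gammas.
If we were only concerned with homogeneous pathologies, we could constrain the Gammas directly. However, the bite starts with including heterogeneous pathologies. To alleviate any concerns about this choice, we could include predictions based on both the Betas and the Gammas. This might further demonstrate the limits of flexibility for the standard HMNL, especially for predicting out-of-sample.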
The previous results also appeared to show that the models were increasingly accurate as the underlying pathologies became more complicated. Fortunately these latest results appear to avoid what may have been a coding issue or just an artifact of data simulation. |
Here I try to replicate the conjoint ensemble proof-of-concept with the constraints operating post-hoc in the generated quantities block.
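As a rough illustration of what "post-hoc" constraints could look like, here is a minimal Python sketch that constrains one draw of a respondent's part-worths after estimation, which is the kind of transformation a generated quantities block would perform per draw. The function name, the mask encoding, and the `screen_value` of -100 are assumptions made for illustration, not the project's actual Stan code.

```python
def apply_constraints(beta_draw, ana_mask, screen_mask, screen_value=-100.0):
    """Constrain one posterior draw of part-worths after estimation.

    ana_mask[k] == 1    -> level k is ignored (ANA), so its part-worth becomes 0.
    screen_mask[k] == 1 -> level k is screened out, so it gets a large negative
                           part-worth (screen_value is an assumed placeholder).
    Screening takes precedence when both masks flag the same level.
    """
    constrained = []
    for beta, ana, screen in zip(beta_draw, ana_mask, screen_mask):
        if screen:
            constrained.append(screen_value)
        elif ana:
            constrained.append(0.0)
        else:
            constrained.append(beta)
    return constrained


# Hypothetical draw with four part-worths: level 2 is ignored, level 4 is screened out.
draw = [0.8, -1.2, 0.3, 2.1]
print(apply_constraints(draw, ana_mask=[0, 1, 0, 0], screen_mask=[0, 0, 0, 1]))
```

Because the unconstrained draws are left untouched, the same fitted model can be reused with different masks, which is the appeal of doing this after estimation rather than inside the likelihood.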
|
Very interesting. Thanks for the update.
From: Marc Dotson ***@***.***>
Date: Tuesday, October 31, 2023 at 5:42 PM
To: marcdotson/conjoint-ensembles ***@***.***>
Cc: ***@***.*** # ***@***.***>, Mention ***@***.***>
Subject: Re: [marcdotson/conjoint-ensembles] Demonstrate parameter and constraint recovery (#70)
The following updated results use averaged respondent-level Betas rather than the mean of the heterogeneity distribution Gammas when predicting out-of-sample hit rates and hit probability. This is a fairer comparison between the constrained-parameters method we're employing for the conjoint ensemble and the standard HMNL since we are imposing the constraints on the Betas rather than the Gammas.
If we were only concerned with homogeneous pathologies, we could constrain the Gammas directly. However, the bite starts with including heterogeneous pathologies, as demonstrated below. To alleviate any concerns about this choice, we could include predictions based on both the Betas and the Gammas. This might further demonstrate the limits of flexibility for the standard HMNL, especially for predicting out-of-sample.
   Model                Pathologies  Heterogeneous LOO   `Hit Rate` `Hit Prob`
   <chr>                <chr>        <chr>         <lgl>      <dbl>      <dbl>
 1 HMNL                 None         No            NA         0.582      0.484
 2 Ensemble Upper Bound None         No            NA         0.578      0.482
 3 HMNL                 ANA          No            NA         0.45       0.384
 4 Ensemble Upper Bound ANA          No            NA         0.457      0.383
 5 HMNL                 Screen       No            NA         0.875      0.862
 6 Ensemble Upper Bound Screen       No            NA         0.867      0.854
 7 HMNL                 ANA & Screen No            NA         0.911      0.892
 8 Ensemble Upper Bound ANA & Screen No            NA         0.91       0.881
 9 HMNL                 None         Yes           NA         0.582      0.484
10 Ensemble Upper Bound None         Yes           NA         0.578      0.482
11 HMNL                 ANA          Yes           NA         0.41       0.369
12 Ensemble Upper Bound ANA          Yes           NA         0.421      0.372
13 HMNL                 Screen       Yes           NA         0.546      0.525
14 Ensemble Upper Bound Screen       Yes           NA         0.556      0.552
15 HMNL                 ANA & Screen Yes           NA         0.594      0.551
16 Ensemble Upper Bound ANA & Screen Yes           NA         0.569      0.566
The previous results also appeared to show that the models were increasingly accurate as the underlying pathologies became more complicated. Fortunately these latest results appear to avoid what may have been a coding issue or just an artifact of data simulation.
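For reference, the two out-of-sample metrics in the table can be sketched as follows, assuming the usual definitions: hit rate is the share of holdout tasks where the highest-probability alternative matches the observed choice, and hit probability is the average predicted probability of the chosen alternative. The function names and data layout are hypothetical; the same function could be called with a respondent's averaged Betas or with the Gammas to compare the two prediction choices discussed above.

```python
import math


def choice_probs(design, beta):
    """Multinomial logit probabilities for one choice task.

    design: list of alternatives, each a list of coded attribute levels.
    beta:   list of part-worths, same length as each alternative.
    """
    utils = [sum(x * b for x, b in zip(alt, beta)) for alt in design]
    max_u = max(utils)  # subtract the max for numerical stability
    exp_u = [math.exp(u - max_u) for u in utils]
    total = sum(exp_u)
    return [e / total for e in exp_u]


def hit_metrics(tasks, choices, beta):
    """Return (hit rate, hit probability) over a set of holdout tasks."""
    hits, prob_sum = 0, 0.0
    for design, chosen in zip(tasks, choices):
        probs = choice_probs(design, beta)
        hits += int(probs.index(max(probs)) == chosen)
        prob_sum += probs[chosen]
    n_tasks = len(tasks)
    return hits / n_tasks, prob_sum / n_tasks


# Two hypothetical holdout tasks with two alternatives and two coded levels each.
tasks = [[[1, 0], [0, 1]], [[1, 1], [0, 0]]]
choices = [0, 1]  # zero-based index of the chosen alternative
hit_rate, hit_prob = hit_metrics(tasks, choices, beta=[1.0, -1.0])
print(hit_rate, round(hit_prob, 3))
```

Ties are resolved in favor of the first alternative here, which is an arbitrary choice that can matter when utilities are identical.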
|
@ethanbudge here are the file descriptions for the original conjoint ensemble archive:
I've gone through the code. It's very clean. A few things I found, and I'm interested to see what you find:
|
@jeff-dotson I can't replicate the previous results. |
Interesting. Let’s discuss.
Jeff
From: Marc Dotson ***@***.***>
Date: Tuesday, December 19, 2023 at 1:27 PM
To: marcdotson/conjoint-ensembles ***@***.***>
Cc: ***@***.*** # ***@***.***>, Mention ***@***.***>
Subject: Re: [marcdotson/conjoint-ensembles] Demonstrate parameter and constraint recovery (#70)
I tried to replicate using the information leakage we saw in the initial results. Still not there.
image.png (view on web)<https://github.com/marcdotson/conjoint-ensembles/assets/29615257/87f66858-6727-4840-b6ee-fa182e6406b9>
|
Before returning to more meta-learner alternatives, Jeff remembered an idea we've discussed previously: running the model using the actual constraint matrices used to produce the simulated data via simulation-experiments. This potentially allows us to see the best possible performance for any ensemble. It might help us figure out any problems with how we're implementing pathologies or with the estimation itself.
No Pathologies
Let's start with no pathologies. Without constraints, this is purely parameter recovery:
It appears that the HMNL and the ensemble are able to recover parameters. This is a good sign that things are working as intended.
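One way to put a number on "able to recover parameters" is to compare the true simulated values with posterior draws, for example via the RMSE of the posterior means and the share of true values covered by a central credible interval. This is a minimal sketch under assumed names and an assumed data layout, not the procedure behind the original plots.

```python
import math


def recovery_summary(true_values, draws, level=0.95):
    """Summarize parameter recovery.

    true_values: list of simulated (true) parameter values.
    draws:       draws[k] is a list of posterior draws for parameter k.
    Returns (rmse of posterior means, share of true values inside the interval).
    """
    sq_err, covered = 0.0, 0
    for truth, param_draws in zip(true_values, draws):
        ordered = sorted(param_draws)
        post_mean = sum(ordered) / len(ordered)
        # Simple empirical quantiles; the index is truncated, not interpolated.
        lower = ordered[int((1 - level) / 2 * (len(ordered) - 1))]
        upper = ordered[int((1 + level) / 2 * (len(ordered) - 1))]
        sq_err += (post_mean - truth) ** 2
        covered += int(lower <= truth <= upper)
    n_params = len(true_values)
    return math.sqrt(sq_err / n_params), covered / n_params


# Hypothetical draws: the first parameter is recovered, the second is not.
draws = [[i / 100 for i in range(101)], [2 + i / 100 for i in range(101)]]
rmse, coverage = recovery_summary([0.5, 0.0], draws)
print(round(rmse, 3), coverage)
```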
ANA Only
ANA only with a homogeneous pathology:
This seems to be our initial problem. We aren't recovering parameters as well, and the ensemble, even though it knows the actual constraints, is only doing as well as the HMNL. The fact that the HMNL is doing just as well suggests that there isn't any benefit to using the ensemble here. In other words, either the pathology isn't present (enough) or the HMNL is able to effectively account for ANA on its own as a homogeneous pathology.
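To make the homogeneous versus heterogeneous distinction concrete, here is a hypothetical sketch of simulating an ANA constraint matrix with one row per respondent and one column per parameter. In the homogeneous case every respondent ignores the same parameters; in the heterogeneous case each respondent ignores their own random subset. The function name, arguments, and 0/1 encoding are assumptions for illustration and may not match how the simulation code encodes pathologies.

```python
import random


def simulate_ana_matrix(n_resp, n_params, n_ignored, heterogeneous, seed=42):
    """Return an n_resp x n_params matrix of 0/1 flags (1 = parameter ignored)."""
    rng = random.Random(seed)
    shared = rng.sample(range(n_params), n_ignored)  # used by everyone if homogeneous
    matrix = []
    for _ in range(n_resp):
        if heterogeneous:
            ignored = rng.sample(range(n_params), n_ignored)
        else:
            ignored = shared
        matrix.append([1 if k in ignored else 0 for k in range(n_params)])
    return matrix


homogeneous = simulate_ana_matrix(n_resp=5, n_params=6, n_ignored=2, heterogeneous=False)
heterogeneous = simulate_ana_matrix(n_resp=5, n_params=6, n_ignored=2, heterogeneous=True)
for row in heterogeneous:
    print(row)
```

Checking how often a given column is flagged across respondents is a quick way to see whether a pathology is "present enough" to matter in the simulated data.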
ANA only with a heterogeneous pathology:
Screening Only
Screening only with a homogeneous pathology:
Note the scale: the HMNL is way off compared to the ensembles (and it is parameters 1 and 3 that are being screened on homogeneously). In other words, screening is a pathology that the HMNL can't account for as well as ANA. That said, the ensemble's fit statistics aren't any better. This takes us back to how we actually compute the ensemble hit rates and hit probabilities.
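One common way to combine ensemble members before scoring is to take a weighted average of each member's predicted choice probabilities for a task and then compute hit rate and hit probability from the combined probabilities. The sketch below assumes that approach with equal weights by default; it is not necessarily how the ensemble metrics here are currently computed, and the names are hypothetical.

```python
def ensemble_probs(member_probs, weights=None):
    """Weighted average of member-level choice probabilities for one task.

    member_probs: member_probs[m][a] is member m's probability for alternative a.
    weights:      optional non-negative weights, one per member (equal if omitted).
    """
    n_members = len(member_probs)
    if weights is None:
        weights = [1.0] * n_members
    total_weight = sum(weights)
    n_alts = len(member_probs[0])
    return [
        sum(w * probs[a] for w, probs in zip(weights, member_probs)) / total_weight
        for a in range(n_alts)
    ]


# Two hypothetical members that disagree about a two-alternative task.
members = [[0.6, 0.4], [0.2, 0.8]]
print(ensemble_probs(members))
print(ensemble_probs(members, weights=[3.0, 1.0]))
```

Averaging probabilities and averaging part-worths before predicting generally give different answers, so it is worth being explicit about which one a reported ensemble metric uses.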
Screening only with a heterogeneous pathology:
Respondent Quality Only
Respondent quality only with a homogeneous pathology:
Respondent quality isn't tied to specific attribute levels, so at the very least it's good to see that each of the parameter estimates struggles.
Respondent quality only with a heterogeneous pathology:
ANA and Screening
ANA and screening with homogeneous pathologies:
ANA and screening with heterogeneous pathologies:
Screening and Respondent Quality
Screening and respondent quality with homogeneous pathologies:
Screening and respondent quality with heterogeneous pathologies:
ANA and Respondent Quality
ANA and respondent quality with homogeneous pathologies:
ANA and respondent quality with heterogeneous pathologies:
ANA, Screening, and Respondent Quality
ANA, screening, and respondent quality with homogeneous pathologies:
ANA, screening, and respondent quality with heterogeneous pathologies: