-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
StrainDesign compatibility issues with enzyme-constrained model #18
Comments
Thank you very much for the details. This was very helpful for finding fixing the issue.
The problem is that reactions may get a negative sign and "produce" enzymes. The code below should help you fix this. Until enzyme constraints are integrated in StrainDesign, just use this.
|
Something is going on between ECMpy and cobrapy.
For StrainDesign, we keep the reactions from cobrapy (1) and (3) with correct bounds. Since the reverse reaction has a negative value, reactions (3) can "buy" us enzyme. We can fix this by using a negative enzyme coefficient for the reverse sense, as suggested above. For ECMpy, the reactions kept are (1) and (4). This does not buy us enzyme, but we get reaction (4) for free. I have raised an issue on their github. (tibbdc/ECMpy#1 (comment)) For now, I will probably wait to see how they fix this. I advise you to use the solution I posted before, since you won't get the reverse reactions for free. Should be more accurate. |
If you want to use ECMpy with StrainDesign, also try my fork of ECMpy: https://github.com/VonAlphaBisZulu/ECMpy/tree/master |
Hi Philipp, thank you very much for your extremely prompt response and help. I now have got the chance to test out both of your suggested solutions:
I haven't encountered any other problems, so thank you once again for your massive help with this and the working solutions. |
Hi Jakub, that's great!
|
Hi Philipp, thank you for the information about the ECMpy update. I now tried using StrainDesign with the updated ECMpy code and model, but it seems to have the same original problem of not registering the enzyme constraints. So for now I'm going to stick to using your repository with the modified original ECMpy code that already works for me. Below is a code that should reproduce the error ("ValueError: 'HMPK1_num1' is not in list") I'm getting in the fix scenario 1 discussed above (i.e. using constraints=[coeff_dict2,'<=',rhs]):
|
Thanks, this is very helpful. Hopefully, I can fix this some time soon. |
Hello both :) thank you very much for this elucidating discussion on the issue. I've been thinking about the second fix you proposed (where you implement enzyme constraints as metabolites and reactions). I have been wondering, a key motivating factor in the development of ECMpy is the fact that for other enzyme constraint models like GECKO "introducing the enzyme constraints into the original metabolic model using GECKO needs to be extensively revised by modifying every metabolic reaction with a pseudo-metabolite representing an enzyme and adding hundreds of exchange reactions for enzymes, which is complex and significant increases the model size." (quote from the paper). The idea was that: "we propose a simpler workflow called ECMpy by explicitly introducing an enzyme constraint without modifying existing metabolic reactions or adding new reactions". My question is, am I right in thinking that this workaround essentially undoes that advantage? And make this method more like GECKO (probably specifically GECKO light https://www.nature.com/articles/s41596-023-00931-7 )? Thanks for any help, |
Hey Anna, That is a great question. TLDR: Proposed fix 2 is equivalent to GECKO/ECM, and the key difference is merely a technical one. Performance differences should not exist. Here is why: "hundreds of exchange reactions for enzymes" If the reactions are enzyme constraint, these reactions 'consume' the enzymes (icl and pfl) and then read: ICL: 2* enz_icl + icit_c → glx_c + succ_c so that's n more metabolites and n more reactions and 1 more inequality Here, the magic of GECKO and ECM comes in. They combine all coefficients reaction-wise and only introduce a single inequality to the model, which represents the enzyme-availability. The pre-determined factors that are multiplied to the reactions, take into account the enzyme cost to keep the reaction running per unit of reaction rate (I'm not familiar with the details). The inequality then reads:
Why implement the enzyme-constraint as a pseudo-reaction instead of a constraint:
Yes, they look different, for sure. But since inequalities, like the one used by ECM, are often resolved with slack variables by LP solvers (which read just like an unbound variable in the LP matrix), complexity should not change. You actually don't add any real vertices or degrees of freedom, and after preprocessing by the LP solver, the matrices of both problems might even look 100% identical. Even if the solver fails to iron it out, a genome-scale model with one more reaction should be still fine. Definitely a price I'm willing to pay to make EC models usable within the full COBRA ecosystem. I still have a pull request open (tibbdc/ECMpy#5) that I hope the ECM people accept. That doesn't seem to happen anytime soon. So if you'd like to use ECM with StrainDesign, feel free to use my fork of ECM (https://github.com/VonAlphaBisZulu/ECMpy) that has is implemented that way. |
Thank you so much for that detailed explanation! |
Hi, StrainDesign has worked very well for me when used on basic stoichiometric genome-scale models (e.g. iML1515). However, I am encountering problems when I try to use it on an enzyme-constrained model, specifically eciML1515 from this ECMpy study https://doi.org/10.3390/biom12010065. Unlike standard cobrapy library functions, StrainDesign seems to not be able to pick up on the additional enzyme constraints included in the model and, as a result, the sd.fba, sd.fva and sd.plot_flux_space functions simulate flux distributions identical to those obtained with the basic stoichiometric model, ignoring the enzyme constraints.
I have not been able to overcome this issue on my own. Looking into the source code, it seems to me these constraints get ignored during the "A_eq_base = sparse.csr_matrix(create_stoichiometric_matrix(model))" step within the fva, fba and yopt functions in the straindesign.lptools module, but I might be wrong.
I tried providing the enzyme constraints obtained from "model.constraints" directly into the constraints parameter within the "sd.fba(model, constraints=...)" function, all in a straindesign-friendly format (e.g. '0.00860581989817056 13PPDH2 + 0.00805572834477629 13PPDH2_reverse + 0.00134590026565838 23PDE2pp + ... <= 0.227'), but that didn't work either.
It would be great if there was a way to consider additional enzyme-constraints in straindesign simulations and strain optimization methods.
Below, I included a simple test to reproduce the different treatment of enzyme-constrained model eciML1515 by standard cobrapy and by straindesign, which can be launched from the ECMpy repository folder here https://github.com/tibbdc/ECMpy/tree/master.
import straindesign as sd
import cobra
import sys
sys.path.append(r'./code/')
from cobrapy_ec_model_function import *
#load enzyme-constrained model
json_model_path="./model/iML1515_irr_enz_constraint_adj_round2.json"
enz_model=get_enzyme_constraint_model(json_model_path)
#load stoichiometric model
norm_model=cobra.io.json.load_json_model(json_model_path)
#FBA via cobra - differentiates between enzyme-constrained and basic model
fba_sol__cobra_enz = enz_model.optimize()
print(fba_sol__cobra_enz.fluxes['BIOMASS_Ec_iML1515_core_75p37M'], fba_sol__cobra_enz.fluxes['EX_ac_e'])
fba_sol__cobra_norm = norm_model.optimize()
print(fba_sol__cobra_norm.fluxes['BIOMASS_Ec_iML1515_core_75p37M'], fba_sol__cobra_norm.fluxes['EX_ac_e'])
#0.6802441208853646 7.711579625195041
#0.8697726420320148 0.0
#FBA via straindesign - simulates enzyme-constrained model identically to basic model
fba_sol__sd_enz = sd.fba(enz_model)
print(fba_sol__sd_enz.fluxes['BIOMASS_Ec_iML1515_core_75p37M'], fba_sol__sd_enz.fluxes['EX_ac_e'])
fba_sol__sd_norm = sd.fba(norm_model)
print(fba_sol__sd_norm.fluxes['BIOMASS_Ec_iML1515_core_75p37M'], fba_sol__sd_norm.fluxes['EX_ac_e'])
#0.869772642032 0.0
#0.869772642032 0.0
#straindesign production envelopes also look identical for both enzyme-constrained and basic model
sd.plot_flux_space(enz_model,('BIOMASS_Ec_iML1515_core_75p37M','EX_ac_e'));
sd.plot_flux_space(norm_model,('BIOMASS_Ec_iML1515_core_75p37M','EX_ac_e'));
The text was updated successfully, but these errors were encountered: