Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: various minor fixes #530

Merged
merged 23 commits into from
Apr 26, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
23 commits
Select commit Hold shift + click to select a range
acd91f1
fix: simplifyModel w/ backwards-only irrev rxns
edkerk Feb 26, 2024
ee42300
fix: writeYAMLmodel do not write empty lines
edkerk Mar 8, 2024
acd3cd0
fix: getModelFromKEGG essential fields
edkerk Mar 8, 2024
42c809d
feat: printFluxes better error with empty fluxes
edkerk Mar 8, 2024
3c078ab
fix: getGenesFromGrRules if '|' in gene
edkerk Mar 8, 2024
4d7853d
fix: getGenesFromGrRules correct earlier fix
edkerk Mar 11, 2024
48ff147
fix: getModelFromHomology remove geneFrom field
edkerk Mar 14, 2024
ff086b4
refactor: reduce default verbosity
edkerk Mar 14, 2024
4f4354a
fix: writeYAMLmodel allow empty id and name fields
edkerk Mar 14, 2024
2966725
doc: updateDocumentation
edkerk Mar 14, 2024
f8b0c44
fix: ravenCobraWrapper prefers to use grRules
edkerk Mar 14, 2024
10d0055
feat: optimizeProb error solving milp with glpk
edkerk Mar 27, 2024
9421480
fix: adjust tests to separate solvers
edkerk Mar 27, 2024
3ce4fd8
fix: writeYAMLmodel error if dir does not exist
edkerk Mar 28, 2024
027b433
feat: parallel randomSampling
edkerk Apr 11, 2024
70fe6d6
feat: setParam has 'unc' as option
edkerk Apr 11, 2024
f2b4a23
feat: parallelPoolRAVEN handles parallelization
edkerk Apr 11, 2024
8918962
feat: use alternative ProgressBar function
edkerk Apr 11, 2024
f2c8278
refactor: predictLocalization builtin randsample
edkerk Apr 25, 2024
5af66b4
fix: mapCompartments correct horizontal concatenation
simas232 Apr 26, 2024
b2ded2e
feat: give execution rights in Terminal for new RAVEN functions
simas232 Apr 26, 2024
5f723c7
doc: update GLPK readme file
simas232 Apr 26, 2024
9bc343f
feat: add GLPK compatibility with Apple Silicon
simas232 Apr 26, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Empty file modified core/convertCharArray.m
100644 → 100755
Empty file.
Empty file modified core/findRAVENroot.m
100644 → 100755
Empty file.
116 changes: 41 additions & 75 deletions core/getAllowedBounds.m
Original file line number Diff line number Diff line change
Expand Up @@ -3,25 +3,30 @@
% Returns the minimal and maximal fluxes through each reaction.
%
% Input:
% model a model structure
% rxns either a cell array of reaction IDs, a logical vector with the
% same number of elements as reactions in the model, or a vector
% of reaction indexes (opt, default model.rxns)
% runParallel make use of MATLAB parallel pool to speed up calculations. Not
% beneficial if only a limited number of reactions are simulated.
% (opt, default true)
% model a model structure
% rxns either a cell array of reaction IDs, a logical vector
% with the same number of elements as reactions in the
% model, or a vector of reaction indexes (opt, default
% model.rxns)
% runParallel speed up calculations by parallel processing. This is
% not beneficial if allowed bounds are calculated for
% only a few reactions, as the overhead of parallel
% processing will take longer. It requires MATLAB
% Parallel Computing Toolbox. If this is not installed,
% the calculations will not be parallelized, regardless
% what is indicated as runParallel. (opt, default true)
%
% Output:
% minFluxes minimal allowed fluxes
% maxFluxes maximal allowed fluxes
% exitFlags exit flags for min/max for each of the reactions. True if it was
% possible to calculate a flux
% minFluxes minimal allowed fluxes
% maxFluxes maximal allowed fluxes
% exitFlags exit flags for min/max for each of the reactions. True
% if it was possible to calculate a flux
%
% NOTE: In cases where no solution can be calculated, NaN is returned.
%
% Usage: [minFluxes, maxFluxes, exitFlags] = getAllowedBounds(model, rxns, runParallel)

if nargin<2
if nargin<2 || isempty(rxns)
rxns = 1:numel(model.rxns);
elseif ~islogical(rxns) && ~isnumeric(rxns)
rxns = convertCharArray(rxns);
Expand All @@ -30,78 +35,39 @@
if nargin<3
runParallel = true;
end
if runParallel
addonList = matlab.addons.installedAddons;
if ~any(strcmpi(addonList.Name,'Parallel Computing Toolbox'))
disp('Cannot find MATLAB Parallel Computing Toolbox, process is not parallelized.')
runParallel = false;
end
end

[ps, oldPoolAutoCreateSetting] = parallelPoolRAVEN(runParallel);

minFluxes = zeros(numel(rxns),1);
maxFluxes = zeros(numel(rxns),1);
exitFlags = zeros(numel(rxns),2);
c = zeros(numel(model.rxns),1);

p = 1;
progressbar('Calculating the minimal and maximal fluxes')
if runParallel
D = parallel.pool.DataQueue;
afterEach(D, @updateProgressParallel);
parfor i = 1:numel(rxns)
tmpModel = model;
tmpModel.c = c;

% Get minimal flux
tmpModel.c(rxns(i)) = -1;
solMin = solveLP(tmpModel);
if ~isempty(solMin.f)
minFluxes(i) = solMin.x(rxns(i));
else
minFluxes(i) = NaN;
end
PB = ProgressBar2(numel(rxns),'Running getAllowedBounds','cli');
parfor i = 1:numel(rxns)
count(PB)
tmpModel = model;
tmpModel.c = c;

% Get maximal flux
tmpModel.c(rxns(i)) = 1;
solMax=solveLP(tmpModel);
exitFlags(i,:) = [solMin.stat solMax.stat];
if ~isempty(solMax.f)
maxFluxes(i) = solMax.x(rxns(i));
else
maxFluxes(i) = NaN;
end
send(D, i);
% Get minimal flux
tmpModel.c(rxns(i)) = -1;
solMin = solveLP(tmpModel);
if ~isempty(solMin.f)
minFluxes(i) = solMin.x(rxns(i));
else
minFluxes(i) = NaN;
end
else
for i = 1:numel(rxns)
tmpModel = model;
tmpModel.c = c;

% Get minimal flux
tmpModel.c(rxns(i)) = -1;
solMin=solveLP(tmpModel);
if ~isempty(solMin.f)
minFluxes(i) = solMin.x(rxns(i));
else
minFluxes(i) = NaN;
end

% Get maximal flux
tmpModel.c(rxns(i)) = 1;
solMax = solveLP(tmpModel);
exitFlags(i,:) = [solMin.stat solMax.stat];
if ~isempty(solMax.f)
maxFluxes(i) = solMax.x(rxns(i));
else
maxFluxes(i) = NaN;
end
progressbar(i/numel(rxns))
% Get maximal flux
tmpModel.c(rxns(i)) = 1;
solMax=solveLP(tmpModel);
exitFlags(i,:) = [solMin.stat solMax.stat];
if ~isempty(solMax.f)
maxFluxes(i) = solMax.x(rxns(i));
else
maxFluxes(i) = NaN;
end
end
progressbar(1) % Make sure it closes

function updateProgressParallel(~)
progressbar(p/numel(rxns))
p = p + 1;
end
% Reset original Parallel setting
ps.Pool.AutoCreate = oldPoolAutoCreateSetting;
end
5 changes: 3 additions & 2 deletions core/getGenesFromGrRules.m
100644 → 100755
Original file line number Diff line number Diff line change
Expand Up @@ -31,7 +31,7 @@
end

% check if the grRules use written or symbolic boolean operators
if any(contains(grRules,{'&','|'}))
if any(contains(grRules,{' & ',' | '}))
% fix some potential missing spaces between parentheses and &/|
grRules = regexprep(grRules,'\)&',') &'); % ")&" -> ") &"
grRules = regexprep(grRules,'&\(','& ('); % "&(" -> "& ("
Expand All @@ -50,11 +50,12 @@
end

% extract list of genes from each reaction
rxnGenes = cellfun(@(r) unique(regexp(r,'[^&|\(\) ]+','match')),grRules,'UniformOutput',false);
rxnGenes = cellfun(@(r) regexprep(unique(strsplit(r,{' | ',' & '})),'[\(\) ]+',''),grRules,'UniformOutput',false);

% construct new gene list
nonEmpty = ~cellfun(@isempty,rxnGenes);
genes = unique([rxnGenes{nonEmpty}]');
genes(cellfun(@isempty,genes)) = [];

if ~isempty(originalGenes)
if ~isequal(sort(originalGenes), sort(genes))
Expand Down
2 changes: 1 addition & 1 deletion core/getMinNrFluxes.m
Original file line number Diff line number Diff line change
Expand Up @@ -135,7 +135,7 @@
prob=rmfield(prob,{'blx','bux','blc','buc'});

% Optimize the problem
res = optimizeProb(prob,params);
res = optimizeProb(prob,params,false);
isFeasible=checkSolution(res);

if ~isFeasible
Expand Down
5 changes: 5 additions & 0 deletions core/getModelFromHomology.m
Original file line number Diff line number Diff line change
Expand Up @@ -115,6 +115,11 @@
if isfield(models{i},'geneMiriams')
models{i}=rmfield(models{i},'geneMiriams');
end
%The geneFrom field also loses meaning if the genes are replaced by
%orthologs
if isfield(models{i},'geneFrom')
models{i}=rmfield(models{i},'geneFrom');
end
end

%Check that genes do not begin with ( or end with ), as this makes problematic grRules
Expand Down
2 changes: 1 addition & 1 deletion core/mapCompartments.m
Original file line number Diff line number Diff line change
Expand Up @@ -69,7 +69,7 @@
%Check that there are no compartments in the rules that are not in the
%geneScoreStructure.
uComps=upper(geneScoreStructure.compartments);
J=[uComps;{'OTHER'}];
J=[uComps,{'OTHER'}];

if ~isempty(setdiff([toKeep;toMerge],J))
EM='There are compartment in the rules that are not in geneScoreStructure.compartments';
Expand Down
53 changes: 53 additions & 0 deletions core/parallelPoolRAVEN.m
Original file line number Diff line number Diff line change
@@ -0,0 +1,53 @@
function [ps, oldPoolAutoCreate] = parallelPoolRAVEN(runParallel)
% handleParallelRAVEN
% Called by RAVEN functions that support parallel processing, to confirm
% whether the MATLAB Parallel Computing Toolbox is installed.
% - The toolbox is installed, and runParallel == true, ==> a parallel
% pool is started.
% - The toolbox is installed, but runParallel == false, ==> the auto-
% creation of a parallel pool is disabled, to prevent "parfor" in
% the target function to start a pool anyway.
% - The toolbox is not installed, and runParallel == true, ==> a warning
% is displayed that parallel computer is not possible.
% - The toolbox is not installed, and runParallel == false, ==> the
% target runs as intended, as "parfor" will automatically run in serial
% mode instead.
%
% Input:
% runParallel logical, whether the target function (which calls
% parallelPoolRAVEN) should be run in parallel (opt,
% default true)
%
% Output:
% ps parallel settings structure that will be used by
% the target function
% oldPoolAutoCreate logical, to reset the original ps.Pool.AutoCreate
% setting once the target function has finished
%
% Use: [ps, oldPoolAutoCreate] = parallelPoolRAVEN(runParallel)

if nargin<1 || isempty(runParallel)
runParallel = true;
end

addonList = matlab.addons.installedAddons;
ps = []; oldPoolAutoCreate = [];
if ~any(strcmpi(addonList.Name,'Parallel Computing Toolbox'))
if runParallel % User wants parallel, but will not be possible
disp('Cannot find MATLAB Parallel Computing Toolbox, process is not parallelized.')
end
else
if ~runParallel % User has Parallel toolbox, but does not want pool to start.
% If pool is already running, then it will use the max workers
% anyway, but this is probably okay.
ps = parallel.Settings;
oldPoolAutoCreate = ps.Pool.AutoCreate;
ps.Pool.AutoCreate = false;
else
pool = gcp('nocreate');
if isempty(pool)
parpool('IdleTimeout',120)
end
end
end
end
119 changes: 119 additions & 0 deletions core/predictLocalization.m
Original file line number Diff line number Diff line change
Expand Up @@ -1005,3 +1005,122 @@
transportCost=sum(transportCost(I));
score=geneScore-transportCost;
end

% To avoid dependency on stats toolbox, use this alternative implementation
% of randsample, source:
% https://github.com/gpeyre/numerical-tours/blob/dacee30081c04ef5f67b26b387ead85f2b193af9/matlab/toolbox_signal/randsample.m
function y = randsample(n, k, replace, w)
%RANDSAMPLE Random sample, with or without replacement.
% Y = RANDSAMPLE(N,K) returns Y as a vector of K values sampled uniformly
% at random, without replacement, from the integers 1:N.
%
% Y = RANDSAMPLE(POPULATION,K) returns K values sampled uniformly at
% random, without replacement, from the values in the vector POPULATION.
%
% Y = RANDSAMPLE(...,REPLACE) returns a sample taken with replacement if
% REPLACE is true, or without replacement if REPLACE is false (the default).
%
% Y = RANDSAMPLE(...,true,W) returns a weighted sample, using positive
% weights W, taken with replacement. W is often a vector of probabilities.
% This function does not support weighted sampling without replacement.
%
% Example: Generate a random sequence of the characters ACGT, with
% replacement, according to specified probabilities.
%
% R = randsample('ACGT',48,true,[0.15 0.35 0.35 0.15])
%
% See also RAND, RANDPERM.

% Copyright 1993-2008 The MathWorks, Inc.
% $Revision: 1.1.4.3 $ $Date: 2008/12/01 08:09:34 $

if nargin < 2
error('stats:randsample:TooFewInputs','Requires two input arguments.');
elseif numel(n) == 1
population = [];
else
population = n;
n = numel(population);
if length(population)~=n
error('stats:randsample:BadPopulation','POPULATION must be a vector.');
end
end

if nargin < 3
replace = false;
end

if nargin < 4
w = [];
elseif ~isempty(w)
if length(w) ~= n
if isempty(population)
error('stats:randsample:InputSizeMismatch',...
'W must have length equal to N.');
else
error('stats:randsample:InputSizeMismatch',...
'W must have the same length as the population.');
end
else
p = w(:)' / sum(w);
end
end

switch replace

% Sample with replacement
case {true, 'true', 1}
if isempty(w)
y = ceil(n .* rand(k,1));
else
[dum, y] = histc(rand(k,1),[0 cumsum(p)]);
end

% Sample without replacement
case {false, 'false', 0}
if k > n
if isempty(population)
error('stats:randsample:SampleTooLarge',...
'K must be less than or equal to N for sampling without replacement.');
else
error('stats:randsample:SampleTooLarge',...
'K must be less than or equal to the population size.');
end
end

if isempty(w)
% If the sample is a sizeable fraction of the population,
% just randomize the whole population (which involves a full
% sort of n random values), and take the first k.
if 4*k > n
rp = randperm(n);
y = rp(1:k);

% If the sample is a small fraction of the population, a full sort
% is wasteful. Repeatedly sample with replacement until there are
% k unique values.
else
x = zeros(1,n); % flags
sumx = 0;
while sumx < k
x(ceil(n * rand(1,k-sumx))) = 1; % sample w/replacement
sumx = sum(x); % count how many unique elements so far
end
y = find(x > 0);
y = y(randperm(k));
end
else
error('stats:randsample:NoWeighting',...
'Weighted sampling without replacement is not supported.');
end
otherwise
error('stats:randsample:BadReplaceValue',...
'REPLACE must be either true or false.');
end

if ~isempty(population)
y = population(y);
else
y = y(:);
end
end
5 changes: 4 additions & 1 deletion core/printFluxes.m
Original file line number Diff line number Diff line change
Expand Up @@ -63,7 +63,10 @@ function printFluxes(model, fluxes, onlyExchange, cutOffFlux, outputFile,outputS
else
metaboliteList=convertCharArray(metaboliteList);
end
if size(fluxes,1)~=numel(model.rxns)
if isempty(fluxes)
EM='Empty vector of fluxes, solveLP possibly returned infeasible';
dispEM(EM);
elseif size(fluxes,1)~=numel(model.rxns)
EM='The number of fluxes and the number of reactions must be the same';
dispEM(EM);
end
Expand Down
Empty file modified core/printOrange.m
100644 → 100755
Empty file.
Loading
Loading