Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FileFormats.MPS]: avoid creating list of variable names #2426

Merged
merged 5 commits into from
Feb 14, 2024
Merged

Conversation

mlubin
Copy link
Member

@mlubin mlubin commented Feb 10, 2024

The MPS writer has quadratic-ish performance because it creates GC-tracked strings for every variable and constraint. This PR shows one way we could skip this: generate the generic names on demand.

I didn't implement it for constraints because they need fancier code to deal with the constraint type parameters efficiently.

Potential alternatives:

  • Use fixed-length strings as provided by WeakRefStrings
  • Use WeakRefStrings.StringArray

Before:

148.861351 seconds (472.59 M allocations: 38.948 GiB, 40.91% gc time, 2.87% compilation time)

After:

109.157014 seconds (552.84 M allocations: 40.555 GiB, 21.95% gc time, 3.86% compilation time)

If we extrapolate, the version that does this approach for constraints would be around 70 sec, so we'd get a 2x speed up on this one example and better asymptotic performance.

@odow
Copy link
Member

odow commented Feb 10, 2024

I didn't think to test this. I just assumed that one list of strings would be better than needing to repeatedly query VariableName. Julia really doesn't like lots of Strings...

@mlubin
Copy link
Member Author

mlubin commented Feb 10, 2024

Querying VariableName is a problem too, that still forces you to create one string per variable.

@odow
Copy link
Member

odow commented Feb 10, 2024

that still forces you to create one string per variable.

Yeah, which explains the increase in allocations. But I guess they are short-lived. The GC just doesn't like long-lived GC objects.

@mlubin
Copy link
Member Author

mlubin commented Feb 10, 2024

To clarify, the issue isn't creating one string per variable, the issue is having O(num_variable) strings in existence simultaneously.

@odow
Copy link
Member

odow commented Feb 10, 2024

the issue is having O(num_variable) strings in existence simultaneously.

Yeah.

I didn't implement it for constraints

Is there a similar thing to do for the constraints? Perhaps just changing the generic_names?

@mlubin
Copy link
Member Author

mlubin commented Feb 10, 2024

We need a type-stable function mapping a ConstraintIndex{F,S} to its generic name.

@mlubin
Copy link
Member Author

mlubin commented Feb 11, 2024

I hacked at this a bit more, but the constraint side is harder than I expected. The problem is the string identifiers used in the constraint matrix transpose code:

coefficients = Vector{Tuple{String,Float64}}[
Tuple{String,Float64}[] for _ in ordered_names
]

I tried something like:

Dict{Tuple{Type,Type}, Vector{Vector{Tuple{Int64, Float64}}}}()

where we first breakdown by (F,S) then list the coefficients by column. This isn't great because requires a fair amount more memory usage and didn't seem to make things faster anyway.

We really just want to have all constraint types to share the same index space and just do a sparse matrix transpose to get the column-wise format, similar to the type of manipulations that solver interfaces do.

That's about as far as I can push for now.

@odow
Copy link
Member

odow commented Feb 11, 2024

The flat linear constraint space was what I started to do in #2422.

But now I know the big Vector{String}s are a bad idea, let me have another go. It's probably why I got only a modest improvement.

@odow odow marked this pull request as ready for review February 13, 2024 04:18
@odow odow changed the title MPS: avoid creating list of variable names [FileFormats.MPS]: avoid creating list of variable names Feb 13, 2024
Copy link
Member

@odow odow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. I made a small change. You should also try JuMP.set_string_names_on_creation(model, false).

@mlubin
Copy link
Member Author

mlubin commented Feb 13, 2024

Is the plan to land this instead of #2422?

@odow
Copy link
Member

odow commented Feb 13, 2024

Is the plan to land this instead of #2422?

I think yes. For now, this PR is an improvement. We should only move forward with #2422 if it is demonstrably better than this. #2422 is also missing a lot of things like support for quadratics, SOS, indicator, and reading.

@mlubin
Copy link
Member Author

mlubin commented Feb 14, 2024

Fair enough. Do we know what's going on with the nightly CI failure?

@odow
Copy link
Member

odow commented Feb 14, 2024

No, there have been a few failures. Also in MutableArithmetics: https://github.com/jump-dev/MutableArithmetics.jl/actions/runs/7892806791/job/21540090078?pr=257. I'll take a look.

@odow odow mentioned this pull request Feb 14, 2024
7 tasks
@mlubin mlubin merged commit 72d23e5 into master Feb 14, 2024
13 of 16 checks passed
@mlubin mlubin deleted the ml/mps branch February 14, 2024 02:49
@odow
Copy link
Member

odow commented Feb 14, 2024

Your first commit to MOI in a while!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants