Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update the factor databases #14

Open
Stefan-Kohl opened this issue Jan 19, 2018 · 12 comments
Open

Update the factor databases #14

Stefan-Kohl opened this issue Jan 19, 2018 · 12 comments

Comments

@Stefan-Kohl
Copy link
Member

The factor databases shipped with FactInt have not been updated since years. For example, the database of factors of integers of the form b^k +/- 1 has been updated for the last time in June 2011. Depending on how much people need additional factorizations of such numbers which have been found since then in the corresponding factorization projects, one could update the databases again. Of course one needs to take into account the additional storage consumption when storing more factors.

By the way, the URL of the database of factors of integers of the form b^k +/- 1 has changed since its mentioning in lib/general.gi. It is now http://myfactors.mooo.com/ .

@frankluebeck
Copy link
Member

A slightly improved FetchBrentFactors and an update of the data is provided in the pull request "Update Brent factors" (#23).

This data collection has grown considerably since its last update in FactInt. In my applications I only need the factors of p^k - 1 where p is a prime. But the other factors may be useful in other applications. The source of the data is growing and updated by Jonathan Crombie every few months. I'm not sure if storage space is a concern nowadays.

@olexandr-konovalov
Copy link
Member

olexandr-konovalov commented Aug 13, 2021

Oh that's nice @frankluebeck, thanks!

@olexandr-konovalov
Copy link
Member

One can possibly make .gz of all files in tables/brent to address the problem of growing space (CC @fingolfin)

@frankluebeck
Copy link
Member

Yes, gzipping all files in that directory shrinks it from 350 MB to 130 MB. But this does not shrink the distribution archive.
Is the distribution size really a problem today?
The applications I'm aware of are finding multiplicative orders of finite field elements and, more generally, matrices over finite fields. For this it would be sufficient to have the Brent data for prime bases (numbers p^k-1, p prime). These data need 46 MB (17 MB gzipped).

One could download and cache the files only when they are needed, similar to data in AtlasRep and maybe provide a utility function to download everything. But that is more work. (This approach could be useful for other packages as well).

@olexandr-konovalov
Copy link
Member

If we want to make a new release incorporating PR #23, we should be ready to the 20-fold increase of the package distribution size. FactInt-master.zip is now 116 Mb, and tar.gz will be smaller, but still a lot. To compare, FactInt-1.6.3.zip was 5 Mb.

@fingolfin
Copy link
Member

Yeah this growth is really unfortunate. We just spent a lot of effort together with the simpcomp maintainer to slim down their package, and this is all eaten up (and more) by the growth of FactInt, for a change that I think is not that useful for most GAP users (no offense intended! I am sure it's very useful for something, just not for "most").

Some other package download such data only on demand, perhaps FactInt could do something like this?

And of course we may need the whole concept of our package distribution...

@hulpke
Copy link

hulpke commented Apr 29, 2022

The same issue is avoided in transgrp by providing groups of degree 32 and 48 only as optional download. I think the same would be sensible here. Storage space still is an issue for users in a a managed (ie.e. a sysadmin who needs to approve) setting.

@frankluebeck
Copy link
Member

Above I have made one proposal to reduce the distribution size (to about 13% of the discussed growth) by only including factors where the base is a prime.

Another possibility: do not distribute any new factors, and instead add and document a utility function like

UpdateBrentFactors := function()
  FetchBrentFactors(
       "https://maths-people.anu.edu.au/~brent/ftp/factors/factors.gz",
       false);
  FetchBrentFactors(
     "http://myfactorcollection.mooo.com:8090/brentdata/Aug4_2021/factors.gz",
     true);
end;

(the date part of the latter call must be updated from time to time)
This function must be called by a user with write permissions in the directory <FactIntDir>/tables/brent/, it only works on UNIX, and needs the external program curl.

PS: I think "is not that useful for most GAP users" is not a valid criterion, as long as "most GAP users" does not mean "all GAP users", because this holds for almost everthing we distribute with GAP.

@olexandr-konovalov
Copy link
Member

olexandr-konovalov commented May 3, 2022

Ok, so I am delaying the release until this will be resolved, in one way or another.

@olexandr-konovalov
Copy link
Member

It would be nice to tackle the issue of the large increase of the FactInt download. I would not like to make a release until then.

@frankluebeck
Copy link
Member

A decision is needed about a default set of factors distributed with the package. This could be just a set distributed with former versions of FactInt.
Conceptually nicer: define and document a set of bases x and maximal exponents k such that all known factors of x^i - 1 for i <= k are included in the distribution. Of course, this would need to be updated from time to time.

The database of factors can be extended to all known factors of numbers of this kind by the following function:

LoadPackage("StandardFF");
UpdateBrentFactors := function()
  FetchMoreFactors(
       "https://maths-people.anu.edu.au/~brent/ftp/factors/factors.gz",
       false);
  FetchMoreFactors(
     "http://myfactorcollection.mooo.com:8090/brentdata/Dec31_2022/factors.gz",
     true);
end;

Instead of FetchMoreFactors from the StandardFF package one could use the improved FetchBrentFactors from b1621ab.

@olexandr-konovalov
Copy link
Member

I would go for an additional set to download from FactInt page and instructions how to install it. Downloading from URLs of others where we don't have any control is not sustainable in the long run.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants