# prime-gap

A fast prime gap searching suite.
This is a suite of tools (`combined_sieve`, `gap_stats`, `gap_test.py`, and `gap_test_gpu`) which are useful for searching for prime gaps centered around `m * P#/d`.
The flow is:

1. Sieve many intervals with `combined_sieve --save -u <filename>`.
2. Calculate statistics about these intervals with `gap_stats --save -u <filename>`.
3. Test some portion of these intervals with `gap_test.py -u <filename>.txt --prp-top-percent 25`.
To quickly test a known 30-merit prime gap (the Setup section below must already be complete):

```bash
$ make
$ sqlite3 test.db < schema.sql
$ FN=809_1841070_11244000_1000_s15000_l200M.txt
$ ./combined_sieve --save --search-db test.db -u $FN
$ ./gap_stats --save --search-db test.db -u $FN --min-merit 20
$ ./gap_test.py --search-db test.db -u $FN
...
23142 30.1485 11244911 * 809#/1841070 -5788 to +17354 RECORD!
...
```
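For a sanity check of what that record line means, here is a minimal standalone sketch (not part of the suite; it assumes the gmpy2 dependency from the Setup section) that recomputes the gap and merit. Checking the endpoints alone does not verify the record; a full check also PRP-tests every interior candidate.

```python
import gmpy2

# Recompute "23142 30.1485 11244911 * 809#/1841070 -5788 to +17354":
# both endpoints are (probable) primes and merit = gap / log(N).
m, P, d = 11244911, 809, 1841070
K = gmpy2.primorial(P) // d          # P#/d (d divides 809#)
center = m * K

assert gmpy2.is_prime(center - 5788)   # previous (probable) prime
assert gmpy2.is_prime(center + 17354)  # next (probable) prime

gap = 5788 + 17354                     # 23142
print(gap, gap / gmpy2.log(center))    # merit ~ 30.1485
```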
## combined_sieve

The `--method1` parameter changes how the sieve is performed.
```bash
$ make combined_sieve
$ time ./combined_sieve [--method1] -p <p> -d <d> --mstart <m_start> --minc <m_inc> --max-prime <LIMIT>M [--sieve-length <SL>] --save-unknowns
```
Method1 performs a large initial setup phase to find the first `mi` that each prime divides before starting any sieving. Each `mi` is then processed in order, looking at small primes and only those large primes that divide a number in the sieve interval.
A large percentage of the time will be spent watching dots cross the screen:

```
Calculating first m each prime divides
.
(later)
.........
(eventually)
......................................
```
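For intuition, here is a brute-force sketch (a hypothetical helper, not the repo's code) of what that setup phase computes for each prime; the real code gets the same answer in roughly O(log p) via `modulo_search_euclid_gcd(...)`.

```python
def first_mi_divided(p, K, mstart, minc, sl):
    """First i in [0, minc) where p divides some (mstart + i)*K + o, |o| <= sl.

    Brute-force stand-in for modulo_search_euclid_gcd(...).
    """
    kp = K % p
    r = (mstart * K) % p
    for i in range(minc):
        # r == (mstart + i) * K mod p; a multiple of p lies in the sieve
        # interval iff r <= sl (offset -r) or r >= p - sl (offset p - r).
        if r <= sl or r >= p - sl:
            return i
        r = (r + kp) % p
    return None  # p divides nothing in this m range
```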
Method2 inverts this process and keeps all sieve intervals in memory at the same time: primes are iterated in order and all sieve intervals are handled simultaneously (a brute-force sketch follows the list below).
In practice Method2 is better; Method1 is only used to validate results.
- Status output is significantly nicer in Method2.
- Stats are computed at regular intervals, giving better insight into `--max-prime`.
- Method2 allows for early stopping: set a very large `--max-prime` and press Ctrl+C once when you want to stop.
- Method2 is slightly faster because it uses `modulo_search_euclid_all(...)` then a lookup of the precomputed gcd.
  - Method1 uses `modulo_search_euclid_gcd(...)`, which calls `gcd(m, D)`; this is significantly slower than the lookup.
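A brute-force sketch of Method2's inverted loop, under simplifying assumptions (the helper is hypothetical; the real sieve uses `modulo_search_euclid_all(...)` to jump directly to the m's each prime hits instead of scanning every m):

```python
def sieve_method2(primes, K, mstart, minc, sl):
    # One composite bitmap per m, all held in memory at once.
    composite = {m: [False] * (2 * sl + 1)
                 for m in range(mstart, mstart + minc)}
    for p in primes:                      # primes iterated in order
        kp = K % p
        for m in composite:               # real code visits only the hits
            r = (m * kp) % p              # m*K mod p
            off = (-r) % p                # offsets hit are off + j*p
            while off - p >= -sl:         # back up to the first offset >= -sl
                off -= p
            for o in range(off, sl + 1, p):
                composite[m][o + sl] = True
    # Anything still False is an "unknown" written to the output file.
    return composite
```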
```bash
$ make combined_sieve
$ time ./combined_sieve -p 907 -d 210 --mstart 1 --minc 1000 --max-prime 1000 --save-unknowns --method1 -q
```

`combined_sieve --method1` output:
```
AUTO SET: sieve length: 7554 (coprime: 340, prob_gap longer 0.79%)
Testing m * 907#/210, m = 1 + [0, 1,000)
Using 33860 primes for SMALL_PRIME_LIMIT(400,000)
Calculating first m each prime divides
......................................
Sum of m1: 563865384
PrimePi(1000000000) = 50847534
Setup took 6.4 seconds
Saving unknowns to '907_210_1_1000_s7554_l1000M.m1.txt'
1 104 <- unknowns -> 123
intervals 1 (1623.41/sec, with setup per m: 8.8) 0 seconds elapsed
unknowns 227 (avg: 227.00), 98.50% composite 45.81 <- % -> 54.19%
large prime remaining: 1220238 (avg/test: 7115)
41 118 <- unknowns -> 113
intervals 10 (677.19/sec, with setup per m: 0.88) 0 seconds elapsed
unknowns 2264 (avg: 226.40), 98.50% composite 50.27 <- % -> 49.73%
large prime remaining: 1180663 (avg/test: 7127)
437 118 <- unknowns -> 112
intervals 100 (621.68/sec, with setup per m: 0.089) 0 seconds elapsed
unknowns 22819 (avg: 228.19), 98.49% composite 49.61 <- % -> 50.39%
large prime remaining: 740948 (avg/test: 7146)
997 120 <- unknowns -> 114
intervals 228 (747.02/sec, with setup per m: 0.04) 0 seconds elapsed
unknowns 52104 (avg: 228.53), 98.49% composite 49.78 <- % -> 50.22%
large prime remaining: 0 (avg/test: 7155)

real 0m6.671s
```
```bash
$ make combined_sieve
$ time ./combined_sieve -p 907 -d 210 --mstart 1 --minc 1000 --max-prime 1000 --save-unknowns
```

`combined_sieve` (Method2) output:
```
Testing m * 907#/210, m = 1 + [0, 1,000)
sieve_length: 2x 7,554
max_prime: 1,000,000,000
coprime m 228/1000, coprime i 1991/7554, ~0MB
coprime wheel 346/7554, ~0MB
907 (primes 155/155) (seconds: 0.00/0.0 | per m: 5.39e-07)
factors 0 (interval: 0 avg m/large_prime interval: 0.0)
unknowns 157,918/228 (avg/m: 692.62) (composite: 95.42% +95.416% +3,286,934)
~ 2x 71.12 PRP/m (~ 360955.6 skipped PRP => 2939330029.3 PRP/seconds)
100,000 (primes 8,363/9593) (seconds: 0.04/0.0 | per m: 0.000194)
factors 82,189 (interval: 35,101 avg m/large_prime interval: 296.4)
unknowns 94,001/228 (avg/m: 412.29) (composite: 97.27% +97.271% +3,350,851)
~ 2x 42.07 PRP/m (~ 374203.7 skipped PRP => 8452672.8 PRP/seconds)
...
1,000,000 (primes 36,960/78499) (seconds: 0.03/0.1 | per m: 0.000559)
factors 110,815 (interval: 8,186 avg m/large_prime interval: 21.1)
unknowns 78,446/228 (avg/m: 344.06) (composite: 97.72% +0.121% +4,167)
~ 2x 35.06 PRP/m (~ 844.5 skipped PRP => 27948.8 PRP/seconds)
...
100,000,000 (primes 544,501/5761456) (seconds: 0.09/1.2 | per m: 0.00511)
factors 156,334 (interval: 907 avg m/large_prime interval: 0.2)
unknowns 58,659/228 (avg/m: 257.28) (composite: 98.30% +0.009% +325)
~ 2x 26.29 PRP/m (~ 69.0 skipped PRP => 785.3 PRP/seconds)
...
999,999,937 (primes 4,838,319/50847535) (seconds: 0.84/8.4 | per m: 0.0368)
factors 174,997 (interval: 812 avg m/large_prime interval: 0.0)
unknowns 52,104/228 (avg/m: 228.53) (composite: 98.49% +0.008% +290)
~ 2x 23.37 PRP/m (~ 54.5 skipped PRP => 64.6 PRP/seconds)
Estimated ~36.4x faster to just run PRP now (CTRL+C to stop sieving)
Saving unknowns to '907_210_1_1000_s7554_l1000M.txt'

real 0m13.558s
```
Verify the outputs are the same with:

```bash
$ md5sum unknowns/907_210_1_1000_s7554_l1000M{.m1,}.txt
d74bd47c8d43eac9a2309fc7722839ac unknowns/907_210_1_1000_s7554_l1000M.m1.txt
d74bd47c8d43eac9a2309fc7722839ac unknowns/907_210_1_1000_s7554_l1000M.txt
$ diff unknowns/907_210_1_1000_s7554_l1000M.m1.txt unknowns/907_210_1_1000_s7554_l1000M.txt
<empty>
```
## gap_stats

```bash
$ make combined_sieve gap_stats
# Use a larger `--sieve-length` for better record probability
$ time ./combined_sieve -p 907 -d 210 --mstart 1 --minc 1000 --max-prime 1000 --save-unknowns --sieve-length 15000 -qq
```

`gap_stats` output:
```
$ time ./combined_sieve -qq --save -u 907_210_1_1000_s15000_l1000M.txt
Testing m * 907#/210, m = 1 + [0, 1,000)
999,999,937 (primes 50,847,535/50847535) (seconds: 8.36/8.4 | per m: 0.0367)
factors 412,684 (interval: 412,684 avg m/large_prime interval: 0.4)
unknowns 123,678/228 (avg/m: 542.45) (composite: 98.19% +98.192% +6,716,550)
~ 2x 23.37 PRP/m (~ 382730.2 skipped PRP => 45793.8 PRP/seconds)
Saving unknowns to 'unknowns/907_210_1_1000_s15000_l1000M.txt'

$ time ./gap_stats --save -u 907_210_1_1000_s15000_l1000M.txt
Reading from '907_210_1_1000_s15000_l1000M.txt'
K = 1244 bits, 375 digits, log(K) = 861.69
Min Gap ~= 15511 (for merit > 18.0)
Found 3484 possible record gaps (20818 to 30158)
If found Gap: 20818 (current: 23.66) would improve to 24.159
If found Gap: 20822 (current: 24.07) would improve to 24.164
If found Gap: 21164 (current: 24.47) would improve to 24.561
prob prime : 0.0011592
prob prime coprime : 0.0140599
prob prime after sieve : 0.04278
Extended size: 27574 (32.0 merit)
Using Wheel: 210 for extended probs
Average 817 inner coprimes => 0.000898% prob_greater
Average 939 extended coprimes => 0.000161% prob_greater
Extended prob records setup (0.11 seconds)
...
228 m's processed in 0.03 seconds (7475.76/sec)
RECORD : top 1% ( 2) => sum(prob) = 3.48e-06 (avg: 1.74e-06)
RECORD : top 5% ( 11) => sum(prob) = 1.35e-05 (avg: 1.23e-06)
RECORD : top 10% ( 22) => sum(prob) = 2.29e-05 (avg: 1.04e-06)
RECORD : top 20% ( 45) => sum(prob) = 3.85e-05 (avg: 8.57e-07)
RECORD : top 50% ( 114) => sum(prob) = 6.96e-05 (avg: 6.10e-07)
RECORD : top 100% ( 228) => sum(prob) = 9.47e-05 (avg: 4.15e-07)
Saved 12339 rows to 'gap_stats' table
Saved 228 rows to 'm_stats' table

real 0m0.122s
```
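Those headline probabilities follow from Mertens' theorem. A quick sketch, using the log(K) = 861.69 and max-prime = 1000M printed above (the values are approximate, since the suite uses log(m * K) per m and exact sieve products):

```python
import math

log_n = 861.69                    # ~ log(m * 907#/210) from gap_stats above

# "prob prime": chance a random integer near N is prime, ~ 1 / log(N).
print(1 / log_n)                  # ~0.00116

# "prob prime after sieve": survivors of sieving to max_prime = 1e9 are
# prime with probability ~ (1/log N) / prod_{p < 1e9} (1 - 1/p)
# ~= exp(gamma) * log(max_prime) / log(N)   (Mertens' third theorem).
gamma = 0.5772156649015329        # Euler-Mascheroni constant
print(math.exp(gamma) * math.log(1e9) / log_n)   # ~0.0428
```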
## gap_test_simple / gap_test.py

Quick note: `python gap_test.py ...` and `./gap_test_simple` should be roughly equivalent. `gap_test.py` additionally supports `--stats` to SLOWLY double check `./gap_stats`, and `--plots` to print fun plots.
```bash
$ make gap_test_simple
$ time ./gap_test_simple --unknown-filename 907_210_1_1000_s15000_l1000M.txt -q
```

`gap_test_simple` output:
```
Testing m * 907#/210, m = 1 + [0, 1,000)
Min Gap ~= 10340 (for merit > 12.0)
Reading unknowns from '907_210_1_1000_s15000_l1000M.txt'
228 tests M_start(1) + mi(0 to 996)
m=1 244 <- unknowns -> 282 1050 <- gap -> 600
tests 1 (49.43/sec) 0 seconds elapsed
unknowns 526 (avg: 526.00), 98.25% composite 46.39% <- % -> 53.61%
prp tests 39 (avg: 39.00) (1927.7 tests/sec)
best merit this interval: 1.915 (at m=1)
m=41 282 <- unknowns -> 278 336 <- gap -> 1176
tests 10 (40.57/sec) 0 seconds elapsed
unknowns 5364 (avg: 536.40), 98.21% composite 49.53% <- % -> 50.47%
prp tests 537 (avg: 53.70) (2178.5 tests/sec)
best merit this interval: 9.038 (at m=29)
10920 12.6002 143 * 907#/210 -6796 to +4124
12164 14.0190 397 * 907#/210 -9644 to +2520
m=437 272 <- unknowns -> 279 1954 <- gap -> 6
tests 100 (42.54/sec) 2 seconds elapsed
unknowns 54318 (avg: 543.18), 98.19% composite 49.75% <- % -> 50.25%
prp tests 5089 (avg: 50.89) (2165.0 tests/sec)
best merit this interval: 14.019 (at m=397)
10644 12.2646 479 * 907#/210 -8038 to +2606
m=997 257 <- unknowns -> 278 2582 <- gap -> 250
tests 228 (45.33/sec) 5 seconds elapsed
unknowns 123678 (avg: 542.45), 98.19% composite 49.95% <- % -> 50.05%
prp tests 10740 (avg: 47.11) (2135.5 tests/sec)
best merit this interval: 12.265 (at m=479)

real 0m5.275s
```
Or, slightly quieter:

```bash
$ time ./gap_test_simple --unknown-filename 907_210_1_1000_s15000_l1000M.txt -qq
Testing m * 907#/210, m = 1 + [0, 1,000)
10920 12.6002 143 * 907#/210 -6796 to +4124
12164 14.0190 397 * 907#/210 -9644 to +2520
10644 12.2646 479 * 907#/210 -8038 to +2606
m=997 257 <- unknowns -> 278 2582 <- gap -> 250
tests 228 (44.98/sec) 5 seconds elapsed
unknowns 123678 (avg: 542.45), 98.19% composite 49.95% <- % -> 50.05%
prp tests 10740 (avg: 47.11) (2118.6 tests/sec)
```
80%+ of the time is spent in `modulo_search_euclid`, so it is important to optimize this function. See BENCHMARKS.md for more details.
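For reference, this is the problem `modulo_search_euclid` solves, written as a brute-force equivalent of the `modulo_search_brute` entry in the benchmark below (the argument names are illustrative; the real function reaches the answer with a Euclid-style descent instead of this O(p) scan):

```python
def modulo_search_brute(p, a, base, low, high):
    """Smallest i >= 0 with low <= (base + i*a) mod p <= high.

    In the sieve, a = K mod p and [low, high] encodes the +-sieve-length
    window, so i is how many m's to skip before p divides an interval.
    """
    r = base % p
    a %= p
    for i in range(p):
        if low <= r <= high:
            return i
        r += a
        if r >= p:
            r -= p
    return None  # unreachable when gcd(a, p) == 1 and low <= high
```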
`benchmark` output:
```bash
$ make benchmark
$ ./benchmark 100000
...
| bits x count | method_name | found | total | time(s) | ns/iter | cycles/limb |
| 694 x 100000 | 503# mod <40 bit>p | 100000 | 6088 | 0.0045 | 45 | 14.0 |
| 1390 x 100000 | 1009# mod <40 bit>p | 100000 | 3292 | 0.0061 | 61 | 9.3 |
| 2799 x 100000 | 1999# mod <40 bit>p | 100000 | 8627 | 0.0086 | 86 | 6.6 |
| 7099 x 100000 | 5003# mod <40 bit>p | 100000 | 3010 | 0.0146 | 146 | 4.5 |
| 14291 x 100000 | 10007# mod <40 bit>p | 100000 | 2119 | 0.0258 | 258 | 3.9 |
| 28588 x 100000 | 20011# mod <40 bit>p | 100000 | 2854 | 0.0477 | 477 | 3.6 |

$ ./benchmark 100000 modulo_search
...
| bits x count | method_name | found | total | time(s) | ns/iter | cycles/iter |
| 25 x 100000 | single_mod_op | 1033 | 100000 | 0.0020 | 20 | 68.8 |
| 25 x 100000 | modulo_search_brute | 100000 | 100000 | 0.0316 | 316 | 1071.8 |
| 25 x 100000 | modulo_search_euclid_small | 100000 | 100000 | 0.0074 | 74 | 249.9 |
| 25 x 100000 | modulo_search_verify | 100000 | 100000 | 0.0253 | 253 | 857.9 |
| 25 x 100000 | modulo_search_euclid | 100000 | 100000 | 0.0084 | 84 | 283.4 |
| 25 x 100000 | modulo_search_euclid_gcd | 100000 | 100000 | 0.0114 | 114 | 386.1 |
| 25 x 100000 | modulo_search_euclid_gcd2 | 100000 | 100000 | 0.0110 | 110 | 373.2 |
| 25 x 100000 | modulo_search_euclid_all_small | 86763 | 292490 | 0.0277 | 95 | 321.5 |
| 25 x 100000 | modulo_search_euclid_all_large | 86763 | 292490 | 0.0295 | 101 | 342.5 |
...

$ ./benchmark 100000 _all
| bits x count | method_name | found | total | time(s) | ns/iter | cycles/iter |
|-----------------|--------------------------------|----------|----------|---------|---------|-------------|
| 25 x 100000 | modulo_search_euclid_all_small | 86755 | 292525 | 0.0282 | 96 | 327.1 |
| 25 x 100000 | modulo_search_euclid_all_large | 86755 | 292525 | 0.0308 | 105 | 357.5 |
| 30 x 100000 | modulo_search_euclid_all_small | 18777 | 129541 | 0.0211 | 163 | 552.7 |
| 30 x 100000 | modulo_search_euclid_all_large | 18777 | 129541 | 0.0217 | 168 | 569.0 |
| 31 x 100000 | modulo_search_euclid_all_small | 9755 | 115239 | 0.0206 | 179 | 606.3 |
| 31 x 100000 | modulo_search_euclid_all_large | 9755 | 115239 | 0.0204 | 177 | 601.4 |
| 32 x 100000 | modulo_search_euclid_all_small | 4831 | 107701 | 0.0217 | 202 | 684.2 |
| 32 x 100000 | modulo_search_euclid_all_large | 4831 | 107701 | 0.0222 | 206 | 698.6 |
| 35 x 100000 | modulo_search_euclid_all_small | 571 | 100811 | 0.0233 | 231 | 784.5 |
| 35 x 100000 | modulo_search_euclid_all_large | 571 | 100811 | 0.0238 | 236 | 799.3 |
| 40 x 100000 | modulo_search_euclid_all_small | 18 | 100021 | 0.0240 | 240 | 813.5 |
| 40 x 100000 | modulo_search_euclid_all_large | 18 | 100021 | 0.0242 | 242 | 821.5 |
```
## Tools

- `combined_sieve`
  - Takes a set of parameters (`[m_start, m_start + m_inc)`, `P#`, `d`, `sieve-length`, `max-prime`)
    - Use `combined_sieve --mstart 1 --minc 100 --max-prime 1 -d 4 -p <P>` for some insight on the choice of `d`
  - Performs the combined sieve algorithm
  - Produces `<PARAMS>.txt`, a list of all numbers near `m * P#/d` with no small factors
    - each row is `m: -count_low, +count_high | -18 -50 -80 ... | +6 +10 +12 ...` (see the parsing sketch after this list)
  - Writes an initial `range` entry (with `time_sieve`) into the `range` table
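For illustration, a sketch of reading one such row (assuming the plain format above; with run-length encoding the offset lists are packed differently):

```python
def parse_unknown_row(line):
    """Parse 'm: -c_low, +c_high | -18 -50 ... | +6 +10 ...' into offsets."""
    header, lows, highs = line.split("|")
    m = int(header.split(":")[0])
    low_offsets = [int(x) for x in lows.split()]    # negative side, e.g. -18
    high_offsets = [int(x) for x in highs.split()]  # positive side, e.g. +6
    return m, low_offsets, high_offsets

m, lows, highs = parse_unknown_row("11: -3, +3 | -18 -50 -80 | +6 +10 +12")
assert m == 11 and lows[0] == -18 and highs[0] == 6
```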
- `gap_stats`
  - Calculates some statistics over each range in the interval
    - Expected gap in both directions
    - Probability of being a record gap, of being a missing gap, of being a gap > `--min-merit`
  - Stores statistics in the `m_stats` table
  - Computes prob(gap) over all m and saves it in the `range_stats` table
- `gap_test_simple` / `gap_test.py` `--unknown-filename <PARAM>.txt`
  - Runs PRP tests on the most likely m's from `<PARAM>.txt` (a simplified sketch of the per-m test follows this list)
  - Stores results (as they are calculated) in `result` & `m_stats`
  - Allows for saving of progress and graceful restart
  - `gap_test.py` can produce graphs of the distribution (via `--plots`)
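The per-m test the testers perform is roughly the following, a simplified sketch using gmpy2 (the function and its signature are illustrative; the real testers add side skipping, one-sided tests, DB writes, etc.):

```python
import gmpy2

def test_one_m(center, low_offsets, high_offsets):
    # Offsets are ordered by distance from the center, so the first PRP
    # on each side gives the previous / next (probable) prime.
    prev_p = next((-o for o in low_offsets if gmpy2.is_prime(center + o)), None)
    next_p = next((o for o in high_offsets if gmpy2.is_prime(center + o)), None)
    if prev_p is None or next_p is None:
        return None                      # gap runs past the sieve length
    gap = prev_p + next_p
    return gap, gap / float(gmpy2.log(center))   # (gap, merit)
```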
Often I run `combined_sieve` and `gap_stats` for a BUNCH of ranges that I never end up testing.

- `misc/show_ranges.sh` updates and shows ranges from `prime-gap-search.db`.
- `misc/drop_range.sh` deletes ranges when none of them have ever been tested.
- `misc/record_check.py` searches through `prime-gap-search.db` for records (over `gaps.db`).
- `misc/finalize.py -p <P> -d <D>` dumps `m_stats` to `<UNKNOWN>.csv` and `<UNKNOWN>.stats.txt` files. It also DELETES DATA from `prime-gap-search.db`; use it when you are done searching that `P#/d`.
- `misc/run.sh <UNKNOWN_FILENAME>` runs `combined_sieve`, `gap_stats`, and prints the `gap_test.py` command.
- `misc/status.py` checks if a filename has been processed 100% (and can be deleted).
See BENCHMARKS.md for benchmarks. Theory and justification for some of the calculations is present in THEORY.md.
## Setup

In general this is going to be easy under Ubuntu 20.04, as you need Python 3.7 and sqlite3 >= 3.24, both of which are harder to install on previous versions of Ubuntu.
```bash
#$ sudo apt install libgmp10 libgmp-dev
#$ sudo apt install mercurial build-essential automake autoconf bison make libtool texinfo m4

# Building gmp from source is required until 6.3
$ hg clone https://gmplib.org/repo/gmp/
$ cd gmp
$ ./.bootstrap
$ mkdir build
$ cd build
$ ../configure
$ make
$ make check
$ make install

# Need sqlite >= 3.24 for upsert "on conflict" statement.
$ sudo apt install sqlite3
$ sudo apt install libmpfr-dev libmpc-dev libomp-dev libsqlite3-dev libbenchmark-dev

# Make sure this is >= python3.7 and not python2
$ python -m pip install --user gmpy2==2.1.0b5 primegapverify

# For misc/double_check.py
$ sudo apt install gmp-ecm
$ sudo apt install sympy

$ git clone https://github.com/sethtroisi/prime-gap.git
$ cd prime-gap
$ mkdir -p logs/ unknowns/
$ sqlite3 prime-gap-search.db < schema.sql

# There are a handful of constants (MODULE_SEARCH_SECS, PRIME_RANGE_SEC, ...)
# in gap_common.cpp `combined_sieve_method2_time_estimate` that can be set for
# more accurate `combined_sieve` timing.
```
- prime-gap-list

```bash
$ git clone --depth 10 https://github.com/primegap-list-project/prime-gap-list.git
$ sqlite3 gaps.db < prime-gap-list/allgaps.sql
```
## TODO

- Flow
- README.md
- THEORY.md
- Project
  - `--update` flag for `misc/show_ranges.sh`
  - Finalize field
    - Sum and set test-time in finalize? (IGNORE very large values which are likely pauses)
    - Finalize script should update ranges being finalized
    - Faster finalize.py
  - `--save-best-only` and `processed_range` table
  - Can modulo_search (equation 2/3) be modified so that it only returns odd multiples?
  - Check that the correct indexes exist in the sql db
  - Consider new names for prp-top-percent, no-one-side-skip, sieve-length
  - Option to only test the top X percent of intervals after small / medium processing
  - Can I sieve only the low side? and sieve the high side on demand or in a 2nd pass (after gap_test)?
- combined_sieve.cpp
  - Better sharing of work for medium_primes (or skip unknown counting)
    - Unknown counting could be faster with boost or `bit_vector`
  - ncurses or similar status line with all large_prime intervals & ETA
    - Idea that made sense was to printf then reset position to the start of the status (so that future prints override it)
  - Error with error estimate, looking at only the final interval vs all intervals?
    - Estimated 3.74e+09 unknowns, found 4.11e+10 (999.99% error)
    - Estimated 1.32e+07 new composites, found 3.33e+11 (2518520.50% error)
- gap_stats.cpp
- gap_test_simple
  - Do in process for very small P
  - Record to DB what range was tested
- gap_test_gpu
  - Auto optimize BATCH_SIZE, TPI, ...
  - Autolog to file
- gap_test.py
  - Save prp-percentage finalized for faster skipping of complete files
  - Validate that the unknown-file matches expectations (unknowns / line, ...)
  - Ctrl+C sometimes hangs waiting for one more result
  - Slow for P# < 977; file read? db?
- gap_test_simple.cpp
- schema.sql
- benchmarking
  - Add instructions to verify `modulo_search` is >80% of the time.
  - Try out multi-threaded combined_sieve
- Project
  - `--rle` could be 1/2 the size if I used the coprime_x index instead of x
  - Figure out how to load (in c & python) and set config occasionally
  - Records / day in status.py or record_check.py
  - Test if sum(prob_record) matches with --no-one-side-skip
- THEORY.md
- New script to update DB (and maybe delete uninteresting results)
  - Set rid if null (but a reasonable range exists)
  - Check if range_stats needs cleanup after deleting a range?
- combined_sieve.cpp
  - Record final PRP/thread-second rate
  - Improve optimize D helper
  - Document D helper
  - Benchmark GMP_VALIDATE_LARGE_FACTORS
- gap_stats.cpp
  - Produce P(record) / day / core estimate
  - Check if higher prob is related to unique (mi % d)
- gap_test.py
  - Test least dense side (or portion) first (see MF Prime Gap Theory #125)
  - MEGAGAPS (what's remaining here?)
  - Always test next_p = -1 results (regardless of prp-top-percent)
  - Clean up fallback prev_prime
  - Do PRP top X% in order (for more dynamic cutoff)
  - Plot average tests count
- gap_common
  - Load time estimate values from config file
  - Compute smaller PRP and use that to compute larger (slower) PRP estimate
- benchmarking
  - Re-benchmark `combined_sieve`.
## Done

Tracker for finished TODO items:

- README.md
  - record_check.py guide to use
  - Clarify gap_test.py vs gap_test.cpp
  - Add Flow section based on comments in schema.sql
  - Add commands for benchmarking
  - Fill out gap test section
  - Split out some benchmarking
- Project level
  - Convert unknowns to start with m, not mi
  - Change min-merit to 15, 18, or 20
  - Run length encoding to reduce filesize
  - Move UNKNOWN.txt to unknowns/
  - Support --search-db everywhere
  - Change `ms_P_D_minc` to `P_D_ms_minc_`
  - Make =SL included in sieve (e.g. change < SL to <= SL)
  - Rename `gap_search` to `combined_sieve`
  - Rename prime-gap.db to prime-gap-search.db
  - Make method2 the default
  - config.verbose in gap_search, gap_stats, gap_test
  - Make sure that next_p = 0, = -1, is handled correctly in places.
- combined_sieve.cpp
  - `M_inc * max_prime > 2^64` use `modulo_search_euclid_all_large`
  - Improve error messages (gap_stats missing, SL size)
  - Calculating coprime [L, R] * K^-1 mod p for medium p
  - Wheel for coprime i (saves 50-80% of space)
  - Add some theory for only doing a one side test.
  - Optimize D helper
  - Integrate faster prime iterator (libprimesieve)
  - ETA for `combined_sieve` timing.
  - Test if a number to be marked composite is already marked by 2 / 3 and skip (e.g. try to avoid the memory lookup)
  - "expect %ld left" doesn't match
  - Write time_sieve into DB
  - Method2 (all composites in memory)
    - `vector<char>` doesn't show improvement over `vector<bool>`
    - Verify skipped PRPs by testing at X and 2*X
    - Estimate PRP/s and include in status.
    - Ctrl-C to quit early (but write out the results at that point).
    - Look at Method1 vs Method2 writeup and understand why outputs seem different
    - Use the reindex_m trick to reduce the size of composites
    - Do all primes for small ms (to get better memory access patterns)
    - Check GCD with P# to avoid writing to main memory. (didn't work)
    - Printing at each 1B primes tells you PRP tests saved / time taken.
  - Write up of Method1 vs Method2 (from memory M2 is faster but uses more memory)
  - Method1
    - Store remainder and prime in the same array
    - Don't store pi for large primes (just pass around the pair)
    - Verify `sieve_length` math with d > 1
      - Calculate `sieve_length` for all (m % d)
    - Don't save to large_prime_queue[next_mi] with gcd(next_mi, d) > 1
    - Only store prime/remainder for primes that divide ANY mi.
    - `max_prime` > 4B
    - Dynamic `sieve_length`
    - Dynamic `max_prime`
- gap_stats.cpp
  - Accumulate missing / skipped gap_prob at 0
  - P(record) is missing double extended
  - Benchmark reindex_m_wheel @ 6, 30, 210
  - Print (1 - seen) / prob as "unknown"
  - Wheel for extended range
  - Write all data that `gap_test.py` consumes to the DB
  - Tweak logging at different verbose levels
  - Move missing gaps behind a compile flag
  - Drop directory from `unknown-filename`
  - Missing gaps prob % filter
  - Don't rewrite `partial_results`
  - Calc record chance
  - Save expected/prob record to sql
  - Load merit from gaps.db
  - Load from unknown_fn
- gap_test.py
  - Benchmark not calling commit() on every saved result
  - Load min-merit from db
  - Read min-merit from DB in gap_test.py
  - Reduce memory if not --stats / --num-plots
  - First Ctrl+C stops new results, 2nd breaks.
  - Use primegapverify.sieve for better prev_prime
  - Dynamic one sided test threshold and logging.
  - Extended gap for one sided tests?
  - `--top-x-percent` (see THEORY.md)
  - Correlation between Expected Gap & Gap
  - Show P(record) / day
  - Multi-threaded
  - Update `m_stats` (with `test_time`)
  - Plot P(gap > min_merit) and P(record), sorted and unsorted.
  - Option to toggle between OpenPFGW and gmp (see note in benchmarking below)
  - Store all results to sql
  - Autoscale printing to every X seconds
  - Describe distribution
  - Generate expected length
- gap_test_gpu
  - Auto select BITS, TPI
  - prev side only (side skip)
  - Checkpoint/resume logic
- benchmarking
  - Experimentally compute `prime_time_estimate`
    - Redo prime-time with user time (gmp better)
    - There's serious overhead using subprocess, and I see 150% CPU usage.
    - I tried passing via stdin (pfgw supports `--` but quits after one) but couldn't make it work.
    - MF suggests you can use the gwnum library and hack some API, but it seems like a lot of work for ~20-200% gain on reasonable primes.
  - Add int64 `modulo_search_euclid_all`
  - Add benchmarks for int32/int64 `modulo_search`
  - Add benchmarks for `K % p`
- prime-gap-search.db
  - A plan for how to clean up [partially] finished ranges
- misc/
  - tests.sh
    - misc/tests.sh breaks because records improve (so chance goes down); use a fixed gaps.db
  - Finalize script
    - Dump to csv, dump stats
  - show_ranges.sh
    - Update range.num_processed / num_to_process
    - Verify every table.m_stat result is in table.result
  - record_check.py
    - Read from sql db
  - double_check.py
    - Run ecm on random unknowns and verify factors found > sieve limit