Easier Way to Use Custom LD? #204

ttbek · 2023-01-02T10:15:28Z

If I'm reading the code correctly, in order to use my own LD I would need to actually change the source passed to locuszoom.js, find the calls used while populating in locuszoom.js, and then set up our own restful API server to give the json response?

Does the following match the call format used in locuszoom.js? That is, does it use the region call like that?
https://portaldev.sph.umich.edu/ld/genome_builds/GRCh37/references/1000G/populations/AFR/regions?correlation=r&chrom=X&start=67544032&stop=67544350
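For anyone who wants to experiment with that URL shape, here is a minimal Python sketch that assembles the same region query from its parts. The function name `region_ld_url` and its parameters are hypothetical and only mirror the path segments and query keys visible in the URL above. It does not confirm what locuszoom.js itself sends.

```python
from urllib.parse import urlencode


def region_ld_url(base, build, reference, population, chrom, start, stop, correlation="r"):
    """Assemble a region LD query URL in the shape of the example URL above."""
    # Path segments follow the example: genome build, reference panel, population.
    path = f"/genome_builds/{build}/references/{reference}/populations/{population}/regions"
    # Query keys are the ones visible in the example URL, in the same order.
    query = urlencode({"correlation": correlation, "chrom": chrom, "start": start, "stop": stop})
    return f"{base.rstrip('/')}{path}?{query}"


url = region_ld_url(
    "https://portaldev.sph.umich.edu/ld", "GRCh37", "1000G", "AFR", "X", 67544032, 67544350
)
print(url)
```

Pointing `base` at a different server would produce the equivalent query for a self-hosted API, assuming it exposes the same routes.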

Is there an easier way to load from a local precalculated LD file?

Maybe customize the locuszoom.js (means loading the changed version from our server instead of the currently set source) by changing their populate function?

I'm a bit unsure what the best approach would be here. Preferably we would also still be able to load the 1000 genomes LD values but we want to also display some custom ones.

abought · 2023-01-02T14:50:39Z

Thanks for your question.

First, the fancy answer to your question: if you're comfortable running the infrastructure, then the code to calculate custom LD is open source and you can run a compatible API locally: https://github.com/statgen/LDServer/

And you can experiment with API query syntax for that LD server here: https://portaldev.sph.umich.edu/playground

If you only want to use precalculated LD from a file: it is possible to load from pre-calculated LD, but it's kind of unwieldy. The main issue is that pre-calculated LD files can be extremely large to store and query; you can try to reduce the size by only including LD relative to a preset list of lead variants... but for phewas-scale data, you might have a lot of lead variants (=bigger files).

The newest version of Locuszoom.js includes some features designed to help read such LD files (it's what we use for LocalZoom & my.locuszoom.org). But PheWeb uses an older version of Locuszoom (see a rough draft of what PheWeb code would need to change (#185) to work with LocusZoom.js 0.14.0, which added the "use LD from local file" helper code). There's no strong technical reason why PheWeb isn't updated for LZ.js 14, except that the original developer of PheWeb moved on and the task got lost in the shuffle.

Some LZ.js demos show how you would modify the plot creation code to specify LD from a local file. We use tabix to make the queries (slightly) more manageable, but custom LD can still be a very big file. PLINK can be kind of slow calculating that much LD the first time, but the demo shows expected file format so you can substitute other tools of your choosing.

https://statgen.github.io/locuszoom/examples/ext/tabix_tracks.html
https://github.com/statgen/locuszoom/blob/develop/examples/ext/tabix_tracks.html#L203-L207

Anyway, I hope this helps! We always wanted to have more LD options, but in a reusable internet tool, there aren't a lot of good LD panels that people are allowed to share publicly. Every now and then I ask, in hopes that something has changed. :)
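To illustrate the "only keep LD relative to a preset list of lead variants" idea, here is a rough Python sketch. It assumes the input is whitespace-delimited pairwise LD text with the header PLINK 1.9 writes for `--r2` (`CHR_A BP_A SNP_A CHR_B BP_B SNP_B R2`). The function name `keep_lead_pairs`, the variant IDs, and the values are invented placeholders. It does not produce the file format the LocusZoom.js demo expects, so check the demo for that.

```python
def keep_lead_pairs(plink_ld_text, lead_variants):
    """Return the header plus only those rows where either variant is a lead variant."""
    lines = plink_ld_text.strip().splitlines()
    header = lines[0].split()
    # Locate the variant ID columns by name rather than by position.
    a_idx, b_idx = header.index("SNP_A"), header.index("SNP_B")
    leads = set(lead_variants)
    kept = [header]
    for line in lines[1:]:
        fields = line.split()
        if fields[a_idx] in leads or fields[b_idx] in leads:
            kept.append(fields)
    return kept


# Toy input: made-up variants and r2 values, for illustration only.
example = """
CHR_A BP_A SNP_A CHR_B BP_B SNP_B R2
22 100 rs1 22 200 rs2 0.9
22 100 rs1 22 300 rs3 0.4
22 200 rs2 22 300 rs3 0.7
"""
rows = keep_lead_pairs(example, ["rs1"])
for row in rows:
    print("\t".join(row))
```

The row pairing rs2 with rs3 is dropped because neither variant is in the lead list, which is where the size reduction comes from. With many lead variants the savings shrink accordingly.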


-Andy Boughton ***@***.*** Applications Programmer/Analyst, Lead Center for Statistical Genetics University of Michigan


ttbek · 2023-02-27T16:27:41Z

Sorry for taking so long to get back to this. I'm attempting to set up the server and I'm getting some output from the Raremetal one, but the LD server is just giving: "The connection was reset"

Output is looking like this:

sudo docker-compose down && sudo docker-compose up
Removing ldserver_ldserver_1 ... done
Removing ldserver_raremetal_1 ... done
Removing ldserver_redis_1 ... done
Creating ldserver_redis_1 ... done
Creating network "ldserver_default" with the default driver
Creating ldserver_redis_1 ...
Creating ldserver_ldserver_1 ... done
Creating ldserver_raremetal_1 ... done
Attaching to ldserver_redis_1, ldserver_raremetal_1, ldserver_ldserver_1
ldserver_1 | Running startup flask add commands...
raremetal_1 | [2023-02-27 15:42:11,078] INFO in model: Added genotype file: var/ALL.chr22.phase3_shapeit2_mvncall_integrated_v5.20130502.genotypes.bcf
raremetal_1 | [2023-02-27 15:42:11 +0000] [1] [INFO] Starting gunicorn 20.1.0
raremetal_1 | [2023-02-27 15:42:11 +0000] [1] [INFO] Listening at: http://0.0.0.0:4545 (1)
raremetal_1 | [2023-02-27 15:42:11 +0000] [1] [INFO] Using worker: gthread
raremetal_1 | [2023-02-27 15:42:11 +0000] [12] [INFO] Booting worker with pid: 12
raremetal_1 | [2023-02-27 15:42:12 +0000] [13] [INFO] Booting worker with pid: 13

So the only thing coming from the LD container is that it is running the startup commands. I'm not seeing anything problematic in the gunicorn or Redis logs (I think, at least). Do we expect the startup commands to take a long time? That is, am I just being too impatient and I'll probably get different output when it is ready, or is there probably a problem?

ttbek · 2023-02-28T09:16:08Z

Is gunicorn supposed to be running on 8000? The .env I have says 4546 and docker is mapping 4546, but the gunicorn log has this:

[2023-02-27 15:42:24 +0000] [1] [INFO] Listening at: http://0.0.0.0:8000 (1)
[2023-02-27 15:42:24 +0000] [1] [INFO] Using worker: gevent
[2023-02-27 15:42:24 +0000] [32] [INFO] Booting worker with pid: 32
[2023-02-27 15:42:24 +0000] [33] [INFO] Booting worker with pid: 33

My .env file:

LDSERVER_PORT=4546
LDSERVER_CONFIG_SCRIPT=/home/ldserver/startup.sh
LDSERVER_WORKERS=2
RAREMETAL_CONFIG_DATA=var/config.yaml
RAREMETAL_WORKERS=2
RAREMETAL_PORT=4545
OMP_NUM_THREADS=2
OPENBLAS_NUM_THREADS=2
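A small Python sketch of the check being done by eye here: pull the configured port out of the .env text and the actual port out of the gunicorn "Listening at" line, then compare. The helper names `env_port` and `listening_port` are hypothetical. For context, gunicorn falls back to port 8000 when the bind address carries no port, so a mismatch like this one usually points at the `-b` argument rather than at the .env file.

```python
import re


def env_port(env_text, key):
    """Return the integer value of KEY=value from .env-style text, or None if absent."""
    for line in env_text.splitlines():
        name, sep, value = line.strip().partition("=")
        if sep and name == key:
            return int(value)
    return None


def listening_port(log_text):
    """Return the port from the first gunicorn 'Listening at' line, or None if absent."""
    match = re.search(r"Listening at: https?://[^\s:]+:(\d+)", log_text)
    return int(match.group(1)) if match else None


# Values copied from the .env file and gunicorn log quoted above.
env_text = "LDSERVER_PORT=4546\nRAREMETAL_PORT=4545\n"
log_text = "[2023-02-27 15:42:24 +0000] [1] [INFO] Listening at: http://0.0.0.0:8000 (1)"

expected = env_port(env_text, "LDSERVER_PORT")
actual = listening_port(log_text)
print(expected, actual, expected == actual)
```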

ttbek · 2023-02-28T11:11:22Z

Ah... the example docker override file doesn't put the port in the bind address, even though it shows modifying the command, so changing from this:

  gunicorn -b 0.0.0.0 -w $$LDSERVER_WORKERS -k gevent \
    --access-logfile /data/logs/gunicorn.access.log \
    --error-logfile /data/logs/gunicorn.error.log \
    --pythonpath rest 'ldserver:create_app()'"

to this:

  gunicorn -b 0.0.0.0:4546 -w $$LDSERVER_WORKERS -k gevent \
    --access-logfile /data/logs/gunicorn.access.log \
    --error-logfile /data/logs/gunicorn.error.log \
    --pythonpath rest 'ldserver:create_app()'"

allows me to reach the endpoints http://localhost:8084/correlations (8084 is the locally mapped, SSH-forwarded port; it is localhost:4546 on the server side) and http://localhost:8084/genome_builds with the expected results. However, something like http://localhost:8084/genome_builds/GRCh37/references/1000G/populations/AFR/regions?correlation=r&chrom=20&start=60343&stop=65000 gives me "Internal Server Error", and the gunicorn log on the server shows:

[2023-02-28 10:07:15,376] ERROR in app: Exception on /genome_builds/GRCh37/references/1000G/populations/AFR/regions [GET]
Traceback (most recent call last):
  File "/home/ldserver/.local/lib/python3.8/site-packages/flask/app.py", line 2447, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/ldserver/.local/lib/python3.8/site-packages/flask/app.py", line 1952, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/home/ldserver/.local/lib/python3.8/site-packages/flask/app.py", line 1821, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/home/ldserver/.local/lib/python3.8/site-packages/flask/_compat.py", line 39, in reraise
    raise value
  File "/home/ldserver/.local/lib/python3.8/site-packages/flask/app.py", line 1950, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/ldserver/.local/lib/python3.8/site-packages/flask/app.py", line 1936, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/home/ldserver/rest/ldserver/api.py", line 165, in get_region_ld
    ldserver.compute_region_ld(str(args['chrom']), args['start'], args['stop'], correlation_type(args['correlation']), result, str(population_name))
RuntimeError: Error while reading a cell from Redis cache

The Redis log shows:

1:C 28 Feb 2023 10:19:52.343 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1:C 28 Feb 2023 10:19:52.343 # Redis version=5.0.14, bits=64, commit=00000000, modified=0, pid=1, just started
1:C 28 Feb 2023 10:19:52.343 # Configuration loaded
1:M 28 Feb 2023 10:19:52.346 * Running mode=standalone, port=6379.
1:M 28 Feb 2023 10:19:52.346 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
1:M 28 Feb 2023 10:19:52.346 # Server initialized
1:M 28 Feb 2023 10:19:52.347 * DB loaded from disk: 0.000 seconds
1:M 28 Feb 2023 10:19:52.347 * Ready to accept connections

I did take a shot at fixing that warning earlier and currently if I:

cat /proc/sys/net/core/somaxconn
512

Well, it looks fixed there. Maybe this parameter needs to be done inside the container? I thought it was a kernel parameter and would be outside though. It was originally 128 as the message suggests, but it has been changed to 512 and Redis has been restarted several times since then. I don't think that would be the issue, but it's all the Redis log is complaining about.

ttbek · 2023-02-28T11:49:47Z

Turns out the container may be more restricted than the host's kernel value, but it can be set in the docker-compose file, so I added:

sysctls:
  net.core.somaxconn: 512

to the section for the alpine Redis image. It fixes that last warning in the Redis log... but no dice on the error in the gunicorn log; I still get it.

ttbek · 2023-02-28T12:39:48Z

Ah, ok, ok,

Warning: For docker, you must change CACHE_REDIS_HOSTNAME to redis.

For some reason I read this as changing the left-hand side, that is, replacing the text 'CACHE_REDIS_HOSTNAME' with the text 'redis', rather than setting its value to 'redis'. I know, makes no sense. Fixed now. So I guess the next step is that I need to move my PheWeb over to the production server and... point it to this new LD server somehow.
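In case it helps anyone else hitting "Error while reading a cell from Redis cache", a quick connectivity check along these lines can show whether the configured hostname is reachable at all. This is only a sketch: `can_connect` is a hypothetical helper, and it tests for a TCP connection rather than anything Redis-specific. The hostname `redis` and port 6379 come from the compose service name and the Redis log above.

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and hostname resolution failures.
        return False
```

Run from inside the ldserver container, `can_connect("redis", 6379)` should return True when the compose network is set up as expected. A loopback address would normally point back at the ldserver container itself rather than at the Redis one.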

Regarding available LD panels: true indeed. Unfortunately we aren't putting out a new data set with this. The population in our PheWeb is Arab and not well represented in 1KG, so even though it is small we wanted to also show the LD from the 108 Qatari genomes published here: https://genome.cshlp.org/content/26/2/151.full.html They're publicly available on the Sequence Read Archive; you just need to use the toolkit to download them due to the size.
