Tile-Low Rank Multi-Dimensional Convolution: fast MDC modelling and inversion for seismic applications
Tile-Low Rank Multi-Dimensional Convolution (TLR-MDC) represents one of the most expensive operators in seismic processing. This is especially the case for large 3D seismic datasets, as it requires access to the entire data in the form of a 3D tensor (freq-sources-recs) for the application of a batched complex-valued matrix-vector multiplication operation.
This repository contains an extemely efficient implementation of MDC by leveraging data sparsity encountered in seismic frequency-domain data. More precisely, each seismic frequency slice is first tiled, compressed by means of singular value decomposition and its bases are stored on disk (TLR compression). Subsequently, the MDC operator is applied by operating directly with the compressed bases using a custom-made TLR-MVM kernel.
This repository provides Python codes to run a variety of seismic algorithms leveraging the C++/CUDA implementation of TLR-MVM situated in this repository.
This repository is organized as follows:
- mdctlr: python library containing routines for tlr-based multidimensional convolution and its use in seismic inverse problem;
- app: set of python scripts running the different applications provided by the mdctlr library;
- script: set of shell scripts used to run the various applications on standard workstations;
- script_hpc: set of shell scripts used to run the various applications on KAUST supercomputer;
To get started, you must first install the tlrmvm
library. Follow the instruction provided in the official GitHub repository of the library.
After that, you will need to create a python environment with the required dependencies to run codes in this repository. We suggest using conda and create the environment with the following command:
Simply run:
conda env create -f mdctlr.yml
After that you can simply install the mdctlr
package via:
pip install .
or in developer mode:
pip install -e .
We also have an installation video on Youtube - OUTDATED!
Available applications:
- GenerateDataset
- MDC
- Marchenko
- MDD
- MDDOve3DFull
To create a TLR compressed version of a dataset:
python app/GenerateDataset.py --help
will give you
usage: GenerateDataset.py [-h] [--nb NB] [--error_threshold ERROR_THRESHOLD] [--reordering REORDERING] [--freqlist FREQLIST] [--rankmodule RANKMODULE] [--nrx NRX] [--nry NRY] [--foldername FOLDERNAME] [--prefix PREFIX] [--suffix SUFFIX] [--format FORMAT] [--matname MATNAME]
optional arguments:
-h, --help show this help message and exit
--nb NB nb
--error_threshold ERROR_THRESHOLD
error threshold
--reordering REORDERING
geometry reordering type: hilbert, normal
--freqlist FREQLIST processing freqlist
--rankmodule RANKMODULE
all rank in the matrix dividable by certain value
--nrx NRX number of receivers along the x axis
--nry NRY number of receivers along the y axis
--foldername FOLDERNAME
foldername where input data are stored
--prefix PREFIX prefix of filenames
--suffix SUFFIX suffix of filenames
--format FORMAT format of file (mat or zarr)
--matname MATNAME name of variable in matfile
An example run:
python app/GenerateDataset.py \
--nb=256 --error_threshold=0.001 --reordering=normal --rankmodule=8 \
--freqlist=0,1,
To run a single instance of the MDC operator
python app/MDC.py --help
will give you
usage: MDC.py [-h] [--AuxFile AUXFILE] [--MVMType MVMTYPE] [--TLRType TLRTYPE] [--bandlen BANDLEN] [--nfmax NFMAX] [--wavfreq WAVFREQ] [--OrderType ORDERTYPE] [--ModeValue MODEVALUE] [--M M] [--N N] [--nb NB] [--threshold THRESHOLD] [--repeat REPEAT] [--debug]
3D Multi-Dimensional Convolution with TLR-MDC and matrix reordering
optional arguments:
-h, --help show this help message and exit
--AuxFile AUXFILE File with Auxiliary information for MDC
--MVMType MVMTYPE Type of MVM: Dense, TLR
--TLRType TLRTYPE TLR Precision: fp32, fp16, fp16int8, int8
--bandlen BANDLEN TLR Band length
--nfmax NFMAX TLR Number of frequencies
--wavfreq WAVFREQ Ricker wavelet peak frequency used to convolve the input
--OrderType ORDERTYPE
Matrix reordering method: normal, l1, hilbert
--ModeValue MODEVALUE
Rank mode
--M M Number of sources/rows in seismic frequency data
--N N Number of receivers/columns in seismic frequency data
--nb NB TLR Tile size
--threshold THRESHOLD
TLR Error threshold
--repeat REPEAT Number of repeated MDC computation for statistics
--debug Debug
An example run:
mpirun -np 4 python app/MDC.py --AuxFile 3DMarchenko_aux.npz --MVMType TLR --TLRType fp16 \
--ModeValue 8 --OrderType normal --repeat 10 --debug
The Auxiliary .npz file must contain the following variables:
- 📇
x
: x-axis of subsurface model - 📇
y
: y-axis of subsurface model - 📇
z
: z-axis of subsurface model - 📇
t
: t-axis of data - 📇
srcs
: sources coordinates (array of shape nsrc x 3 ordered as x,y,z) - 📇
recs
: receivers coordinates (array of shape nsrc x 3 ordered as x,y,z) - 📇
vs
: chosen virtual source (array of shape 3 x 1 ordered as x,y,z) - 📇
G0
: reference wavefield to compare (array of shape nt x nrecs ordered as x,y,z)
To run Marchenko redatuming by inversion for a single virtual point
python app/MarchenkoRedatuming.py --help
will give you
usage: MarchenkoRedatuming.py [-h] [--AuxFile AUXFILE] [--MVMType MVMTYPE] [--TLRType TLRTYPE] [--bandlen BANDLEN] [--nfmax NFMAX] [--OrderType ORDERTYPE] [--ModeValue MODEVALUE] [--M M] [--N N] [--nb NB] [--threshold THRESHOLD] [--debug]
3D Marchenko Redatuming with TLR-MDC and matrix reordering
optional arguments:
-h, --help show this help message and exit
--AuxFile AUXFILE File with Auxiliar information for Mck redatuming
--MVMType MVMTYPE Type of MVM: Dense, TLR
--TLRType TLRTYPE TLR Precision: fp32, fp16, int8
--bandlen BANDLEN TLR Band length
--nfmax NFMAX TLR Number of frequencies
--OrderType ORDERTYPE
Matrix reordering method: normal, l1, hilbert
--ModeValue MODEVALUE
Rank mode
--M M Number of sources/rows in seismic frequency data
--N N Number of receivers/columns in seismic frequency data
--nb NB TLR Tile size
--threshold THRESHOLD
TLR Error threshold
--debug Debug
An example run:
mpirun -np 4 python app/MarchenkoRedatuming.py --AuxFile 3DMarchenko_aux.npz --MVMType TLR --TLRType fp16 \
--ModeValue 8 --OrderType normal --debug
The Auxiliary .npz file must contain the same variables of the MDC app, plus:
- 📇
G
: full wavefield to compare (array of shape nt x nrecs ordered as x,y,z)
To run MDD of Marchenko redatumed fields for a single virtual point
python app/MDD.py --help
will give you
usage: MDD.py [-h] [--AuxFile AUXFILE] [--MVMType MVMTYPE] [--TLRType TLRTYPE] [--bandlen BANDLEN] [--nfmax NFMAX] [--OrderType ORDERTYPE] [--ModeValue MODEVALUE] [--M M] [--N N] [--ivsinv IVSINV] [--nb NB] [--threshold THRESHOLD] [--debug]
3D Multi-Dimensional Deconvolution with TLR-MDC and matrix reordering
optional arguments:
-h, --help show this help message and exit
--AuxFile AUXFILE File with Auxiliar information for Mck redatuming
--MVMType MVMTYPE Type of MVM: Dense, TLR
--TLRType TLRTYPE TLR Precision: fp32, fp16, fp16int8, int8
--bandlen BANDLEN TLR Band length
--nfmax NFMAX TLR Number of frequencies
--OrderType ORDERTYPE
Matrix reordering method: normal, l1, hilbert
--ModeValue MODEVALUE
Rank mode
--M M Number of sources/rows in seismic frequency data
--N N Number of receivers/columns in seismic frequency data
--ivsinv IVSINV Index of virtual source to invert for
--nb NB TLR Tile size
--threshold THRESHOLD
TLR Error threshold
--debug Debug
An example run:
mpirun -np 4 python $TLRMDCROOT/app/MDD.py --AuxFile 3DMDD_aux.npz --M 9801 --N 2911 --MVMType TLR --TLRType fp16 \
--nb 128 --ModeValue 4 --OrderType normal --nfmax 150 --ivsinv 880 --debug
The Auxiliary .npz file must contain the following variables:
- 📇
srcs
: sources coordinates (array of shape nsrc x 3 ordered as x,y,z) - 📇
vsz
: depth of array of grid of virtual receveirs (scalar) - 📇
vsx
: x-axis of grid of virtual receivers - 📇
vsy
: y-axis of grid of virtual receivers - 📇
t
: t-axis of data
To run MDD of Up/Down separated fields for a single virtual point (currently implemented for Overthrust model, but code should be generic and applicable to any other data provided it is organized according to our requirements - see below)
python app/MDDOve3DFull.py --help
will give you
usage: MDDOve3DFull.py [-h] [--AuxFile AUXFILE] [--DataFolder DATAFOLDER] [--PupFolder PUPFOLDER] [--FigFolder FIGFOLDER] [--MVMType MVMTYPE] [--TLRType TLRTYPE] [--bandlen BANDLEN] [--nfmax NFMAX] [--OrderType ORDERTYPE] [--PHilbertSrc PHILBERTSRC] [--PHilbertRec PHILBERTREC] [--ModeValue MODEVALUE] [--M M] [--N N] [--nb NB] [--threshold THRESHOLD] [--vs VS] [--niter NITER]
[--damp DAMP] [--debug]
3D Multi-Dimensional Deconvolution with TLR-MDC and matrix reordering
optional arguments:
-h, --help show this help message and exit
--AuxFile AUXFILE File with Auxiliary information for MDD
--DataFolder DATAFOLDER
Folder containing compressed data
--PupFolder PUPFOLDER
Folder containing pup data
--FigFolder FIGFOLDER
Prefix of folder containing figures
--MVMType MVMTYPE Type of MVM: Dense, TLR
--TLRType TLRTYPE TLR Precision: fp32, fp16, fp16int8, int8
--bandlen BANDLEN TLR Band length
--nfmax NFMAX TLR Number of frequencies
--OrderType ORDERTYPE
Matrix reordering method: normal, l1, hilbert
--PHilbertSrc PHILBERTSRC
Hilbert size of source axis (2^p)
--PHilbertRec PHILBERTREC
Hilbert size of receiver axis (2^p)
--ModeValue MODEVALUE
Rank mode
--M M Number of sources/rows in seismic frequency data
--N N Number of receivers/columns in seismic frequency data
--nb NB TLR Tile size
--threshold THRESHOLD
TLR Error threshold
--vs VS Virtual source
--niter NITER Number of iterations of MDD
--damp DAMP Damping of MDD
--debug Debug
An example run:
mpirun -np 4 python $TLRMDCROOT/app/MDDOve3DFull.py --AuxFile MDDOve3D_aux.npz --DataFolder compresseddata_full \
--M 26040 --N 15930 --MVMType TLR --TLRType fp32 \
--nb 256 --threshold 0.001 --ModeValue 8 --OrderType hilbert --PHilbertSrc 12 --PHilbertRec 12 --nfmax 200 --vs 9115 --debug
The Auxiliary .npz file must contain the following variables:
- 📇
ny
: number of samples of y-axis of subsurface model - 📇
nx
: number of samples of x-axis of subsurface model - 📇
nz
: number of samples of z-axis of subsurface model - 📇
dy
: sampling of y-axis of subsurface model - 📇
dx
: sampling of x-axis of subsurface model - 📇
dz
: sampling of z-axis of subsurface model - 📇
osy
: origin of source grid over y-axis (the same gap between the origin of the reference system and the origin of the source array, will also be added at the end of the grid) - 📇
dsy
: sampling of source grid over y-axis - 📇
osx
: origin of source grid over x-axis - 📇
dsx
: sampling of source grid over x-axis - 📇
ory
: origin of receiver grid over y-axis - 📇
dry
: sampling of receiver grid over y-axis - 📇
nry
: number of samples of receiver grid over y-axis - 📇
orx
: origin of receiver grid over x-axis - 📇
drx
: sampling of receiver grid over x-axis - 📇
nrx
: number of samples of receiver grid over x-axis - 📇
t
: t-axis of data
The codes are based on two different datasets:
- The first one, used to run the MDC, MarchenkoRedatuming, and MDD apps can be downloaded at https://zenodo.org/record/6582600#.Yo-nhJPMKwl. More details about this dataset can be found in the following publication:
@article{ravasi2022,
title={An open-source framework for the implementation of large-scale integral operators with flexible, modern high-performance computing solutions:
Enabling 3D Marchenko imaging by least-squares inversion},
authors={M. Ravasi, I. Vasconcelos},
journal={Geophysics},
year={2021}
}
- The second one, used to run the MDDOve3DFull app is currently not available due to data size, contact us if interested in the dataset.