This project is a parser, validator and normalizer implementation for shorthand lipid nomenclatures, base on the Grammar of Succinct Lipid Nomenclatures project.
Goslin defines multiple grammers compatible with ANTLRv4 for different sources of shorthand lipid nomenclature. This allows to generate parsers based on the defined grammars, which provide immediate feedback whether a processed lipid shorthand notation string is compliant with a particular grammar, or not.
jGoslin uses the Goslin grammars and the generated parser to support the following general tasks:
-
Facilitate the parsing of shorthand lipid names dialects.
-
Provide a structural representation of the shorthand lipid after parsing.
-
Use the structural representation to generate normalized names.
The Maven site with JavaDoc is available here.
The web-based application is available here.
- Related Projects
- Table of contents
- Building the project and generating client code from the command-line
- Running a validation with the command-line interface
- Running the Web Application for Validation
- Building the Docker Image
- Running the Docker Image
- Using the project code releases from Maven Central
- Using the project code releases via JFrog
- Using the API programmatically
- References
JGoslin uses the latest Long Term Support Java release version (17). In order to build the client code and run the unit tests, execute the following command from a terminal:
./mvnw install
This generates the necessary domain specific code for Java.
The cli
sub-project provides a command line interface for parsing of lipid names either from the command line or from a file with one lipid name per line.
After building the project as mentioned above with ./mvnw install
, the cli/target
folder will contain the jgoslin-cli-<version>-bin.zip
file. Alternatively, you can download the latest
cli zip file from JFrog: Search for latest jgoslin-cli-<VERSION>-bin.zip artefact and click to download.
In order to run the validator, unzip that file, change into the unzipped folder and run
java -jar jgoslin-cli-<VERSION>.jar
to see the available options.
To parse a single lipid name from the command line, run
java -jar jgoslin-cli-<VERSION>.jar -n "Cer(d31:1/20:1)"
To parse multiple lipid names from a file via the commmand line, run
java -jar jgoslin-cli-<VERSION>.jar -f examples/lipidnames.txt
To use a specific grammar, instead of trying all, run
java -jar jgoslin-cli-<VERSION>.jar -f examples/lipidNames.txt -g GOSLIN
To write output to the tab-separated output file 'goslin-out.tsv', run
java -jar jgoslin-cli-<VERSION>.jar -f examples/lipidNames.txt -o
The goslin web application is available at: https://apps.lifs-tools.org/goslin
In order to build a Docker image of the command line interface application, run
./mvnw -Pdocker install
from your commandline (mvnw.bat on Windows). This will build and tag a Docker image lifs/jgoslin-cli with a corresponding version and make it available to your local Docker installation. To show the coordinates of the image, call
docker image ls | grep "jgoslin-cli"
If you have not done so, please build the Docker image of the validator cli or pull it from the docker hub (see previous sections).
Then, run the following command, replacing <VERSION>
with the current version, e.g. 1.0.0
) and <DATA_DIR>
with the local directory containing your lipid name files:
docker run -v <YOUR_DATA_DIR>:/home/data:rw lifs/jgoslin-cli:<VERSION>
This will only invoke the default entrypoint of the container, which is a shell script wrapper calling the jgoslin-cli Jar. It passes all arguments to the validator, so that all
arguments that you would pass normally will work in the same way (please replace <YOUR_FILE>
with the actual file’s name in <YOUR_DATA_DIR>
:
docker run -v <YOUR_DATA_DIR>:/home/data:rw lifs/jgoslin-cli:<VERSION> -f <YOUR_FILE>
You can also run the docker container without the -f <YOUR_FILE>
argument to see a list of possible arguments.
jgoslin is available from the Maven central repository:
To use the parser libraries (reading and validation) in your own Maven projects, use the following dependency:
<dependency> <groupId>org.lifs-tools</groupId> <artifactId>jgoslin-parsers</artifactId> <version>${jgoslin.version}</version> </dependency>
where jgoslin.version is the version of jgoslin you wish to use, e.g. for a release version:
<properties> <jgoslin.version>2.0.0</jgoslin.version> </properties>
as defined in the properties section of your pom file.
The library release artifacts are available from JFrog. If you want to use them, add the following lines to your own Maven pom file :
<profile> <id>lifs-repos</id> <repositories> <repository> <snapshots> <enabled>false</enabled> </snapshots> <id>lifs-libs-release</id> <name>lifs-libs-release</name> <url>https://lifstools.jfrog.io/artifactory/lifs-libs-release</url> </repository> </repositories> </profile>
To compile jgoslin against the LIFS JFrog repository, please add the following entry to you ~/.m2/settings.xml file:
<activeProfiles> <activeProfile>lifs-repos</activeProfile> </activeProfiles>
or use the -Plifs-repos
command line switch when running Maven to enable the LIFS JFrog maven repositories for parent pom and artifact resolution.
To use the parser libraries (reading and validation) in your own Maven projects, use the following dependency:
<dependency> <groupId>org.lifs-tools</groupId> <artifactId>jgoslin-parsers</artifactId> <version>${jgoslin.version}</version> </dependency>
where jgoslin.version
is the version of jgoslin you wish to use, e.g. for a release version:
<properties> <jgoslin.version>2.0.0</jgoslin.version> </properties>
as defined in the properties section of your pom file.
The following snippet shows how to parse a shorthand lipid name with the different parsers:
import org.lifstools.jgoslin.domain.*; // contains Domain objects like LipidAdduct, LipidSpecies ... import org.lifstools.jgoslin.parser.*; // contains the parser implementations ...
String ref = "Cer(d18:1/20:2)"; try { // use the SwissLipids parser SwissLipidsParser slParser = new SwissLipidsParser(); // multiple eventhandlers can be used with one parser, e.g. in parallel processing SwissLipidsParserEventHandler slHandler = slParser.newEventHandler(); LipidAdduct sllipid = slParser.parse(ref, slHandler); System.out.println(sllipid.getLipidString()); // to print the lipid name at its native level to the console } catch (LipidException ptve) { // catch this for any parsing or semantic issues with a lipid ptve.printStackTrace(); }
//alternatively, use the other parsers. Don't forget to place try catch blocks around the following lines, as for the SwissLipids parser example // use the LipidMAPS parser LipidMapsParser lmParser = new LipidMapsParser(); LipidMapsParserEventHandler lmHandler = lmParser.newEventHandler(); LipidAdduct lmlipid = lmParser.parse(ref, lmHandler); // use the shorthand notation parser GOSLIN GoslinParser goslinParser = new GoslinParser(); GoslinParserEventHandler goslinHandler = goslinParser.newEventHandler(); LipidAdduct golipid = goslinParser.parse(ref, goslinHandler); // use the updated shorthand notation of 2020 ShorthandParser shorthandParser = new ShorthandParser(); ShorthandParserEventHandler shorthandHandler = shorthandParser.newEventHandler(); // calling parse with the optional argument false suppresses any exceptions, if errors are encountered, the returned LipidAdduct will be null LipidAdduct shlipid = shorthandParser.parse(ref, shorthandHandler, false);
To retrieve a parsed lipid name on a higher hierarchy of lipid level, simply define the level when requesting the lipid name:
System.out.println(sllipid.getLipidString(LipidLevel.CATEGORY)); System.out.println(sllipid.getLipidString(LipidLevel.CLASS)); System.out.println(sllipid.getLipidString(LipidLevel.SPECIES)); System.out.println(sllipid.getLipidString(LipidLevel.MOLECULAR_SPECIES)); System.out.println(sllipid.getLipidString(LipidLevel.SN_POSITION)); System.out.println(sllipid.getLipidString(LipidLevel.STRUCTURE_DEFINED)); System.out.println(sllipid.getLipidString(LipidLevel.FULL_STRUCTURE)); System.out.println(sllipid.getLipidString(LipidLevel.COMPLETE_STRUCTURE));
This functionality allows easy computation of aggregate statistics of lipids
on lipid class, category or arbitrary levels. Requesting a lipid name on a lower level than the
provided will raise a org.lifstools.jgoslin.domain.ConstraintViolationException
.
For more examples how the API works, please consult the tests, especially in the parsers
module.
This project is the Java implementation for Goslin. If you use Goslin or any of the specific implementations in your work, we kindly ask you to cite the original publication:
-
[D. Kopczynski et al., Goslin - A Grammar of Succinct Lipid Nomenclature, Analytical Chemistry, June 26th, 2020.](https://pubs.acs.org/doi/10.1021/acs.analchem.0c01690) [doi:10.1021/acs.analchem.0c01690](https://doi.org/10.1021/acs.analchem.0c01690)
If you are using any of the new features of Goslin 2.0, please cite the following, updated Goslin 2.0 publication:
-
[D. Kopczynski et al., Goslin 2.0 Implements the Recent Lipid Shorthand Nomenclature for MS-Derived Lipid Structures, Analytical Chemistry, April 11th, 2022.](https://pubs.acs.org/doi/full/10.1021/acs.analchem.1c05430) [doi:10.1021/acs.analchem.1c05430](https://doi.org/10.1021/acs.analchem.1c05430)