Document formal script syntax #5336

Merged
merged 29 commits into from
Oct 23, 2024
Changes from 22 commits
Commits
9d6c826
Update docs and tests with strict syntax
bentsherman Sep 25, 2024
189b560
Remove some references to Groovy
bentsherman Sep 26, 2024
24bb403
Rename "implicit workflow" -> "entry workflow"
bentsherman Sep 26, 2024
5c0ec1d
Initial syntax reference page
bentsherman Sep 28, 2024
39ee1c9
Revert non-docs changes
bentsherman Sep 28, 2024
00ba5d1
Cleanup
bentsherman Sep 28, 2024
69772bf
Add section on script definitions
bentsherman Sep 28, 2024
24e65e0
Update config syntax docs
bentsherman Sep 28, 2024
e9faf58
Apply suggestions from code review
bentsherman Oct 1, 2024
23e05b7
Update docs/module.md
bentsherman Oct 1, 2024
1a173a0
Apply suggestions from code review
bentsherman Oct 1, 2024
74480fc
Add remaining sections to syntax page
bentsherman Sep 30, 2024
4aee05a
Apply suggestions from code review
bentsherman Oct 1, 2024
e073700
minor edits
bentsherman Oct 1, 2024
6a66f71
Apply suggestions from review
bentsherman Oct 3, 2024
0d91594
Apply suggestions from code review
bentsherman Oct 3, 2024
3d29edb
Update description of expressions
bentsherman Oct 8, 2024
71236c9
Suggestions for syntax page
christopher-hakkaart Oct 9, 2024
505c98e
Revert "Suggestions for syntax page"
christopher-hakkaart Oct 9, 2024
0645746
Apply suggestions from review
bentsherman Oct 17, 2024
9444ed1
Apply suggestions from review
bentsherman Oct 18, 2024
110ede7
Apply suggestions from review
bentsherman Oct 18, 2024
48ac8b5
Update docs/dsl1.md
bentsherman Oct 21, 2024
30ea996
Revert "Update docs/dsl1.md"
bentsherman Oct 21, 2024
93154ee
Update docs/config.md
bentsherman Oct 21, 2024
04a311d
Remove redundant subscriber code snippet
bentsherman Oct 21, 2024
63633d1
Add note about implicit closure parameter
bentsherman Oct 23, 2024
17b7ab5
Merge branch 'master' into docs-strict-syntax
pditommaso Oct 23, 2024
3f6f377
Update docs [ci skip]
pditommaso Oct 23, 2024
2 changes: 1 addition & 1 deletion docs/channel.md
Expand Up @@ -52,7 +52,7 @@ process foo {

workflow {
result = foo(1)
result.view { "Result: ${it}" }
result.view { file -> "Result: ${file}" }
}
```
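
The change above replaces the implicit closure parameter `it` with an explicitly named one. As a minimal sketch of the same pattern applied to other operators (the channel values and the parameter name `n` are illustrative assumptions, not part of the changed file):

```groovy
workflow {
    // Name the closure parameter explicitly instead of relying on the implicit `it`
    Channel.of(1, 2, 3)
        .map { n -> n * 2 }
        .view { n -> "Doubled: ${n}" }
}
```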

78 changes: 47 additions & 31 deletions docs/config.md
Expand Up @@ -22,37 +22,65 @@ You can use the `-C <config-file>` option to use a single configuration file and

## Syntax

A Nextflow configuration file is a simple text file containing a set of properties defined using the syntax:
The Nextflow configuration syntax is based on the Nextflow script syntax. It is designed for setting configuration options in a declarative manner while also allowing for dynamic expressions where appropriate.

A Nextflow config file may consist of any number of *assignments*, *blocks*, and *includes*. Config files may also contain comments in the same manner as scripts.

See {ref}`syntax-page` for more information about the Nextflow script syntax.

### Assignments

A config assignment consists of a config option and an expression separated by an equals sign:

```groovy
name = value
workDir = 'work'
docker.enabled = true
process.maxErrors = 10
```

Please note, string values need to be wrapped in quotation characters while numbers and boolean values (`true`, `false`) do not. Also note that values are typed. This means that, for example, `1` is different from `'1'` — the former is interpreted as the number one, while the latter is interpreted as a string value.
A config option consists of an *option name* prefixed by any number of *scopes* separated by dots. Config scopes are used to group related config options. See {ref}`config-options` for the full set of config options.

The expression is typically a literal value such as a number, boolean, or string. However, any expression can be used:

### Variables
```groovy
params.helper_file = "${projectDir}/assets/helper.txt"
```
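
As a hedged illustration of how quoting interacts with such expressions, the sketch below contrasts double-quoted strings, where `${...}` is evaluated, with a single-quoted string, which is kept literal. `params.outdir` and the `results` directory are hypothetical names chosen for this example.

```groovy
// Double quotes: ${projectDir} is evaluated when the config is loaded
params.helper_file = "${projectDir}/assets/helper.txt"

// Hypothetical option: any expression can appear on the right-hand side
params.outdir = "${launchDir}/results"

// Single quotes: the text is kept literally, so $(id -u) reaches the shell unchanged
docker.runOptions = '-u $(id -u):$(id -g)'
```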

Configuration properties can be used as variables in the configuration file by using the usual `$propertyName` or `${expression}` syntax.
### Blocks

For example:
A config scope can also be specified as a block, which may contain multiple configuration options. For example:

```groovy
propertyOne = 'world'
anotherProp = "Hello $propertyOne"
customPath = "$PATH:/my/app/folder"
// dot syntax
docker.enabled = true
docker.runOptions = '-u $(id -u):$(id -g)'

// block syntax
docker {
enabled = true
runOptions = '-u $(id -u):$(id -g)'
}
```

Please note, the usual rules for {ref}`string-interpolation` are applied, thus a string containing a variable reference must be wrapped in double-quote chars instead of single-quote chars.
As a result, deeply nested config options can be assigned in various ways. For example, the following three assignments are equivalent:

The same mechanism allows you to access environment variables defined in the hosting system. Any variable name not defined in the Nextflow configuration file(s) is interpreted to be a reference to an environment variable with that name. So, in the above example, the property `customPath` is defined as the current system `PATH` to which the string `/my/app/folder` is appended.
```groovy
executor.retry.maxAttempt = 5

### Comments
executor {
retry.maxAttempt = 5
}

Configuration files use the same conventions for comments used by the Groovy or Java programming languages. Thus, use `//` to comment a single line, or `/*` .. `*/` to comment a block on multiple lines.
executor {
retry {
maxAttempt = 5
}
}
```

### Includes

A configuration file can include one or more configuration files using the keyword `includeConfig`. For example:
A config file can include any number of other config files using the `includeConfig` keyword:

```groovy
process.executor = 'sge'
Expand All @@ -62,7 +90,11 @@ process.memory = '10G'
includeConfig 'path/foo.config'
```

When a relative path is used, it is resolved against the actual location of the including file.
Relative paths are resolved against the location of the including file.

:::{note}
Config includes can also be specified within config blocks. However, config files should only be included at the top level or in a [profile](#config-profiles) so that the included config file is valid on its own and in the context in which it is included.
:::
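
A minimal sketch of the recommended placement, assuming two hypothetical files, `conf/base.config` and `conf/cluster.config`, and a hypothetical profile named `cluster`:

```groovy
// Top-level include: applied regardless of the selected profile
includeConfig 'conf/base.config'

profiles {
    cluster {
        // Applied only when the pipeline is run with `-profile cluster`
        includeConfig 'conf/cluster.config'
    }
}
```

Both paths are relative, so they are resolved against the location of the including file.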

## Constants

Expand All @@ -79,22 +111,6 @@ The following constants are globally available in a Nextflow configuration file:
`projectDir`
: The directory where the main script is located.

## Config scopes

Configuration settings can be organized in different scopes by dot prefixing the property names with a scope identifier, or grouping the properties in the same scope using the curly brackets notation. For example:

```groovy
alpha.x = 1
alpha.y = 'string value..'

beta {
p = 2
q = 'another string ..'
}
```

See {ref}`config-options` for the full list of config settings.

(config-params)=

## Parameters
4 changes: 3 additions & 1 deletion docs/developer/plugins.md
Expand Up @@ -7,6 +7,8 @@ This page describes how to create, test, and publish third-party plugins.

The best way to get started with your own plugin is to refer to the [nf-hello](https://github.com/nextflow-io/nf-hello) repository. This repository provides a minimal plugin implementation with several examples of different extension points and instructions for building, testing, and publishing.

Plugins can be written in Java or Groovy.

The minimal dependencies are as follows:

```groovy
Expand Down Expand Up @@ -151,7 +153,7 @@ Refer to the source code of Nextflow's built-in executors to see how to implemen
:::{versionadded} 22.09.0-edge
:::

Plugins can define custom Groovy functions, which can then be included into Nextflow pipelines.
Plugins can define custom functions, which can then be included in Nextflow pipelines.

To implement a custom function, create a class in your plugin that extends the `PluginExtensionPoint` class, and implement your function with the `Function` annotation:

2 changes: 1 addition & 1 deletion docs/dsl1.md
Expand Up @@ -88,7 +88,7 @@ In DSL1, the entire Nextflow pipeline must be defined in a single file (e.g. `ma
DSL2 introduces the concept of "module scripts" (or "modules" for short), which are Nextflow scripts that can be "included" by other scripts. While modules are not essential to migrating to DSL2, nor are they mandatory in DSL2 by any means, modules can help you organize a large pipeline into multiple smaller files, and take advantage of modules created by others. Check out the {ref}`module-page` to get started.

:::{note}
With DSL2, the Groovy shell used by Nextflow also imposes a 64KB size limit on pipeline scripts, so if your DSL1 script is very large, you may need to split your script into modules anyway to avoid this limit.
DSL2 scripts cannot exceed 64 KB in size. Large DSL1 scripts may need to be split into modules to avoid this limit.
:::

## Deprecations
1 change: 1 addition & 0 deletions docs/index.md
Expand Up @@ -111,6 +111,7 @@ fusion
:caption: Reference
:maxdepth: 1

reference/syntax
reference/cli
reference/config
reference/env-vars
32 changes: 17 additions & 15 deletions docs/module.md
Expand Up @@ -2,15 +2,15 @@

# Modules

In Nextflow, a **module** is a script that may contain functions, processes, and workflows (collectively referred to as *components*). A module can be included in other modules or pipeline scripts and even shared across workflows.
Nextflow scripts can include **definitions** (workflows, processes, and functions) from other scripts. When a script is included in this way, it is referred to as a **module**. Modules can be included by other modules or pipeline scripts and can even be shared across workflows.

:::{note}
Modules were introduced in DSL2. If you are still using DSL1, see the {ref}`dsl1-page` page to learn how to migrate your Nextflow pipelines to DSL2.
:::

## Module inclusion

A component defined in a module script can be imported into another Nextflow script using the `include` keyword.
You can include any definition from a module into a Nextflow script using the `include` keyword.

For example:

Expand All @@ -23,7 +23,7 @@ workflow {
}
```

The above snippet imports a process named `foo`, defined in the module script, into the main execution context. This way, `foo` can be invoked in the `workflow` scope.
The above snippet imports a process named `foo`, defined in the module, into the main execution context. This way, `foo` can be invoked in the `workflow` scope.

Nextflow implicitly looks for the script file `./some/module.nf`, resolving the path against the *including* script location.
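
For illustration, a module at that path could be as small as the sketch below. The process body is an assumption made for this example; only the name `foo` and the path `./some/module.nf` come from the text above.

```groovy
// ./some/module.nf (hypothetical contents)
process foo {
    output:
    stdout

    script:
    """
    echo 'Hello from foo'
    """
}
```

A script located next to the `some` directory could then use `include { foo } from './some/module'` and call `foo()` in its workflow.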

Expand Down Expand Up @@ -57,7 +57,7 @@ Module directories allow the use of module scoped binaries scripts. See [Module

## Multiple inclusions

A Nextflow script can include any number of modules, and an `include` statement can import any number of components from a module. Multiple components can be included from the same module by using the syntax shown below:
A Nextflow script can include any number of modules, and an `include` statement can import any number of definitions from a module. Multiple definitions can be included from the same module by using the syntax shown below:

```groovy
include { foo; bar } from './some/module'
Expand All @@ -73,7 +73,7 @@ workflow {

## Module aliases

When including a module component, it's possible to specify an *alias* with the `as` keyword. Aliasing allows you to avoid module name clashes, by assigning them different names in the including context. For example:
When including a definition from a module, you can specify an *alias* with the `as` keyword. Aliasing lets you avoid name clashes by giving included definitions different names in the including context. For example:

```groovy
include { foo } from './some/module'
Expand All @@ -85,7 +85,7 @@ workflow {
}
```

You can even include the same component multiple times under different names:
You can also include the same definition multiple times under different names:

```groovy
include { foo; foo as bar } from './some/module'
Expand All @@ -96,13 +96,15 @@ workflow {
}
```

(module-params)=

## Module parameters

:::{deprecated} 24.07.0-edge
As a best practice, parameters should be used in the entry workflow and passed to functions / processes / workflows as explicit inputs.
As a best practice, parameters should be used in the entry workflow and passed to workflows, processes, and functions as explicit inputs.
:::
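
A minimal single-file sketch of this practice follows. The parameter `greeting` and the process `sayHello` are hypothetical names; in a real project the process would typically live in a module and be included.

```groovy
// Illustrative default; in practice this is usually set on the command line or in a config file
params.greeting = 'Hello'

// The process receives the value as an explicit input instead of reading `params` itself
process sayHello {
    input:
    val greeting

    output:
    stdout

    script:
    """
    echo '${greeting} world!'
    """
}

workflow {
    // The entry workflow is the only place that reads `params`
    sayHello(params.greeting)
    sayHello.out.view { line -> line.trim() }
}
```

Because `sayHello` depends only on its declared input, it can be reused or tested without any particular `params` being defined.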

A module script can define parameters using the same syntax as a Nextflow workflow script:
A module can define parameters using the same syntax as a Nextflow workflow script:

```groovy
params.foo = 'Hello'
Expand Down Expand Up @@ -182,9 +184,9 @@ Ciao world!

## Module templates

The module script can be defined in an external {ref}`template <process-template>` file. The template file can be placed in the `templates` directory where the module script is located.
Process script {ref}`templates <process-template>` can be included alongside a module in the `templates` directory.

For example, suppose we have a project L with a module script that defines two processes, P1 and P2, both of which use templates. The template files can be made available in the local `templates` directory:
For example, suppose we have a project L with a module that defines two processes, P1 and P2, both of which use templates. The template files can be made available in the local `templates` directory:

```
Project L
Expand All @@ -208,15 +210,15 @@ Pipeline B
└── main.nf
```

With the possibility to keep the template files inside the project L, A and B can use the modules defined in L without any changes. A future project C would do the same, just cloning L (if not available on the system) and including its module script.
Because the template files can be kept inside project L, pipelines A and B can use the modules defined in L without any changes. A future project C could do the same by cloning L (if it is not already available on the system) and including its module.

Besides promoting the sharing of modules across pipelines, there are several advantages to keeping the module template under the script path:

1. module components are *self-contained*,
2. module components can be tested independently from the pipeline(s) that import them,
3. it is possible to create libraries of module components.
1. Modules are self-contained
2. Modules can be tested independently from the pipeline(s) that import them
3. Modules can be made into libraries
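
As a sketch of how a process in such a module refers to its template, assuming a hypothetical template file `templates/p1.sh` located next to the module script and a hypothetical input named `sample`:

```groovy
process P1 {
    input:
    val sample

    output:
    stdout

    script:
    // Resolved against the `templates` directory next to this module script
    template 'p1.sh'
}
```

Inside `p1.sh`, process inputs such as `sample` can be referenced as `${sample}`, in the same way as in an inline script block.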

Ultimately, having multiple template locations allows a more structured organization within the same project. If a project has several module components, and all of them use templates, the project could group module scripts and their templates as needed. For example:
Having multiple template locations enables a structured project organization. If a project has several modules, and they all use templates, the project could group module scripts and their templates as needed. For example:

```
baseDir
6 changes: 2 additions & 4 deletions docs/overview.md
Expand Up @@ -95,11 +95,9 @@ Read the {ref}`executor-page` to learn more about the Nextflow executors.

## Scripting language

Nextflow is designed to have a minimal learning curve, without having to pick up a new programming language. In most cases, users can utilise their current skills to develop Nextflow workflows. However, it also provides a powerful scripting DSL.
Nextflow is a workflow language based on [Java](https://en.wikipedia.org/wiki/Java_(programming_language)) and [Groovy](https://groovy-lang.org/). It is designed to simplify writing scalable and reproducible pipelines. In most cases, users can leverage their existing programming skills to develop Nextflow pipelines without the steep learning curve that usually comes with a new programming language.

Nextflow scripting is an extension of the [Groovy programming language](<http://en.wikipedia.org/wiki/Groovy_(programming_language)>), which in turn is a super-set of the Java programming language. Groovy can be considered as Python for Java in that it simplifies the writing of code and is more approachable.

Read the {ref}`script-page` section to learn about the Nextflow scripting language.
See {ref}`script-page` for more information about the Nextflow scripting language.

## Configuration options
