Document formal script syntax #5336

Merged
merged 29 commits into from
Oct 23, 2024
Changes from 12 commits
Commits
9d6c826
Update docs and tests with strict syntax
bentsherman Sep 25, 2024
189b560
Remove some references to Groovy
bentsherman Sep 26, 2024
24bb403
Rename "implicit workflow" -> "entry workflow"
bentsherman Sep 26, 2024
5c0ec1d
Initial syntax reference page
bentsherman Sep 28, 2024
39ee1c9
Revert non-docs changes
bentsherman Sep 28, 2024
00ba5d1
Cleanup
bentsherman Sep 28, 2024
69772bf
Add section on script definitions
bentsherman Sep 28, 2024
24e65e0
Update config syntax docs
bentsherman Sep 28, 2024
e9faf58
Apply suggestions from code review
bentsherman Oct 1, 2024
23e05b7
Update docs/module.md
bentsherman Oct 1, 2024
1a173a0
Apply suggestions from code review
bentsherman Oct 1, 2024
74480fc
Add remaining sections to syntax page
bentsherman Sep 30, 2024
4aee05a
Apply suggestions from code review
bentsherman Oct 1, 2024
e073700
minor edits
bentsherman Oct 1, 2024
6a66f71
Apply suggestions from review
bentsherman Oct 3, 2024
0d91594
Apply suggestions from code review
bentsherman Oct 3, 2024
3d29edb
Update description of expressions
bentsherman Oct 8, 2024
71236c9
Suggestions for syntax page
christopher-hakkaart Oct 9, 2024
505c98e
Revert "Suggestions for syntax page"
christopher-hakkaart Oct 9, 2024
0645746
Apply suggestions from review
bentsherman Oct 17, 2024
9444ed1
Apply suggestions from review
bentsherman Oct 18, 2024
110ede7
Apply suggestions from review
bentsherman Oct 18, 2024
48ac8b5
Update docs/dsl1.md
bentsherman Oct 21, 2024
30ea996
Revert "Update docs/dsl1.md"
bentsherman Oct 21, 2024
93154ee
Update docs/config.md
bentsherman Oct 21, 2024
04a311d
Remove redundant subscriber code snippet
bentsherman Oct 21, 2024
63633d1
Add note about implicit closure parameter
bentsherman Oct 23, 2024
17b7ab5
Merge branch 'master' into docs-strict-syntax
pditommaso Oct 23, 2024
3f6f377
Update docs [ci skip]
pditommaso Oct 23, 2024
2 changes: 1 addition & 1 deletion docs/channel.md
@@ -52,7 +52,7 @@ process foo {

workflow {
result = foo(1)
result.view { "Result: ${it}" }
result.view { file -> "Result: ${file}" }
}
```

76 changes: 45 additions & 31 deletions docs/config.md
@@ -22,37 +22,63 @@ You can use the `-C <config-file>` option to use a single configuration file and

## Syntax

A Nextflow configuration file is a simple text file containing a set of properties defined using the syntax:
The Nextflow configuration syntax is based on the Nextflow script syntax. It is designed for setting configuration options in a declarative manner while also allowing for dynamic expressions where appropriate. Refer to {ref}`syntax-page`, particularly the sections on comments and expressions, for a full description of the available syntax.

A Nextflow config file consists of any number of *assignments*, *blocks*, and *includes*. Config files may also contain comments in the same manner as scripts.

### Assignments

A config assignment consists of a config option and an expression separated by an equals sign:

```groovy
name = value
workDir = 'work'
docker.enabled = true
process.maxErrors = 10
```
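
As a further sketch of the assignment form above, the fragment below shows how the quoting of a value determines its type and how a dotted option name breaks down into scopes. The specific values, and the `params.threshold` and `params.label` names, are assumptions chosen purely for illustration:

```groovy
// Hypothetical values, for illustration only.
process.cpus = 4              // number: no quotes
process.memory = '8 GB'       // string: quoted
docker.enabled = true         // boolean: no quotes

params.threshold = 1          // the number one
params.label = '1'            // the string '1', not a number

// `executor` and `retry` are config scopes; `maxAttempt` is the option itself.
executor.retry.maxAttempt = 5
```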

Please note, string values need to be wrapped in quotation characters while numbers and boolean values (`true`, `false`) do not. Also note that values are typed. This means that, for example, `1` is different from `'1'` — the former is interpreted as the number one, while the latter is interpreted as a string value.
A config option consists of one or more names separated by dots. The names other than the last one are known as *config scopes*. See {ref}`config-options` for the full set of config options organized by scope.
Contributor

I think this needs to be explained in more detail. It's not clear what is and isn't a scope, and if it's not a scope, what it is called and where more information about it can be found. I am adding a comment for now and will come back and try to write something clearer.

Contributor

After reading below, I think a larger description of a scope would be helpful here; let me know what you think.

Member Author

Added some clarification here, let me know what you think


### Variables
The right-hand side is typically a literal value such as a number, boolean, or string, but can be any expression, such as a dynamic string:

Configuration properties can be used as variables in the configuration file by using the usual `$propertyName` or `${expression}` syntax.
```groovy
params.helper_file = "${projectDir}/assets/helper.txt"
```

For example:
### Blocks

A config scope can also be specified as a block, in which case any number of config options within that scope can be assigned:

```groovy
propertyOne = 'world'
anotherProp = "Hello $propertyOne"
customPath = "$PATH:/my/app/folder"
// dot syntax
docker.enabled = true
docker.runOptions = '-u $(id -u):$(id -g)'

// block syntax
docker {
enabled = true
runOptions = '-u $(id -u):$(id -g)'
}
```

Please note, the usual rules for {ref}`string-interpolation` are applied, thus a string containing a variable reference must be wrapped in double-quote chars instead of single-quote chars.
As a result, deeply nested config options can be assigned in a variety of ways. For example, the following three assignments are equivalent:

The same mechanism allows you to access environment variables defined in the hosting system. Any variable name not defined in the Nextflow configuration file(s) is interpreted to be a reference to an environment variable with that name. So, in the above example, the property `customPath` is defined as the current system `PATH` to which the string `/my/app/folder` is appended.
```groovy
executor.retry.maxAttempt = 5

### Comments
executor {
retry.maxAttempt = 5
}

Configuration files use the same conventions for comments used by the Groovy or Java programming languages. Thus, use `//` to comment a single line, or `/*` .. `*/` to comment a block on multiple lines.
executor {
retry {
maxAttempt = 5
}
}
```

### Includes

A configuration file can include one or more configuration files using the keyword `includeConfig`. For example:
A config file can include any number of other config files using the `includeConfig` keyword:

```groovy
process.executor = 'sge'
@@ -62,7 +88,11 @@ process.memory = '10G'
includeConfig 'path/foo.config'
```

When a relative path is used, it is resolved against the actual location of the including file.
When a relative path is used, it is resolved against the location of the including file.
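
A minimal sketch of this resolution rule, assuming a hypothetical project rooted at `/project` with a `conf/` subdirectory (the file names are illustrative, not taken from the Nextflow repository):

```groovy
// File: /project/nextflow.config (hypothetical)
includeConfig 'conf/base.config'    // resolved as /project/conf/base.config

// File: /project/conf/base.config (hypothetical)
includeConfig 'resources.config'    // resolved as /project/conf/resources.config,
                                    // because the including file lives in /project/conf
```

The second include is resolved relative to `conf/`, not to the directory from which the pipeline was launched.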

:::{note}
Config includes can also be specified within config blocks. However, config files should only be included either at the top-level or in a [profile](#config-profiles), so that the included config file is valid both on its own and in the context in which it is included.
:::
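
The two recommended placements can be sketched as follows. The profile name `cluster` and both file paths are assumptions for illustration; each included file is expected to be a valid config file on its own:

```groovy
// Top-level include (hypothetical path)
includeConfig 'conf/base.config'

profiles {
    cluster {
        // Include within a profile (hypothetical path)
        includeConfig 'conf/cluster.config'
    }
}
```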

## Constants

@@ -79,22 +109,6 @@ The following constants are globally available in a Nextflow configuration file:
`projectDir`
: The directory where the main script is located.

## Config scopes

Configuration settings can be organized in different scopes by dot prefixing the property names with a scope identifier, or grouping the properties in the same scope using the curly brackets notation. For example:

```groovy
alpha.x = 1
alpha.y = 'string value..'

beta {
p = 2
q = 'another string ..'
}
```

See {ref}`config-options` for the full list of config settings.

(config-params)=

## Parameters
4 changes: 3 additions & 1 deletion docs/developer/plugins.md
@@ -7,6 +7,8 @@ This page describes how to create, test, and publish third-party plugins.

The best way to get started with your own plugin is to refer to the [nf-hello](https://github.com/nextflow-io/nf-hello) repository. This repository provides a minimal plugin implementation with several examples of different extension points and instructions for building, testing, and publishing.

Plugins can be written in Java or Groovy.

The minimal dependencies are as follows:

```groovy
@@ -151,7 +153,7 @@ Refer to the source code of Nextflow's built-in executors to see how to implemen
:::{versionadded} 22.09.0-edge
:::

Plugins can define custom Groovy functions, which can then be included into Nextflow pipelines.
Plugins can define custom functions, which can then be included into Nextflow pipelines.

To implement a custom function, create a class in your plugin that extends the `PluginExtensionPoint` class, and implement your function with the `Function` annotation:

2 changes: 1 addition & 1 deletion docs/dsl1.md
@@ -88,7 +88,7 @@ In DSL1, the entire Nextflow pipeline must be defined in a single file (e.g. `ma
DSL2 introduces the concept of "module scripts" (or "modules" for short), which are Nextflow scripts that can be "included" by other scripts. While modules are not essential to migrating to DSL2, nor are they mandatory in DSL2 by any means, modules can help you organize a large pipeline into multiple smaller files, and take advantage of modules created by others. Check out the {ref}`module-page` to get started.

:::{note}
With DSL2, the Groovy shell used by Nextflow also imposes a 64KB size limit on pipeline scripts, so if your DSL1 script is very large, you may need to split your script into modules anyway to avoid this limit.
With DSL2, Nextflow scripts cannot exceed 64KB in size, so if your DSL1 script is very large, you may need to split your script into modules anyway to avoid this limit.
:::

## Deprecations
1 change: 1 addition & 0 deletions docs/index.md
@@ -111,6 +111,7 @@ fusion
:caption: Reference
:maxdepth: 1

reference/syntax
reference/cli
reference/config
reference/env-vars
32 changes: 17 additions & 15 deletions docs/module.md
@@ -2,15 +2,15 @@

# Modules

In Nextflow, a **module** is a script that may contain functions, processes, and workflows (collectively referred to as *components*). A module can be included in other modules or pipeline scripts and even shared across workflows.
Nextflow scripts can include **definitions** (workflows, processes, and functions) from other scripts. When a script is included in this way, it is referred to as a **module**. Modules can be included by other modules or pipeline scripts and can even be shared across workflows.

:::{note}
Modules were introduced in DSL2. If you are still using DSL1, see the {ref}`dsl1-page` page to learn how to migrate your Nextflow pipelines to DSL2.
:::

## Module inclusion

A component defined in a module script can be imported into another Nextflow script using the `include` keyword.
You can include any definition from a module into a Nextflow script using the `include` keyword.

For example:

@@ -23,7 +23,7 @@ workflow {
}
```

The above snippet imports a process named `foo`, defined in the module script, into the main execution context. This way, `foo` can be invoked in the `workflow` scope.
The above snippet imports a process named `foo`, defined in the module, into the main execution context. This way, `foo` can be invoked in the `workflow` scope.

Nextflow implicitly looks for the script file `./some/module.nf`, resolving the path against the *including* script location.

@@ -57,7 +57,7 @@ Module directories allow the use of module scoped binaries scripts. See [Module

## Multiple inclusions

A Nextflow script can include any number of modules, and an `include` statement can import any number of components from a module. Multiple components can be included from the same module by using the syntax shown below:
A Nextflow script can include any number of modules, and an `include` statement can import any number of definitions from a module. Multiple definitions can be included from the same module by using the syntax shown below:

```groovy
include { foo; bar } from './some/module'
@@ -73,7 +73,7 @@ workflow {

## Module aliases

When including a module component, it's possible to specify an *alias* with the `as` keyword. Aliasing allows you to avoid module name clashes, by assigning them different names in the including context. For example:
When including a definition from a module, it's possible to specify an *alias* with the `as` keyword. Aliasing allows you to avoid name clashes by giving included definitions different names in the including context. For example:

```groovy
include { foo } from './some/module'
@@ -85,7 +85,7 @@ workflow {
}
```

You can even include the same component multiple times under different names:
You can also include the same definition multiple times under different names:

```groovy
include { foo; foo as bar } from './some/module'
@@ -96,13 +96,15 @@ workflow {
}
```

(module-params)=

## Module parameters

:::{deprecated} 24.07.0-edge
As a best practice, parameters should be used in the entry workflow and passed to functions / processes / workflows as explicit inputs.
As a best practice, parameters should be used in the entry workflow and passed to workflows, processes, and functions as explicit inputs.
:::
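
The recommended practice can be sketched as follows. The process name `sayHello`, the parameter `params.greeting`, and the file paths are hypothetical names introduced for illustration. The module declares an explicit input and does not read `params` itself:

```groovy
// File: some/module.nf (hypothetical module)
process sayHello {
    input:
    val greeting    // received as an explicit input, not read from params

    output:
    stdout

    script:
    """
    echo '${greeting} world!'
    """
}
```

The entry workflow is then the only place where the parameter is read:

```groovy
// File: main.nf (hypothetical entry script)
include { sayHello } from './some/module'

params.greeting = 'Hello'

workflow {
    // The parameter is used here and passed to the process as an input.
    sayHello(params.greeting)
}
```

Under this arrangement the module has no hidden dependency on the including script's parameters, so it can be reused or tested with any input value.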

A module script can define parameters using the same syntax as a Nextflow workflow script:
A module can define parameters using the same syntax as a Nextflow workflow script:

```groovy
params.foo = 'Hello'
@@ -182,9 +184,9 @@ Ciao world!

## Module templates

The module script can be defined in an external {ref}`template <process-template>` file. The template file can be placed in the `templates` directory where the module script is located.
Process script {ref}`templates <process-template>` can be included alongside a module in the `templates` directory.

For example, suppose we have a project L with a module script that defines two processes, P1 and P2, both of which use templates. The template files can be made available in the local `templates` directory:
For example, suppose we have a project L with a module that defines two processes, P1 and P2, both of which use templates. The template files can be made available in the local `templates` directory:

```
Project L
@@ -208,15 +210,15 @@ Pipeline B
└── main.nf
```

With the possibility to keep the template files inside the project L, A and B can use the modules defined in L without any changes. A future project C would do the same, just cloning L (if not available on the system) and including its module script.
With the possibility to keep the template files inside the project L, A and B can use the modules defined in L without any changes. A future project C would do the same, just cloning L (if not available on the system) and including its module.

Besides promoting the sharing of modules across pipelines, there are several advantages to keeping the module template under the script path:

1. module components are *self-contained*,
2. module components can be tested independently from the pipeline(s) that import them,
3. it is possible to create libraries of module components.
1. Modules are self-contained
2. Modules can be tested independently from the pipeline(s) that import them
3. Modules can be made into libraries

Ultimately, having multiple template locations allows a more structured organization within the same project. If a project has several module components, and all of them use templates, the project could group module scripts and their templates as needed. For example:
Having multiple template locations enables a structured project organization. If a project has several modules, and they all use templates, the project could group module scripts and their templates as needed. For example:

```
baseDir
6 changes: 2 additions & 4 deletions docs/overview.md
@@ -95,11 +95,9 @@ Read the {ref}`executor-page` to learn more about the Nextflow executors.

## Scripting language

Nextflow is designed to have a minimal learning curve, without having to pick up a new programming language. In most cases, users can utilise their current skills to develop Nextflow workflows. However, it also provides a powerful scripting DSL.
Nextflow is a workflow language, based on [Java](https://en.wikipedia.org/wiki/Java_(programming_language)) and [Groovy](https://groovy-lang.org/), which is designed to make it as simple as possible to write scalable and reproducible pipelines. In most cases, users can leverage their existing programming skills to develop Nextflow pipelines, without the steep learning curve that usually comes with a new programming language.

Nextflow scripting is an extension of the [Groovy programming language](<http://en.wikipedia.org/wiki/Groovy_(programming_language)>), which in turn is a super-set of the Java programming language. Groovy can be considered as Python for Java in that it simplifies the writing of code and is more approachable.

Read the {ref}`script-page` section to learn about the Nextflow scripting language.
Read the {ref}`script-page` page to learn about the Nextflow scripting language.

## Configuration options
