Document experimental csv module #1724

Open
wants to merge 5 commits into
base: main
57 changes: 57 additions & 0 deletions docs/sources/next/javascript-api/k6-experimental/csv/Options.md
@@ -0,0 +1,57 @@
---
title: 'Options'
description: 'Options represents the configuration for CSV parsing.'
weight: 40
---

# Options

The `Options` object describes the configuration available for parsing CSV files with the [`csv.parse`](https://grafana.com/docs/k6/<K6_VERSION>/javascript-api/k6-experimental/csv/parse) function and the [`csv.Parser`](https://grafana.com/docs/k6/<K6_VERSION>/javascript-api/k6-experimental/csv/parser) class.

## Properties

| Property | Type | Description |
| :------------ | :---------------- | :-------------------------------------------------------------------------------------------------------- |
| delimiter | string | The character used to separate fields in the CSV file. Default is `','`. |
| skipFirstLine | boolean | Whether to skip the first line of the CSV file. Default is `false`. |
| fromLine | (optional) number | The line number from which to start reading the CSV file. Default is `0`. |
| toLine | (optional) number | The line number at which to stop reading the CSV file. If the option is not set, then read until the end. |
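
The defaults in the table above can be summarized in a few lines of plain JavaScript. This is only a sketch: `DEFAULT_CSV_OPTIONS` and `withDefaults` are hypothetical names used for illustration and are not part of the k6 API, which applies its defaults internally.

{{< code >}}

```javascript
// Hypothetical helper, not part of the k6 API: merges user-supplied options
// with the defaults listed in the properties table.
const DEFAULT_CSV_OPTIONS = {
  delimiter: ',',
  skipFirstLine: false,
  fromLine: 0,
  // `toLine` is intentionally left unset: reading then continues to the end.
};

function withDefaults(userOptions = {}) {
  // Later spreads win, so user-supplied values override the defaults.
  return { ...DEFAULT_CSV_OPTIONS, ...userOptions };
}

const resolved = withDefaults({ delimiter: ';', toLine: 8 });
console.log(resolved);
```

{{< /code >}}

Only the properties you pass are overridden; everything else keeps the documented default.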

## Example

{{< code >}}

```javascript
import { open } from 'k6/experimental/fs';
import csv from 'k6/experimental/csv';

let file;
let parser;
(async function () {
  file = await open('data.csv');
  parser = new csv.Parser(file, {
    delimiter: ',',
    skipFirstLine: true,
    fromLine: 2,
    toLine: 8,
  });
})();

export default async function () {
  // The `next` method attempts to read the next row from the CSV file.
  //
  // It returns an iterator-like object with a `done` property that indicates whether
  // there are more rows to read, and a `value` property that contains the row fields
  // as an array.
  const { done, value } = await parser.next();
  if (done) {
    throw new Error('No more rows to read');
  }

  // We expect the `value` property to be an array of strings, where each string is a field
  // from the CSV record.
  console.log(done, value);
}
```

{{< /code >}}
96 changes: 96 additions & 0 deletions docs/sources/next/javascript-api/k6-experimental/csv/Parser.md
@@ -0,0 +1,96 @@
---
title: 'Parser'
description: 'A CSV parser for streaming CSV parsing, allowing line-by-line reading with minimal memory consumption.'
weight: 30
---

# Parser

The `csv.Parser` class provides a streaming parser that reads CSV files line-by-line, offering fine-grained control over the parsing process and minimizing memory consumption.
It's well-suited for scenarios where memory efficiency is crucial or when you need to process large CSV files without loading the entire file into memory.

## Asynchronous nature

The `csv.Parser` class methods are asynchronous and return Promises.
Due to k6's current limitation with the [init context](https://grafana.com/docs/k6/<K6_VERSION>/using-k6/test-lifecycle/#the-init-stage) (which doesn't support asynchronous functions directly), you need to use an asynchronous wrapper such as:

{{< code >}}

```javascript
import { open } from 'k6/experimental/fs';
import csv from 'k6/experimental/csv';

let file;
let parser;
(async function () {
  file = await open('data.csv');
  parser = new csv.Parser(file);
})();
```

{{< /code >}}

## Constructor

| Parameter | Type                                                                                           | Description                                                                                                |
| :-------- | :--------------------------------------------------------------------------------------------- | :--------------------------------------------------------------------------------------------------------- |
| file      | [fs.File](https://grafana.com/docs/k6/<K6_VERSION>/javascript-api/k6-experimental/fs/file)     | A file instance opened using the `fs.open` function.                                                       |
| options   | [Options](https://grafana.com/docs/k6/<K6_VERSION>/javascript-api/k6-experimental/csv/options) | An optional parameter object to customize the parsing behavior. Options can include `delimiter` (string). |

## Methods

| Name | Description |
| :------- | :---------------------------------------------------------------------------------------------------- |
| `next()` | Reads the next line from the CSV file and returns a promise that resolves to an iterator-like object. |

### Returns

A promise resolving to an object with the following properties:

| Property | Type | Description |
| :------- | :------- | :---------------------------------------------------------------------------------------------------- |
| done | boolean | Indicates whether there are more rows to read (false) or the end of the file has been reached (true). |
| value | string[] | Contains the fields of the CSV record as an array of strings. If done is true, value is undefined. |
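
The `{ done, value }` shape described above makes it straightforward to read records in small batches. The following is a minimal sketch, not part of the k6 API: `readBatch` is a hypothetical helper, and `makeFakeParser` is a stand-in for `csv.Parser` (an assumption made so the snippet runs outside k6) that only mimics the documented `next()` behavior.

{{< code >}}

```javascript
// Hypothetical helper: reads up to `size` records from any object whose async
// `next()` resolves to `{ done, value }`.
async function readBatch(parser, size) {
  const records = [];
  while (records.length < size) {
    const { done, value } = await parser.next();
    if (done) {
      // End of input: return whatever was collected so far.
      return { records, done: true };
    }
    records.push(value);
  }
  return { records, done: false };
}

// Stand-in for `csv.Parser`, for illustration only. It does no real parsing
// and simply hands out pre-split rows one at a time.
function makeFakeParser(rows) {
  let index = 0;
  return {
    async next() {
      if (index >= rows.length) {
        return { done: true, value: undefined };
      }
      return { done: false, value: rows[index++] };
    },
  };
}
```

{{< /code >}}

In a k6 script, the same helper could presumably be called with a real `csv.Parser` instance in place of the stand-in, since it relies only on the `next()` method.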

## Example

{{< code >}}

```javascript
import { open } from 'k6/experimental/fs';
import csv from 'k6/experimental/csv';

export const options = {
  iterations: 10,
};

let file;
let parser;
(async function () {
  file = await open('data.csv');
  parser = new csv.Parser(file, { skipFirstLine: true });
})();

export default async function () {
  // The `next` method attempts to read the next row from the CSV file.
  //
  // It returns an iterator-like object with a `done` property that indicates whether
  // there are more rows to read, and a `value` property that contains the row fields
  // as an array.
  const { done, value } = await parser.next();
  if (done) {
    throw new Error('No more rows to read');
  }

  // We expect the `value` property to be an array of strings, where each string is a field
  // from the CSV record.
  console.log(done, value);
}
```

{{< /code >}}

## Notes on usage

- **Memory efficiency**: Since `csv.Parser` reads the file line-by-line, it keeps memory usage low and avoids loading the entire set of records into memory. This is particularly useful for large CSV files.
- **Streaming control**: The streaming approach provides more control over how records are processed, which can be beneficial for complex data handling requirements.
112 changes: 112 additions & 0 deletions docs/sources/next/javascript-api/k6-experimental/csv/_index.md
@@ -0,0 +1,112 @@
---
title: 'csv'
description: 'k6 csv experimental API'
weight: 10
---

# csv

{{< docs/shared source="k6" lookup="experimental-module.md" version="<K6_VERSION>" >}}

The `k6/experimental/csv` module provides efficient ways to handle CSV files in k6, offering faster parsing and lower memory usage compared to traditional JavaScript-based libraries.

This module supports both full-file parsing and streaming, so users can choose between parsing speed and memory efficiency.

## Key features

- The [`csv.parse()`](https://grafana.com/docs/k6/<K6_VERSION>/javascript-api/k6-experimental/csv/parse) function parses a complete CSV file into a SharedArray, leveraging Go-based processing for better performance and reduced memory footprint compared to JavaScript alternatives.
- The [`csv.Parser`](https://grafana.com/docs/k6/<K6_VERSION>/javascript-api/k6-experimental/csv/parser) class is a streaming parser that reads CSV files line-by-line, optimizing memory usage and giving more control over the parsing process through a stream-like API.

### Benefits

- **Faster parsing**: The [`csv.parse()`](https://grafana.com/docs/k6/<K6_VERSION>/javascript-api/k6-experimental/csv/parse) function bypasses the JavaScript runtime, significantly speeding up parsing for large CSV files.
- **Lower memory usage**: Both [`csv.parse()`](https://grafana.com/docs/k6/<K6_VERSION>/javascript-api/k6-experimental/csv/parse) and [`csv.Parser`](https://grafana.com/docs/k6/<K6_VERSION>/javascript-api/k6-experimental/csv/parser) support shared memory across virtual users (VUs) when using the [`fs.open()`](https://grafana.com/docs/k6/<K6_VERSION>/javascript-api/k6-experimental/fs/open) function.
- **Flexibility**: Users can choose between full-file parsing with [`csv.parse`](https://grafana.com/docs/k6/<K6_VERSION>/javascript-api/k6-experimental/csv/parse) for speed or line-by-line streaming with [`csv.Parser`](https://grafana.com/docs/k6/<K6_VERSION>/javascript-api/k6-experimental/csv/parser) for memory efficiency.

### Trade-offs

- The [`csv.parse()`](https://grafana.com/docs/k6/<K6_VERSION>/javascript-api/k6-experimental/csv/parse) function parses the entire file during the initialization phase, which might increase startup time and memory usage for large files. Best for scenarios where performance is more important than memory consumption.
- The [`csv.Parser`](https://grafana.com/docs/k6/<K6_VERSION>/javascript-api/k6-experimental/csv/parser) class processes the file line-by-line, making it more memory-efficient but potentially slower due to the overhead of reading each line. Suitable for scenarios where memory usage is critical or more granular control over parsing is needed.

## API

| Function/Object | Description |
| ------------------------------------------------------------------------------------------------ | ------------------------------------------------------------------------------------------------- |
| [csv.parse()](https://grafana.com/docs/k6/<K6_VERSION>/javascript-api/k6-experimental/csv/parse) | Parses an entire CSV file into a SharedArray for high-performance scenarios. |
| [csv.Parser](https://grafana.com/docs/k6/<K6_VERSION>/javascript-api/k6-experimental/csv/parser) | A class for streaming CSV parsing, allowing line-by-line reading with minimal memory consumption. |

## Example

### Parsing a full CSV file into a SharedArray

{{< code >}}

```javascript
import { open } from 'k6/experimental/fs';
import csv from 'k6/experimental/csv';
import { scenario } from 'k6/execution';

export const options = {
  iterations: 10,
};

let file;
let csvRecords;
(async function () {
  file = await open('data.csv');

  // The `csv.parse` function consumes the entire file at once and returns
  // the parsed records as a `SharedArray` object.
  csvRecords = await csv.parse(file, { delimiter: ',' });
})();

export default async function () {
  // `csvRecords` is a `SharedArray`. Each element is a record from the CSV file, represented as an array
  // where each element is a field from the CSV record.
  //
  // Thus, `csvRecords[scenario.iterationInTest]` will give us the record for the current iteration.
  console.log(csvRecords[scenario.iterationInTest]);
}
```

{{< /code >}}
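
Because each record is a plain array of strings, it is often convenient to pair the fields with column names. The sketch below assumes the first record holds the column names and that `skipFirstLine` is not set; `toObject` and the sample data are hypothetical and not part of the k6 API. It uses an ordinary array in place of a `SharedArray` so that it runs outside k6.

{{< code >}}

```javascript
// Hypothetical helper, not part of the k6 API: pairs each field of a record
// with the column name found at the same position in `header`.
function toObject(header, record) {
  const row = {};
  header.forEach((name, i) => {
    row[name] = record[i];
  });
  return row;
}

// Sample data in the documented shape: an array of records,
// each record being an array of strings.
const sampleRecords = [
  ['name', 'age'],
  ['alice', '30'],
  ['bob', '25'],
];

// Treat the first record as the header and convert the remaining ones.
const [header, ...rows] = sampleRecords;
const users = rows.map((record) => toObject(header, record));
console.log(users);
```

{{< /code >}}

Field values stay strings, so numeric columns such as `age` still need an explicit conversion before use.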

### Streaming a CSV file line-by-line

{{< code >}}

```javascript
import { open } from 'k6/experimental/fs';
import csv from 'k6/experimental/csv';

export const options = {
  iterations: 10,
};

let file;
let parser;
(async function () {
  file = await open('data.csv');
  parser = new csv.Parser(file);
})();

export default async function () {
  // The parser `next` method attempts to read the next row from the CSV file.
  //
  // It returns an iterator-like object with a `done` property that indicates whether
  // there are more rows to read, and a `value` property that contains the row fields
  // as an array.
  const { done, value } = await parser.next();
  if (done) {
    throw new Error('No more rows to read');
  }

  // We expect the `value` property to be an array of strings, where each string is a field
  // from the CSV record.
  console.log(done, value);
}
```

{{< /code >}}
82 changes: 82 additions & 0 deletions docs/sources/next/javascript-api/k6-experimental/csv/parse.md
@@ -0,0 +1,82 @@
---
title: 'parse( file, [options] )'
description: 'Parse a CSV file into a SharedArray'
weight: 20
---

# parse( file, [options] )

The `csv.parse` function parses an entire CSV file at once and returns a promise that resolves to a [SharedArray](https://grafana.com/docs/k6/<K6_VERSION>/javascript-api/k6-data/sharedarray) instance.
This function uses Go-based processing, which results in faster parsing and lower memory usage compared to JavaScript alternatives.
It's ideal for scenarios where performance is a priority, and the entire CSV file can be loaded into memory.

## Asynchronous nature

`csv.parse` is an asynchronous function that returns a Promise. Due to k6's current limitation with the [init context](https://grafana.com/docs/k6/<K6_VERSION>/using-k6/test-lifecycle/) (which
doesn't support asynchronous functions directly), you need to use an asynchronous wrapper like this:

{{< code >}}

```javascript
import { open } from 'k6/experimental/fs';
import csv from 'k6/experimental/csv';

let file;
let csvRecords;
(async function () {
  file = await open('data.csv');
  csvRecords = await csv.parse(file, { delimiter: ',' });
})();
```

{{< /code >}}

## Parameters

| Parameter | Type | Description |
| :-------- | :-------------------------------------------------------------------------------------------- | :-------------------------------------------------------------- |
| file | [fs.File](https://grafana.com/docs/k6/<K6_VERSION>/javascript-api/k6-experimental/fs/file) | A file instance opened using the `fs.open` function. |
| options | [Options](https://grafana.com/docs/k6/<K6_VERSION>/javascript-api/k6-experimental/csv/options) | An optional parameter object to customize the parsing behavior. |

## Returns

A promise resolving to a [SharedArray](https://grafana.com/docs/k6/<K6_VERSION>/javascript-api/k6-data/sharedarray) instance, where each element is an array representing a CSV record, and each sub-element is a field from that record.
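
When a test runs more iterations than the file has records, indexing directly by iteration number may go past the end of the data. One possible approach, sketched here with a hypothetical `recordForIteration` helper that is not part of the k6 API, is to wrap around with the modulo operator. A plain array stands in for the `SharedArray` so the snippet runs outside k6.

{{< code >}}

```javascript
// Hypothetical helper: picks a record by iteration number and wraps around
// once the iteration number reaches the number of records.
function recordForIteration(records, iteration) {
  if (records.length === 0) {
    throw new Error('No records available');
  }
  return records[iteration % records.length];
}

// Sample data in the documented shape: an array of records,
// each record being an array of strings.
const sampleData = [
  ['alice', '30'],
  ['bob', '25'],
];

console.log(recordForIteration(sampleData, 0)); // first record
console.log(recordForIteration(sampleData, 3)); // wraps around to the second record
```

{{< /code >}}

In a k6 script, the iteration argument would typically be `scenario.iterationInTest`, as used in the example on this page.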

## Example

{{< code >}}

```javascript
import { open } from 'k6/experimental/fs';
import csv from 'k6/experimental/csv';
import { scenario } from 'k6/execution';

export const options = {
  iterations: 10,
};

let file;
let csvRecords;
(async function () {
  file = await open('data.csv');

  // The `csv.parse` function consumes the entire file at once and returns
  // the parsed records as a `SharedArray` object.
  csvRecords = await csv.parse(file, { skipFirstLine: true });
})();

export default async function () {
  // `csvRecords` is a `SharedArray`. Each element is a record from the CSV file, represented as an array
  // where each element is a field from the CSV record.
  //
  // `csvRecords[scenario.iterationInTest]` gives the record for the current iteration.
  console.log(csvRecords[scenario.iterationInTest]);
}
```

{{< /code >}}

## Notes on usage

- **Memory considerations**: `csv.parse` loads the entire CSV file into memory at once, which may lead to increased memory usage and startup time for very large files.
- **Shared memory usage**: The [SharedArray](https://grafana.com/docs/k6/<K6_VERSION>/javascript-api/k6-data/sharedarray) returned by `csv.parse` is shared among all Virtual Users (VUs), reducing memory overhead when multiple VUs access the same data.
@@ -4,6 +4,7 @@ title: javascript-api/k6-experimental

| Modules | Description |
| ------------------------------------------------------------------------------------------------ | -------------------------------------------------------------------------------------------------------------------------- |
| [csv](https://grafana.com/docs/k6/<K6_VERSION>/javascript-api/k6-experimental/csv) | Provides support for efficient and convenient parsing of CSV files. |
| [fs](https://grafana.com/docs/k6/<K6_VERSION>/javascript-api/k6-experimental/fs) | Provides a memory-efficient way to handle file interactions within your test scripts. |
| [redis](https://grafana.com/docs/k6/<K6_VERSION>/javascript-api/k6-experimental/redis) | Functionality to interact with [Redis](https://redis.io/). |
| [streams](https://grafana.com/docs/k6/<K6_VERSION>/javascript-api/k6-experimental/streams) | Provides an implementation of the Streams API specification, offering support for defining and consuming readable streams. |