Remove task action audit logging and druid_taskLog metadata table #16309

Merged 16 commits on Jul 17, 2024
17 changes: 8 additions & 9 deletions docs/api-reference/tasks-api.md
@@ -914,13 +914,10 @@ Host: http://ROUTER_IP:ROUTER_PORT
### Get task segments

:::info
-This API is deprecated and will be removed in future releases.
+This API is not supported anymore and always returns a 404 response.
+Use the metric `segment/added/bytes` instead to identify the segment IDs committed by a task.
:::

-Retrieves information about segments generated by the task given the task ID. To hit this endpoint, make sure to enable the audit log config on the Overlord with `druid.indexer.auditLog.enabled = true`.
-
-In addition to enabling audit logs, configure a cleanup strategy to prevent overloading the metadata store with old audit logs which may cause performance issues. To enable automated cleanup of audit logs on the Coordinator, set `druid.coordinator.kill.audit.on`. You may also manually export the audit logs to external storage. For more information, see [Audit records](../operations/clean-metadata-store.md#audit-records).

#### URL

`GET` `/druid/indexer/v1/task/{taskId}/segments`
@@ -929,12 +926,14 @@ In addition to enabling audit logs, configure a cleanup strategy to prevent over

<Tabs>

-<TabItem value="27" label="200 SUCCESS">
+<TabItem value="27" label="404 NOT FOUND">


<br/>

-*Successfully retrieved task segments*
+```json
+{
+  "error": "Segment IDs committed by a task action are not persisted anymore. Use the metric 'segment/added/bytes' to identify the segments created by a task."
+}
+```

</TabItem>
</Tabs>
14 changes: 6 additions & 8 deletions docs/operations/clean-metadata-store.md
@@ -44,7 +44,7 @@ This applies to all metadata entities in this topic except compaction configurat
You can configure the retention period for each metadata type, when available, through the record's `durationToRetain` property.
Certain records may require additional conditions to be satisfied before cleanup occurs.

-See the [example](#example) for how you can customize the automated metadata cleanup for a specific use case.
+See the [example](#example-configuration-for-automated-metadata-cleanup) for how you can customize the automated metadata cleanup for a specific use case.


## Automated cleanup strategies
@@ -62,13 +62,12 @@ You can configure cleanup for each entity separately, as described in this secti
Define the properties in the `coordinator/runtime.properties` file.

The cleanup of one entity may depend on the cleanup of another entity as follows:
-- You have to configure a [kill task for segment records](#kill-task) before you can configure automated cleanup for [rules](#rules-records) or [compaction configuration](#compaction-configuration-records).
+- You have to configure a [kill task for segment records](#segment-records-and-segments-in-deep-storage-kill-task) before you can configure automated cleanup for [rules](#rules-records) or [compaction configuration](#compaction-configuration-records).
- You have to schedule the metadata management tasks to run at the same or higher frequency as your most frequent cleanup job. For example, if your most frequent cleanup job runs every hour, set the metadata store management period to one hour or less: `druid.coordinator.period.metadataStoreManagementPeriod=PT1H`.
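As an illustration of the two dependencies above, a `coordinator/runtime.properties` fragment might look like the following sketch. The property names are the ones used on this page and in the linked configuration reference; the specific period and duration values are assumptions for the example, not recommendations:

```properties
# Metadata store management must run at least as often as the most frequent cleanup job.
druid.coordinator.period.metadataStoreManagementPeriod=PT1H

# Kill task for segment records: a prerequisite for rules and
# compaction-configuration cleanup (illustrative values).
druid.coordinator.kill.on=true
druid.coordinator.kill.period=P1D
druid.coordinator.kill.durationToRetain=P30D

# With the kill task in place, the dependent cleanup jobs can be enabled.
druid.coordinator.kill.rule.on=true
druid.coordinator.kill.compaction.on=true
```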

For details on configuration properties, see [Metadata management](../configuration/index.md#metadata-management).
-If you want to skip the details, check out the [example](#example) for configuring automated metadata cleanup.
+If you want to skip the details, check out the [example](#example-configuration-for-automated-metadata-cleanup) for configuring automated metadata cleanup.

-<a name="kill-task"></a>
### Segment records and segments in deep storage (kill task)

:::info
@@ -110,7 +109,7 @@ Supervisor cleanup uses the following configuration:

### Rules records

-Rule records become eligible for deletion when all segments for the datasource have been killed by the kill task and the `durationToRetain` time has passed since their creation. Automated cleanup for rules requires a [kill task](#kill-task).
+Rule records become eligible for deletion when all segments for the datasource have been killed by the kill task and the `durationToRetain` time has passed since their creation. Automated cleanup for rules requires a [kill task](#segment-records-and-segments-in-deep-storage-kill-task).

Rule cleanup uses the following configuration:
- `druid.coordinator.kill.rule.on`: When `true`, enables cleanup for rules records.
@@ -129,7 +128,7 @@ To prevent the configuration from being prematurely removed, wait for the dataso

Unlike other metadata records, compaction configuration records do not have a retention period set by `durationToRetain`. Druid deletes compaction configuration records at every cleanup cycle for inactive datasources, that is, datasources with no segments, either used or unused.

-Compaction configuration records in the `druid_config` table become eligible for deletion after all segments for the datasource have been killed by the kill task. Automated cleanup for compaction configuration requires a [kill task](#kill-task).
+Compaction configuration records in the `druid_config` table become eligible for deletion after all segments for the datasource have been killed by the kill task. Automated cleanup for compaction configuration requires a [kill task](#segment-records-and-segments-in-deep-storage-kill-task).

Compaction configuration cleanup uses the following configuration:
- `druid.coordinator.kill.compaction.on`: When `true`, enables cleanup for compaction configuration records.
@@ -153,7 +152,7 @@ Datasource cleanup uses the following configuration:

You can configure the Overlord to periodically delete indexer task logs and associated metadata. During cleanup, the Overlord removes the following:
* Indexer task logs from deep storage.
-* Indexer task log metadata from the tasks and tasklogs tables in [metadata storage](../configuration/index.md#metadata-storage) (named `druid_tasks` and `druid_tasklogs` by default). Druid no longer uses the tasklogs table, and the table is always empty.
+* Indexer task log metadata from the tasks table in [metadata storage](../configuration/index.md#metadata-storage) (named `druid_tasks` by default).

To configure cleanup of task logs by the Overlord, set the following properties in the `overlord/runtime.properties` file.
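For instance, a hedged sketch of such an `overlord/runtime.properties` fragment follows. The `druid.indexer.logs.kill.*` property names follow the Druid configuration reference; the millisecond values are placeholder assumptions:

```properties
# Enable periodic cleanup of task logs and their metadata.
druid.indexer.logs.kill.enabled=true
# Retain task logs for 30 days (value in milliseconds).
druid.indexer.logs.kill.durationToRetain=2592000000
# First check 5 minutes after startup, then every 6 hours (milliseconds).
druid.indexer.logs.kill.initialDelay=300000
druid.indexer.logs.kill.delay=21600000
```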

@@ -188,7 +187,6 @@ druid.coordinator.kill.rule.on=false
druid.coordinator.kill.datasource.on=false
```

-<a name="example"></a>
## Example configuration for automated metadata cleanup

Consider a scenario where you have scripts to create and delete hundreds of datasources and related entities a day. You do not want to fill your metadata store with leftover records. The datasources and related entities tend to persist for only one or two days. Therefore, you want to run a cleanup job that identifies and removes leftover records that are at least four days old, after a seven-day buffer period in case you want to recover the data. The exception is audit logs, which you need to retain for 30 days:
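A sketch of what such a configuration might look like, assuming the property names documented on this page and in the configuration reference. The exact durations are illustrative, and the collapsed example in the full document may differ:

```properties
druid.coordinator.period.metadataStoreManagementPeriod=PT1H

# Segment records: keep unused segments for a 7-day buffer before killing them.
druid.coordinator.kill.on=true
druid.coordinator.kill.period=P1D
druid.coordinator.kill.durationToRetain=P7D

# Rules and datasource records: clean up records at least 4 days old.
druid.coordinator.kill.rule.on=true
druid.coordinator.kill.rule.period=P1D
druid.coordinator.kill.rule.durationToRetain=P4D
druid.coordinator.kill.datasource.on=true
druid.coordinator.kill.datasource.period=P1D
druid.coordinator.kill.datasource.durationToRetain=P4D

# Audit logs: retain for 30 days.
druid.coordinator.kill.audit.on=true
druid.coordinator.kill.audit.period=P1D
druid.coordinator.kill.audit.durationToRetain=P30D
```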
@@ -121,12 +121,6 @@ public Boolean perform(Task task, TaskActionToolbox toolbox)
);
}

-  @Override
-  public boolean isAudited()
-  {
-    return true;
-  }

@Override
public String toString()
{
@@ -22,8 +22,6 @@
import com.fasterxml.jackson.databind.ObjectMapper;
import org.apache.druid.indexing.common.task.IndexTaskUtils;
import org.apache.druid.indexing.common.task.Task;
-import org.apache.druid.indexing.overlord.TaskStorage;
-import org.apache.druid.java.util.common.ISE;
import org.apache.druid.java.util.common.jackson.JacksonUtils;
import org.apache.druid.java.util.emitter.EmittingLogger;
import org.apache.druid.java.util.emitter.service.ServiceMetricEvent;
@@ -37,45 +35,21 @@ public class LocalTaskActionClient implements TaskActionClient
private static final EmittingLogger log = new EmittingLogger(LocalTaskActionClient.class);

private final Task task;
-  private final TaskStorage storage;
private final TaskActionToolbox toolbox;
-  private final TaskAuditLogConfig auditLogConfig;

public LocalTaskActionClient(
Task task,
-      TaskStorage storage,
-      TaskActionToolbox toolbox,
-      TaskAuditLogConfig auditLogConfig
+      TaskActionToolbox toolbox
)
{
this.task = task;
-    this.storage = storage;
this.toolbox = toolbox;
-    this.auditLogConfig = auditLogConfig;
}

@Override
public <RetType> RetType submit(TaskAction<RetType> taskAction)
{
log.debug("Performing action for task[%s]: %s", task.getId(), taskAction);

-    if (auditLogConfig.isEnabled() && taskAction.isAudited()) {
-      // Add audit log
-      try {
-        final long auditLogStartTime = System.currentTimeMillis();
-        storage.addAuditLog(task, taskAction);
-        emitTimerMetric("task/action/log/time", taskAction, System.currentTimeMillis() - auditLogStartTime);
-      }
-      catch (Exception e) {
-        final String actionClass = taskAction.getClass().getName();
-        log.makeAlert(e, "Failed to record action in audit log")
-           .addData("task", task.getId())
-           .addData("actionClass", actionClass)
-           .emit();
-        throw new ISE(e, "Failed to record action [%s] in audit log", actionClass);
-      }
-    }

final long performStartTime = System.currentTimeMillis();
final RetType result = performAction(taskAction);
emitTimerMetric("task/action/run/time", taskAction, System.currentTimeMillis() - performStartTime);
@@ -21,27 +21,22 @@

import com.google.inject.Inject;
import org.apache.druid.indexing.common.task.Task;
-import org.apache.druid.indexing.overlord.TaskStorage;

/**
*/
public class LocalTaskActionClientFactory implements TaskActionClientFactory
{
-  private final TaskStorage storage;
private final TaskActionToolbox toolbox;
-  private final TaskAuditLogConfig auditLogConfig;

@Inject
-  public LocalTaskActionClientFactory(TaskStorage storage, TaskActionToolbox toolbox, TaskAuditLogConfig auditLogConfig)
+  public LocalTaskActionClientFactory(TaskActionToolbox toolbox)
{
-    this.storage = storage;
this.toolbox = toolbox;
-    this.auditLogConfig = auditLogConfig;
}

@Override
public TaskActionClient create(Task task)
{
-    return new LocalTaskActionClient(task, storage, toolbox, auditLogConfig);
+    return new LocalTaskActionClient(task, toolbox);
}
}
@@ -39,12 +39,6 @@ public List<TaskLock> perform(Task task, TaskActionToolbox toolbox)
return toolbox.getTaskLockbox().findLocksForTask(task);
}

-  @Override
-  public boolean isAudited()
-  {
-    return false;
-  }

@Override
public String toString()
{
@@ -56,12 +56,6 @@ public Void perform(Task task, TaskActionToolbox toolbox)
return null;
}

-  @Override
-  public boolean isAudited()
-  {
-    return false;
-  }

@Override
public String toString()
{
@@ -67,9 +67,4 @@ public Integer perform(Task task, TaskActionToolbox toolbox)
.markSegmentsAsUnusedWithinInterval(dataSource, interval);
}

-  @Override
-  public boolean isAudited()
-  {
-    return true;
-  }
}
@@ -64,12 +64,6 @@ public Boolean perform(Task task, TaskActionToolbox toolbox)
return toolbox.getSupervisorManager().resetSupervisor(dataSource, resetMetadata);
}

-  @Override
-  public boolean isAudited()
-  {
-    return true;
-  }

@Override
public String toString()
{
@@ -74,12 +74,6 @@ public Set<DataSegment> perform(Task task, TaskActionToolbox toolbox)
.retrieveSegmentsById(dataSource, segmentIds);
}

-  @Override
-  public boolean isAudited()
-  {
-    return false;
-  }

@Override
public boolean equals(Object o)
{
@@ -101,12 +101,6 @@ public List<DataSegment> perform(Task task, TaskActionToolbox toolbox)
.retrieveUnusedSegmentsForInterval(dataSource, interval, versions, limit, maxUsedStatusLastUpdatedTime);
}

-  @Override
-  public boolean isAudited()
-  {
-    return false;
-  }

@Override
public String toString()
{
@@ -186,12 +186,6 @@ private Set<DataSegment> retrieveUsedSegments(TaskActionToolbox toolbox)
.retrieveUsedSegmentsForIntervals(dataSource, intervals, visibility);
}

-  @Override
-  public boolean isAudited()
-  {
-    return false;
-  }

@Override
public boolean equals(Object o)
{
@@ -386,12 +386,6 @@ private SegmentIdWithShardSpec tryAllocate(
}
}

-  @Override
-  public boolean isAudited()
-  {
-    return false;
-  }

@Override
public String toString()
{

This file was deleted.
