Remove task action audit logging and druid_taskLog metadata table (apache#16309)

Description:
Task action audit logging was first deprecated and disabled by default in Druid 0.13, apache#6368.

As called out in the original discussion apache#5859, there are several drawbacks to persisting task action audit logs:
- The only usage of the task audit logs is to serve the API `/indexer/v1/task/{taskId}/segments`,
which returns the list of segments created by a task.
- This use case is very narrow, and no production clusters really use this information.
- There are better ways of obtaining this information, such as the metric
`segment/added/bytes`, which reports both the segment ID and task ID
when a segment is committed by a task. Committed segment IDs could also be included in task reports.
- A task that persists several segments bloats the audit logs table, putting unnecessary strain
on the metadata store.
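The metric-based alternative mentioned above can be sketched as follows: a hypothetical consumer that groups committed segment IDs by task ID from a stream of emitted metric events. The `MetricEvent` record shape here is illustrative, not Druid's actual emitter payload.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SegmentMetricGrouper
{
  // Illustrative event shape: Druid's real metric events carry these values
  // as dimensions on the emitted event, not as a flat record like this.
  public record MetricEvent(String metric, String taskId, String segmentId) {}

  /** Collects segment IDs per task from "segment/added/bytes" events only. */
  public static Map<String, List<String>> segmentsByTask(List<MetricEvent> events)
  {
    final Map<String, List<String>> result = new LinkedHashMap<>();
    for (MetricEvent event : events) {
      if ("segment/added/bytes".equals(event.metric())) {
        result.computeIfAbsent(event.taskId(), id -> new ArrayList<>())
              .add(event.segmentId());
      }
    }
    return result;
  }

  public static void main(String[] args)
  {
    final List<MetricEvent> events = List.of(
        new MetricEvent("segment/added/bytes", "index_wiki_2024", "wiki_2024-01-01_v1_0"),
        new MetricEvent("segment/added/bytes", "index_wiki_2024", "wiki_2024-01-02_v1_0"),
        new MetricEvent("query/time", "index_wiki_2024", null)
    );
    // Only the two segment/added/bytes events are grouped; query/time is ignored.
    System.out.println(segmentsByTask(events));
  }
}
```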

Changes:
- Remove `TaskAuditLogConfig`.
- Remove method `TaskAction.isAudited()`. No task action is audited anymore.
- Remove `SegmentInsertAction` as it is not used anymore. `SegmentTransactionalInsertAction`
is its newer incarnation, which has been in use for a while.
- Deprecate `MetadataStorageActionHandler.addLog()` and `getLogs()`. These are not used anymore
but must be retained for backward compatibility of extensions.
- No longer create the `druid_taskLog` metadata table.
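The deprecation strategy for `addLog()` and `getLogs()` follows a common backward-compatibility pattern: keep the methods on the interface as deprecated no-ops so extensions compiled against older versions still link. The sketch below is illustrative; the names and generics are simplified and are not the exact Druid signatures.

```java
import java.util.Collections;
import java.util.List;

// Hedged sketch of the pattern, not the actual Druid interface.
interface MetadataStorageActionHandlerSketch<LogType>
{
  /** Still actively used by implementations. */
  void insert(String entryId, Object entry);

  /** @deprecated Task action audit logs are no longer written. */
  @Deprecated
  default void addLog(String entryId, LogType log)
  {
    // no-op: retained only so older extension code keeps compiling and linking
  }

  /** @deprecated Always empty now that the druid_taskLog table is gone. */
  @Deprecated
  default List<LogType> getLogs(String entryId)
  {
    return Collections.emptyList();
  }
}
```

Because the deprecated methods have default bodies, an implementation only needs to supply the still-used method.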
kfaraz authored and edgar2020 committed Jul 19, 2024
1 parent d1bf956 commit f9742bf
Showing 52 changed files with 386 additions and 1,257 deletions.
17 changes: 8 additions & 9 deletions docs/api-reference/tasks-api.md
@@ -914,13 +914,10 @@ Host: http://ROUTER_IP:ROUTER_PORT
 ### Get task segments
 
 :::info
-This API is deprecated and will be removed in future releases.
+This API is not supported anymore and always returns a 404 response.
+Use the metric `segment/added/bytes` instead to identify the segment IDs committed by a task.
 :::
 
-Retrieves information about segments generated by the task given the task ID. To hit this endpoint, make sure to enable the audit log config on the Overlord with `druid.indexer.auditLog.enabled = true`.
-
-In addition to enabling audit logs, configure a cleanup strategy to prevent overloading the metadata store with old audit logs which may cause performance issues. To enable automated cleanup of audit logs on the Coordinator, set `druid.coordinator.kill.audit.on`. You may also manually export the audit logs to external storage. For more information, see [Audit records](../operations/clean-metadata-store.md#audit-records).
-
 #### URL
 
 `GET` `/druid/indexer/v1/task/{taskId}/segments`
@@ -929,12 +926,14 @@ In addition to enabling audit logs, configure a cleanup strategy to prevent over
 
 <Tabs>
 
-<TabItem value="27" label="200 SUCCESS">
+<TabItem value="27" label="404 NOT FOUND">
 
 
 <br/>
 
-*Successfully retrieved task segments*
+```json
+{
+  "error": "Segment IDs committed by a task action are not persisted anymore. Use the metric 'segment/added/bytes' to identify the segments created by a task."
+}
+```
 
 </TabItem>
 </Tabs>
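For API consumers, the change to this endpoint means a 404 is now the expected response rather than a failure. The sketch below shows one way a client might handle that; the router address and task ID are placeholders, and `interpret` is a helper invented here for illustration.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class TaskSegmentsCheck
{
  /** Maps the response status to a client-side action per the new contract. */
  static String interpret(int statusCode)
  {
    if (statusCode == 404) {
      return "endpoint removed: use the segment/added/bytes metric instead";
    }
    return "unexpected status " + statusCode;
  }

  public static void main(String[] args) throws Exception
  {
    // Only issue the request when a router URL is supplied, e.g.
    //   java TaskSegmentsCheck http://ROUTER_IP:8888
    if (args.length == 0) {
      System.out.println(interpret(404));
      return;
    }
    HttpClient client = HttpClient.newHttpClient();
    HttpRequest request = HttpRequest.newBuilder()
        .uri(URI.create(args[0] + "/druid/indexer/v1/task/index_wiki_2024/segments"))
        .GET()
        .build();
    HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
    System.out.println(interpret(response.statusCode()));
  }
}
```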
14 changes: 6 additions & 8 deletions docs/operations/clean-metadata-store.md
@@ -44,7 +44,7 @@ This applies to all metadata entities in this topic except compaction configurat
 You can configure the retention period for each metadata type, when available, through the record's `durationToRetain` property.
 Certain records may require additional conditions be satisfied before clean up occurs.
 
-See the [example](#example) for how you can customize the automated metadata cleanup for a specific use case.
+See the [example](#example-configuration-for-automated-metadata-cleanup) for how you can customize the automated metadata cleanup for a specific use case.
 
 
 ## Automated cleanup strategies
@@ -62,13 +62,12 @@ You can configure cleanup for each entity separately, as described in this secti
 Define the properties in the `coordinator/runtime.properties` file.
 
 The cleanup of one entity may depend on the cleanup of another entity as follows:
-- You have to configure a [kill task for segment records](#kill-task) before you can configure automated cleanup for [rules](#rules-records) or [compaction configuration](#compaction-configuration-records).
+- You have to configure a [kill task for segment records](#segment-records-and-segments-in-deep-storage-kill-task) before you can configure automated cleanup for [rules](#rules-records) or [compaction configuration](#compaction-configuration-records).
 - You have to schedule the metadata management tasks to run at the same or higher frequency as your most frequent cleanup job. For example, if your most frequent cleanup job is every hour, set the metadata store management period to one hour or less: `druid.coordinator.period.metadataStoreManagementPeriod=P1H`.
 
 For details on configuration properties, see [Metadata management](../configuration/index.md#metadata-management).
-If you want to skip the details, check out the [example](#example) for configuring automated metadata cleanup.
+If you want to skip the details, check out the [example](#example-configuration-for-automated-metadata-cleanup) for configuring automated metadata cleanup.
 
-<a name="kill-task"></a>
 ### Segment records and segments in deep storage (kill task)
 
 :::info
@@ -110,7 +109,7 @@ Supervisor cleanup uses the following configuration:
 
 ### Rules records
 
-Rule records become eligible for deletion when all segments for the datasource have been killed by the kill task and the `durationToRetain` time has passed since their creation. Automated cleanup for rules requires a [kill task](#kill-task).
+Rule records become eligible for deletion when all segments for the datasource have been killed by the kill task and the `durationToRetain` time has passed since their creation. Automated cleanup for rules requires a [kill task](#segment-records-and-segments-in-deep-storage-kill-task).
 
 Rule cleanup uses the following configuration:
 - `druid.coordinator.kill.rule.on`: When `true`, enables cleanup for rules records.
@@ -129,7 +128,7 @@ To prevent the configuration from being prematurely removed, wait for the dataso
 
 Unlike other metadata records, compaction configuration records do not have a retention period set by `durationToRetain`. Druid deletes compaction configuration records at every cleanup cycle for inactive datasources, which do not have segments either used or unused.
 
-Compaction configuration records in the `druid_config` table become eligible for deletion after all segments for the datasource have been killed by the kill task. Automated cleanup for compaction configuration requires a [kill task](#kill-task).
+Compaction configuration records in the `druid_config` table become eligible for deletion after all segments for the datasource have been killed by the kill task. Automated cleanup for compaction configuration requires a [kill task](#segment-records-and-segments-in-deep-storage-kill-task).
 
 Compaction configuration cleanup uses the following configuration:
 - `druid.coordinator.kill.compaction.on`: When `true`, enables cleanup for compaction configuration records.
@@ -153,7 +152,7 @@ Datasource cleanup uses the following configuration:
 
 You can configure the Overlord to periodically delete indexer task logs and associated metadata. During cleanup, the Overlord removes the following:
 * Indexer task logs from deep storage.
-* Indexer task log metadata from the tasks and tasklogs tables in [metadata storage](../configuration/index.md#metadata-storage) (named `druid_tasks` and `druid_tasklogs` by default). Druid no longer uses the tasklogs table, and the table is always empty.
+* Indexer task log metadata from the tasks table in [metadata storage](../configuration/index.md#metadata-storage) (named `druid_tasks` by default).
 
 To configure cleanup of task logs by the Overlord, set the following properties in the `overlord/runtime.properties` file.
 
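A minimal sketch of the Overlord task-log cleanup settings referenced above, for `overlord/runtime.properties`. The values are illustrative; confirm the property names and units against the configuration reference for your Druid version.

```properties
# Enable periodic cleanup of task logs and their metadata on the Overlord
druid.indexer.logs.kill.enabled=true
# Retention before logs become eligible for deletion, in milliseconds (30 days here)
druid.indexer.logs.kill.durationToRetain=2592000000
# Delay between cleanup runs, in milliseconds (6 hours here)
druid.indexer.logs.kill.delay=21600000
# Delay before the first cleanup run after startup, in milliseconds
druid.indexer.logs.kill.initialDelay=300000
```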
@@ -188,7 +187,6 @@ druid.coordinator.kill.rule.on=false
 druid.coordinator.kill.datasource.on=false
 ```
 
-<a name="example"></a>
 ## Example configuration for automated metadata cleanup
 
 Consider a scenario where you have scripts to create and delete hundreds of datasources and related entities a day. You do not want to fill your metadata store with leftover records. The datasources and related entities tend to persist for only one or two days. Therefore, you want to run a cleanup job that identifies and removes leftover records that are at least four days old after a seven day buffer period in case you want to recover the data. The exception is for audit logs, which you need to retain for 30 days:
@@ -121,12 +121,6 @@ public Boolean perform(Task task, TaskActionToolbox toolbox)
     );
   }
 
-  @Override
-  public boolean isAudited()
-  {
-    return true;
-  }
-
   @Override
   public String toString()
   {
@@ -22,8 +22,6 @@
 import com.fasterxml.jackson.databind.ObjectMapper;
 import org.apache.druid.indexing.common.task.IndexTaskUtils;
 import org.apache.druid.indexing.common.task.Task;
-import org.apache.druid.indexing.overlord.TaskStorage;
-import org.apache.druid.java.util.common.ISE;
 import org.apache.druid.java.util.common.jackson.JacksonUtils;
 import org.apache.druid.java.util.emitter.EmittingLogger;
 import org.apache.druid.java.util.emitter.service.ServiceMetricEvent;
@@ -37,45 +35,21 @@ public class LocalTaskActionClient implements TaskActionClient
   private static final EmittingLogger log = new EmittingLogger(LocalTaskActionClient.class);
 
   private final Task task;
-  private final TaskStorage storage;
   private final TaskActionToolbox toolbox;
-  private final TaskAuditLogConfig auditLogConfig;
 
   public LocalTaskActionClient(
       Task task,
-      TaskStorage storage,
-      TaskActionToolbox toolbox,
-      TaskAuditLogConfig auditLogConfig
+      TaskActionToolbox toolbox
   )
   {
     this.task = task;
-    this.storage = storage;
     this.toolbox = toolbox;
-    this.auditLogConfig = auditLogConfig;
   }
 
   @Override
   public <RetType> RetType submit(TaskAction<RetType> taskAction)
   {
     log.debug("Performing action for task[%s]: %s", task.getId(), taskAction);
 
-    if (auditLogConfig.isEnabled() && taskAction.isAudited()) {
-      // Add audit log
-      try {
-        final long auditLogStartTime = System.currentTimeMillis();
-        storage.addAuditLog(task, taskAction);
-        emitTimerMetric("task/action/log/time", taskAction, System.currentTimeMillis() - auditLogStartTime);
-      }
-      catch (Exception e) {
-        final String actionClass = taskAction.getClass().getName();
-        log.makeAlert(e, "Failed to record action in audit log")
-            .addData("task", task.getId())
-            .addData("actionClass", actionClass)
-            .emit();
-        throw new ISE(e, "Failed to record action [%s] in audit log", actionClass);
-      }
-    }
-
     final long performStartTime = System.currentTimeMillis();
     final RetType result = performAction(taskAction);
     emitTimerMetric("task/action/run/time", taskAction, System.currentTimeMillis() - performStartTime);
@@ -21,27 +21,22 @@
 
 import com.google.inject.Inject;
 import org.apache.druid.indexing.common.task.Task;
-import org.apache.druid.indexing.overlord.TaskStorage;
 
 /**
  */
 public class LocalTaskActionClientFactory implements TaskActionClientFactory
 {
-  private final TaskStorage storage;
   private final TaskActionToolbox toolbox;
-  private final TaskAuditLogConfig auditLogConfig;
 
   @Inject
-  public LocalTaskActionClientFactory(TaskStorage storage, TaskActionToolbox toolbox, TaskAuditLogConfig auditLogConfig)
+  public LocalTaskActionClientFactory(TaskActionToolbox toolbox)
   {
-    this.storage = storage;
     this.toolbox = toolbox;
-    this.auditLogConfig = auditLogConfig;
   }
 
   @Override
   public TaskActionClient create(Task task)
   {
-    return new LocalTaskActionClient(task, storage, toolbox, auditLogConfig);
+    return new LocalTaskActionClient(task, toolbox);
   }
 }
@@ -39,12 +39,6 @@ public List<TaskLock> perform(Task task, TaskActionToolbox toolbox)
     return toolbox.getTaskLockbox().findLocksForTask(task);
   }
 
-  @Override
-  public boolean isAudited()
-  {
-    return false;
-  }
-
   @Override
   public String toString()
   {
@@ -56,12 +56,6 @@ public Void perform(Task task, TaskActionToolbox toolbox)
     return null;
   }
 
-  @Override
-  public boolean isAudited()
-  {
-    return false;
-  }
-
   @Override
   public String toString()
   {
@@ -67,9 +67,4 @@ public Integer perform(Task task, TaskActionToolbox toolbox)
         .markSegmentsAsUnusedWithinInterval(dataSource, interval);
   }
 
-  @Override
-  public boolean isAudited()
-  {
-    return true;
-  }
 }
@@ -64,12 +64,6 @@ public Boolean perform(Task task, TaskActionToolbox toolbox)
     return toolbox.getSupervisorManager().resetSupervisor(dataSource, resetMetadata);
   }
 
-  @Override
-  public boolean isAudited()
-  {
-    return true;
-  }
-
   @Override
   public String toString()
   {
@@ -74,12 +74,6 @@ public Set<DataSegment> perform(Task task, TaskActionToolbox toolbox)
         .retrieveSegmentsById(dataSource, segmentIds);
   }
 
-  @Override
-  public boolean isAudited()
-  {
-    return false;
-  }
-
   @Override
   public boolean equals(Object o)
   {
@@ -101,12 +101,6 @@ public List<DataSegment> perform(Task task, TaskActionToolbox toolbox)
         .retrieveUnusedSegmentsForInterval(dataSource, interval, versions, limit, maxUsedStatusLastUpdatedTime);
   }
 
-  @Override
-  public boolean isAudited()
-  {
-    return false;
-  }
-
   @Override
   public String toString()
   {
@@ -186,12 +186,6 @@ private Set<DataSegment> retrieveUsedSegments(TaskActionToolbox toolbox)
         .retrieveUsedSegmentsForIntervals(dataSource, intervals, visibility);
   }
 
-  @Override
-  public boolean isAudited()
-  {
-    return false;
-  }
-
   @Override
   public boolean equals(Object o)
   {
@@ -386,12 +386,6 @@ private SegmentIdWithShardSpec tryAllocate(
     }
   }
 
-  @Override
-  public boolean isAudited()
-  {
-    return false;
-  }
-
   @Override
   public String toString()
   {

This file was deleted.

