Get instance id for desired control-queue(s) #1069

Open
wants to merge 7 commits into
base: main
93 changes: 93 additions & 0 deletions Test/DurableTask.AzureStorage.Tests/TestPartitionIndex.cs
@@ -0,0 +1,93 @@
// ----------------------------------------------------------------------------------
// Copyright Microsoft Corporation
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
// http://www.apache.org/licenses/LICENSE-2.0
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
// ----------------------------------------------------------------------------------

namespace DurableTask.AzureStorage.Tests
{
using System;
using System.Collections.Generic;
using System.Data;
using System.Threading;
using Microsoft.VisualStudio.TestTools.UnitTesting;

[TestClass]
public class TestPartitionIndex
{
private AzureStorageOrchestrationService azureStorageOrchestrationService;
private AzureStorageOrchestrationServiceSettings settings;
private int partitionCount = 4;
private Dictionary<string, int> controlQueueNumberToNameMap;
private CancellationTokenSource cancellationTokenSource;
private const string TaskHub = "taskHubName";

[TestInitialize]
public void Initialize()
{
cancellationTokenSource = new CancellationTokenSource();

settings = new AzureStorageOrchestrationServiceSettings()
{
StorageConnectionString = TestHelpers.GetTestStorageAccountConnectionString(),
TaskHubName = TaskHub,
PartitionCount = partitionCount
};

azureStorageOrchestrationService = new AzureStorageOrchestrationService(settings);

controlQueueNumberToNameMap = new Dictionary<string, int>();

for (int i = 0; i < partitionCount; i++)
{
var controlQueueName = AzureStorageOrchestrationService.GetControlQueueName(settings.TaskHubName, i);
controlQueueNumberToNameMap[controlQueueName] = i;
}
}
Collaborator

Are we still using this in the new tests? No, right?


[TestMethod]
[DataRow(20, false)]
[DataRow(20, true)]
public void GetPartitionIndexTest(int maxInstanceIdCount, bool enableExplicitPartitionPlacement)
{
settings.EnableExplicitPartitionPlacement = enableExplicitPartitionPlacement;

for (uint instanceIdSuffix = 0; instanceIdSuffix < settings.PartitionCount * 4; instanceIdSuffix++)
{
Dictionary<uint, int> indexNumberToCount = new Dictionary<uint, int>();

for (uint indexCount = 0; indexCount < settings.PartitionCount; indexCount++)
{
indexNumberToCount[indexCount] = 0;
}

for (int instanceCount = 0; instanceCount < maxInstanceIdCount; instanceCount++)
{
var instanceIdPrefix = Guid.NewGuid().ToString();

var instanceId = $"{instanceIdPrefix}!{instanceIdSuffix}";

var partitionIndex = azureStorageOrchestrationService.GetPartitionIndex(instanceId);

indexNumberToCount[partitionIndex]++;
}

if (enableExplicitPartitionPlacement)
{
Assert.AreEqual(maxInstanceIdCount, indexNumberToCount[(uint)(instanceIdSuffix % settings.PartitionCount)]);
}
else
{
Assert.AreNotEqual(maxInstanceIdCount, indexNumberToCount[(uint)(instanceIdSuffix % settings.PartitionCount)]);
}
}
}
}
}
26 changes: 24 additions & 2 deletions src/DurableTask.AzureStorage/AzureStorageOrchestrationService.cs
@@ -2048,9 +2048,9 @@ public Task<string> DownloadBlobAsync(string blobUri)

// TODO: Change this to a sticky assignment so that partition count changes can
// be supported: https://github.com/Azure/azure-functions-durable-extension/issues/1
- async Task<ControlQueue?> GetControlQueueAsync(string instanceId)
+ internal async Task<ControlQueue?> GetControlQueueAsync(string instanceId)
{
- uint partitionIndex = Fnv1aHashHelper.ComputeHash(instanceId) % (uint)this.settings.PartitionCount;
+ uint partitionIndex = GetPartitionIndex(instanceId);
string queueName = GetControlQueueName(this.settings.TaskHubName, (int)partitionIndex);

ControlQueue cachedQueue;
@@ -2075,6 +2075,28 @@ public Task<string> DownloadBlobAsync(string blobUri)
return cachedQueue;
}

internal uint GetPartitionIndex(string instanceId)
{
uint totalPartitions = (uint)this.settings.PartitionCount;

int placementSeparatorPosition = instanceId.LastIndexOf('!');

// if the instance id ends with !nnn, where nnn is an unsigned number, it indicates explicit partition placement
Collaborator

Can we add a test that documents the behavior when the customer uses an instance ID with multiple `!` characters? Say, instance ID `A!1!B!3` should probably map to partition 3, right?

Collaborator

Let's also add a test that checks that the instance ID `myinstanceID!NotANumber` does not trigger any errors and that it correctly ignores the explicit placement logic.
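
The two cases raised in the comments above can be checked against a standalone sketch of the placement rule. This is not the PR's code: `PlacementSketch` and `Fnv1a` are hypothetical names, and the hash is a plain 32-bit FNV-1a over UTF-8 bytes standing in for `Fnv1aHashHelper.ComputeHash`, whose exact output is assumed here rather than verified. Under those assumptions, `LastIndexOf('!')` picks the final separator, so `A!1!B!3` resolves to suffix `3`, while `myinstanceID!NotANumber` fails `uint.TryParse` and falls back to hashing without throwing.

```csharp
using System;
using System.Text;

// Hypothetical standalone mirror of the GetPartitionIndex logic in this PR.
public static class PlacementSketch
{
    // Stand-in for Fnv1aHashHelper.ComputeHash
    // (assumption: 32-bit FNV-1a over the UTF-8 bytes of the string).
    public static uint Fnv1a(string value)
    {
        uint hash = 2166136261;
        foreach (byte b in Encoding.UTF8.GetBytes(value))
        {
            unchecked
            {
                hash ^= b;
                hash *= 16777619;
            }
        }
        return hash;
    }

    public static uint GetPartitionIndex(string instanceId, uint totalPartitions, bool explicitPlacement)
    {
        // Only the text after the LAST '!' is considered.
        int separator = instanceId.LastIndexOf('!');

        if (explicitPlacement
            && separator != -1
            && uint.TryParse(instanceId.Substring(separator + 1), out uint index))
        {
            // Explicit placement: wrap the requested index into range.
            return index % totalPartitions;
        }

        // No separator, non-numeric suffix, or feature disabled: hash the whole ID.
        return Fnv1a(instanceId) % totalPartitions;
    }
}
```

With four partitions and explicit placement enabled, `PlacementSketch.GetPartitionIndex("A!1!B!3", 4, true)` returns 3, and the call with `"myinstanceID!NotANumber"` returns the same value as hashing the full ID.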

if (
this.settings.EnableExplicitPartitionPlacement
&& placementSeparatorPosition != -1
&& uint.TryParse(instanceId.Substring(placementSeparatorPosition + 1), out uint index))
{
var partitionId = index % totalPartitions;
return (uint)partitionId;
Comment on lines +2090 to +2091
Collaborator

So this allows for the following scenario:
Assume we have 4 partitions and the user creates instance ID "abc!7". The instance will still land on a queue (index 7 % 4 = 3), but it won't be the 7th queue, because that doesn't exist.

From first principles, I would think we'd want to error out in this case. But it seems this behavior is consistent with Netherite. I would prefer to throw in this case, but curious to know what others think.

Collaborator

@pasaini-microsoft - now that I think about it, it may be a good idea to keep the behavior you implemented. If in the future, we make the partitionCount something that users can change 'on the fly' (not possible today, but I'm working to make this happen), then this behavior would be resilient to changes in the number of partitions.

Let's keep this behavior for now but let's also try to emit a warning for when the total number of partitions is less than the customer's specified target number. That will help notify the customer that something possibly unintuitive is taking place. Thanks
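
To make the resilience argument concrete, here is a minimal sketch of what the explicit branch does to the same suffix as the partition count changes. `WrapDemo` and its members are hypothetical names used only for illustration; the one thing taken from the PR is the `index % totalPartitions` reduction.

```csharp
using System;

// Hypothetical illustration of the modulo wrap in the explicit-placement branch.
public static class WrapDemo
{
    // Same reduction as the PR: requested index modulo partition count.
    public static uint Wrap(uint requestedIndex, uint totalPartitions) =>
        requestedIndex % totalPartitions;

    // Describes where a requested index lands for each candidate partition count.
    public static string[] Describe(uint requestedIndex, uint[] partitionCounts)
    {
        var lines = new string[partitionCounts.Length];
        for (int i = 0; i < partitionCounts.Length; i++)
        {
            uint landed = Wrap(requestedIndex, partitionCounts[i]);
            lines[i] = $"suffix {requestedIndex}, {partitionCounts[i]} partitions -> index {landed}";
        }
        return lines;
    }
}
```

For suffix 7, `WrapDemo.Describe(7, new uint[] { 2, 4, 8, 16 })` reports indexes 1, 3, 7 and 7: the ID stays valid at every partition count, although the queue it maps to can differ between counts.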

Collaborator

I am actually in favor of not generating a warning. The reason is that the warning would actually fire almost constantly in all the applications where I have used this.

The expected use is that applications want to distribute things over the queues but don't actually know or care how many queues there are (much like partition keys in Azure Storage).
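
The usage pattern described above might look roughly like the following sketch. `InstanceIdPlacement`, `WithPlacementKey`, and `ExplicitIndex` are hypothetical helpers, not part of DurableTask; `ExplicitIndex` mirrors only the explicit branch of `GetPartitionIndex` and assumes the ID already carries a numeric suffix. The application picks any stable number (here an assumed tenant number) and never looks at the partition count.

```csharp
using System;

// Hypothetical helpers; not part of DurableTask.
public static class InstanceIdPlacement
{
    // Appends "!<key>" so that, with EnableExplicitPartitionPlacement set,
    // every instance built from the same key maps to the same partition.
    public static string WithPlacementKey(string baseId, uint key) => $"{baseId}!{key}";

    // Mirrors only the explicit branch of GetPartitionIndex, for illustration.
    // Assumes instanceId ends in "!<unsigned number>"; uint.Parse throws otherwise.
    public static uint ExplicitIndex(string instanceId, uint totalPartitions)
    {
        string suffix = instanceId.Substring(instanceId.LastIndexOf('!') + 1);
        return uint.Parse(suffix) % totalPartitions;
    }
}
```

Two instances created as `WithPlacementKey("order-1", 42)` and `WithPlacementKey("order-2", 42)` both resolve to index 2 with 4 partitions and to index 10 with 16, so they stay co-located whatever the count is.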

Collaborator

@jviau jviau Jul 10, 2024

I agree with @sebastianburckhardt - I feel this will be more noisy than helpful. I would only consider it if this led to undefined behavior. But it doesn't: it is by design, hence the % operator. We need to make sure this behavior is well documented.

Collaborator

@davidmrdavid davidmrdavid Aug 20, 2024

Ok, let's ignore the warning, I'm convinced. Agreed on the need to document it. We can do that documentation in an azure-docs PR.

}
else
{
return Fnv1aHashHelper.ComputeHash(instanceId) % totalPartitions;

}
}

/// <summary>
/// Disposes of the current object.
/// </summary>
@@ -318,5 +318,11 @@ internal LogHelper Logger
/// Consumers that require separate dispatch (such as the new out-of-proc v2 SDKs) must set this to true.
/// </summary>
public bool UseSeparateQueueForEntityWorkItems { get; set; } = false;

/// <summary>
/// Enables explicit placement of an instance on a partition.
/// If the instance ID ends with !nnn, where nnn is an unsigned number, the instance is placed on partition nnn modulo the partition count.
/// </summary>
pasaini-microsoft marked this conversation as resolved.
public bool EnableExplicitPartitionPlacement { get; set; } = false;
}
}