From 1a6242500171374e2057aeae70ec9eec1870ebb4 Mon Sep 17 00:00:00 2001 From: Kirk Munro Date: Tue, 25 Jun 2019 17:20:59 -0300 Subject: [PATCH 01/10] initial commit --- 1-Draft/RFCNNNN-Concurrent-Collections.md | 82 +++++++++++++++++++++++ 1 file changed, 82 insertions(+) create mode 100644 1-Draft/RFCNNNN-Concurrent-Collections.md diff --git a/1-Draft/RFCNNNN-Concurrent-Collections.md b/1-Draft/RFCNNNN-Concurrent-Collections.md new file mode 100644 index 00000000..f882ba35 --- /dev/null +++ b/1-Draft/RFCNNNN-Concurrent-Collections.md @@ -0,0 +1,82 @@ +--- +RFC: RFCnnnn +Author: Kirk Munro +Status: Draft +SupercededBy: +Version: 0.1 +Area: Parser +Comments Due: July 25, 2019 +Plan to implement: Yes +--- + +# Closures for concurrent collections + +PowerShell is getting more and more support for multi-threaded use, such as: + +* the PSThreadJob module +* `S.M.A.PowerShell.InvokeAsync` and `S.M.A.PowerShell.StopAsync` +* parallelization in `ForEach-Object` (related RFC: #194) + +With these either available now or being made available soon, it would be helpful if scripters could easily create concurrent alternatives to arrays (`@()`) and hashtables (`@{}`). + +## Motivation + + As a scripter, + I can easily define a ConcurrentBag, + so that I can collect unordered objects from multiple threads in thread-safe manner. + + As a scripter, + I can easily define a ConcurrentDictionary, + so that I can store objects relative to keys in a dictionary in thread-safe manner. + +## User Experience + +```powershell +# Define a ConcurrentBag +$collection = ~@() + +# Define a ConcurrentDictionary +$dictionary = ~@{} + +# Retrieve a bunch of logs from remote computers +$computers = 's1','s2','s3' +$computers | Start-ThreadJob { + # Retrieve the data and store it in a thread-safe collection + $collection += Get-LogData -ComputerName $_ + # Retrieve the data and store it in a thread-safe dictionary + $dictionary[$_] = Get-LogData +} + +# Process the collection as an array +$collection.ToArray() + +# Process items in the dictionary +foreach ($key in $dictionary.Keys) { + # Do something with each value + $dictionary[$key] +} +``` + +```output +The data gathered in the $collection collection. + +The data gathered in each key in the $dictionary collection. +``` + +## Specification + +* define `~@()` enclosures as `System.Collections.Concurrent.ConcurrentBag` +* define `~@{}` enclosures as `System.Collections.Concurrent.ConcurrentDictionary` +* ensure that the `+=` operator adds data to a `ConcurrentBag` (right now using that operator with `ConcurrentBag` results in it being converted to an array of objects instead) +* show a warning (suggestion) users if they use `[]` index operators with a `ConcurrentBag` indicating that the `ConcurrentBag` collection is unordered and therefore cannot be used with index operators, and suggesting that users should use the `.ToArray()` method of `ConcurrentBag` once the collection has been populated if they want to convert it to an array and process items using their index. + +## Alternate Proposals and Considerations + +### Breaking changes + +There are technically two breaking changes in this RFC: + +1. Adding items to `ConcurrentBag` with `+=` results in the collection being converted into an array. +1. You can create a command named `~@` in PowerShell, so anyone using that command name will break if we add `~@()` as enclosures. + +I believe both of these are low-risk breaking changes that are worth making so that we can have easier concurrent collection support in PowerShell moving forward. From 3f557dd7117aac1479a77e16d7929ac1079070a3 Mon Sep 17 00:00:00 2001 From: Kirk Munro Date: Tue, 25 Jun 2019 17:40:35 -0300 Subject: [PATCH 02/10] broke paragraphs into lines; added consideration --- 1-Draft/RFCNNNN-Concurrent-Collections.md | 40 +++++++++++++++++++---- 1 file changed, 34 insertions(+), 6 deletions(-) diff --git a/1-Draft/RFCNNNN-Concurrent-Collections.md b/1-Draft/RFCNNNN-Concurrent-Collections.md index f882ba35..e5ee7e24 100644 --- a/1-Draft/RFCNNNN-Concurrent-Collections.md +++ b/1-Draft/RFCNNNN-Concurrent-Collections.md @@ -17,7 +17,9 @@ PowerShell is getting more and more support for multi-threaded use, such as: * `S.M.A.PowerShell.InvokeAsync` and `S.M.A.PowerShell.StopAsync` * parallelization in `ForEach-Object` (related RFC: #194) -With these either available now or being made available soon, it would be helpful if scripters could easily create concurrent alternatives to arrays (`@()`) and hashtables (`@{}`). +With these either available now or being made available soon, it would be +helpful if scripters could easily create concurrent alternatives to arrays +(`@()`) and hashtables (`@{}`). ## Motivation @@ -67,8 +69,15 @@ The data gathered in each key in the $dictionary collection. * define `~@()` enclosures as `System.Collections.Concurrent.ConcurrentBag` * define `~@{}` enclosures as `System.Collections.Concurrent.ConcurrentDictionary` -* ensure that the `+=` operator adds data to a `ConcurrentBag` (right now using that operator with `ConcurrentBag` results in it being converted to an array of objects instead) -* show a warning (suggestion) users if they use `[]` index operators with a `ConcurrentBag` indicating that the `ConcurrentBag` collection is unordered and therefore cannot be used with index operators, and suggesting that users should use the `.ToArray()` method of `ConcurrentBag` once the collection has been populated if they want to convert it to an array and process items using their index. +* ensure that the `+=` operator adds data to a `ConcurrentBag` (right now using +that operator with `ConcurrentBag` results in it being converted to an array of +objects instead) +* show a warning (suggestion) users if they use `[]` index operators with a +`ConcurrentBag` indicating that the `ConcurrentBag` collection is unordered and +therefore cannot be used with index operators, and suggesting that users should +use the `.ToArray()` method of `ConcurrentBag` once the collection has been +populated if they want to convert it to an array and process items using their +index. ## Alternate Proposals and Considerations @@ -76,7 +85,26 @@ The data gathered in each key in the $dictionary collection. There are technically two breaking changes in this RFC: -1. Adding items to `ConcurrentBag` with `+=` results in the collection being converted into an array. -1. You can create a command named `~@` in PowerShell, so anyone using that command name will break if we add `~@()` as enclosures. +1. Adding items to `ConcurrentBag` with `+=` results in the collection being +converted into an array. +1. You can create a command named `~@` in PowerShell, so anyone using that +command name will break if we add `~@()` as enclosures. -I believe both of these are low-risk breaking changes that are worth making so that we can have easier concurrent collection support in PowerShell moving forward. +I believe both of these are low-risk breaking changes that are worth making so +that we can have easier concurrent collection support in PowerShell moving +forward. + +### Alternative syntax + +To simplify the syntax somewhat, we could implement `~()` for `ConcurrentBag` +and `~{}` for `ConcurrentDictionary`. + +On the plus side, these are shorter and therefore easier to type. Also, by +removing the `@` character from `~@()`, it becomes a little less like an array +(which it is not, since arrays are ordered and you cannot index into a +`ConcurrentBag`, which is unordered). + +On the downside, the potential for the second breaking change increases +because someone may have created an alias for `~` that means something to them. +That would only break, however, if they were invoking that command with round +brackets with at least one value inside (e.g. `~('something')`). From 4dd44158edad132fec37f2eea93742f346212645 Mon Sep 17 00:00:00 2001 From: Kirk Munro Date: Tue, 25 Jun 2019 22:50:26 -0300 Subject: [PATCH 03/10] Update RFCNNNN-Concurrent-Collections.md --- 1-Draft/RFCNNNN-Concurrent-Collections.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/1-Draft/RFCNNNN-Concurrent-Collections.md b/1-Draft/RFCNNNN-Concurrent-Collections.md index e5ee7e24..8b8c4288 100644 --- a/1-Draft/RFCNNNN-Concurrent-Collections.md +++ b/1-Draft/RFCNNNN-Concurrent-Collections.md @@ -15,7 +15,7 @@ PowerShell is getting more and more support for multi-threaded use, such as: * the PSThreadJob module * `S.M.A.PowerShell.InvokeAsync` and `S.M.A.PowerShell.StopAsync` -* parallelization in `ForEach-Object` (related RFC: #194) +* parallelization in `ForEach-Object` ([see related RFC 194](https://github.com/PowerShell/PowerShell-RFC/pull/194)) With these either available now or being made available soon, it would be helpful if scripters could easily create concurrent alternatives to arrays From 82ca02129ce34fb1fed8aa74f897d142c5a333b8 Mon Sep 17 00:00:00 2001 From: Kirk Munro Date: Tue, 25 Jun 2019 22:51:12 -0300 Subject: [PATCH 04/10] Update RFCNNNN-Concurrent-Collections.md --- 1-Draft/RFCNNNN-Concurrent-Collections.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/1-Draft/RFCNNNN-Concurrent-Collections.md b/1-Draft/RFCNNNN-Concurrent-Collections.md index 8b8c4288..ffc5faa0 100644 --- a/1-Draft/RFCNNNN-Concurrent-Collections.md +++ b/1-Draft/RFCNNNN-Concurrent-Collections.md @@ -24,11 +24,11 @@ helpful if scripters could easily create concurrent alternatives to arrays ## Motivation As a scripter, - I can easily define a ConcurrentBag, + I can easily define and work with a ConcurrentBag, so that I can collect unordered objects from multiple threads in thread-safe manner. As a scripter, - I can easily define a ConcurrentDictionary, + I can easily define and work with a ConcurrentDictionary, so that I can store objects relative to keys in a dictionary in thread-safe manner. ## User Experience From 5b7cea2802682cb9d832b0229533d0a4dc425541 Mon Sep 17 00:00:00 2001 From: Kirk Munro Date: Wed, 26 Jun 2019 10:27:26 -0300 Subject: [PATCH 05/10] Update RFCNNNN-Concurrent-Collections.md --- 1-Draft/RFCNNNN-Concurrent-Collections.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/1-Draft/RFCNNNN-Concurrent-Collections.md b/1-Draft/RFCNNNN-Concurrent-Collections.md index ffc5faa0..1038b62c 100644 --- a/1-Draft/RFCNNNN-Concurrent-Collections.md +++ b/1-Draft/RFCNNNN-Concurrent-Collections.md @@ -68,7 +68,7 @@ The data gathered in each key in the $dictionary collection. ## Specification * define `~@()` enclosures as `System.Collections.Concurrent.ConcurrentBag` -* define `~@{}` enclosures as `System.Collections.Concurrent.ConcurrentDictionary` +* define `~@{}` enclosures as `System.Collections.Concurrent.ConcurrentDictionary` * ensure that the `+=` operator adds data to a `ConcurrentBag` (right now using that operator with `ConcurrentBag` results in it being converted to an array of objects instead) From 8a998196ac25a9d5dcb35b4d0e3bf9bef8069298 Mon Sep 17 00:00:00 2001 From: Kirk Munro Date: Wed, 26 Jun 2019 10:29:43 -0300 Subject: [PATCH 06/10] Update RFCNNNN-Concurrent-Collections.md --- 1-Draft/RFCNNNN-Concurrent-Collections.md | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/1-Draft/RFCNNNN-Concurrent-Collections.md b/1-Draft/RFCNNNN-Concurrent-Collections.md index 1038b62c..b25f1fd1 100644 --- a/1-Draft/RFCNNNN-Concurrent-Collections.md +++ b/1-Draft/RFCNNNN-Concurrent-Collections.md @@ -86,13 +86,15 @@ index. There are technically two breaking changes in this RFC: 1. Adding items to `ConcurrentBag` with `+=` results in the collection being -converted into an array. +converted into an array in current PowerShell versions. 1. You can create a command named `~@` in PowerShell, so anyone using that command name will break if we add `~@()` as enclosures. I believe both of these are low-risk breaking changes that are worth making so that we can have easier concurrent collection support in PowerShell moving -forward. +forward. Further, the first one could be identified as a bug and is likely not +what you want, and the second one is only breaking if you happen to be invoking +your `~@` command using parentheses with at least one value inside (e.g. `@(1)`). ### Alternative syntax From 890c22814c9e238cc7aade3d4bddbd61e5b31ac3 Mon Sep 17 00:00:00 2001 From: Kirk Munro Date: Wed, 26 Jun 2019 10:31:31 -0300 Subject: [PATCH 07/10] Update RFCNNNN-Concurrent-Collections.md --- 1-Draft/RFCNNNN-Concurrent-Collections.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/1-Draft/RFCNNNN-Concurrent-Collections.md b/1-Draft/RFCNNNN-Concurrent-Collections.md index b25f1fd1..e821ab4c 100644 --- a/1-Draft/RFCNNNN-Concurrent-Collections.md +++ b/1-Draft/RFCNNNN-Concurrent-Collections.md @@ -107,6 +107,7 @@ removing the `@` character from `~@()`, it becomes a little less like an array `ConcurrentBag`, which is unordered). On the downside, the potential for the second breaking change increases -because someone may have created an alias for `~` that means something to them. +because it is more likely that someone has created an alias of `~` than `~@`. That would only break, however, if they were invoking that command with round -brackets with at least one value inside (e.g. `~('something')`). +brackets with at least one value inside (e.g. `~('something')`), so it would +still be a very low risk breaking change. From b952d2f01bf3712c885acc342899e6d85d21b32f Mon Sep 17 00:00:00 2001 From: Kirk Munro Date: Thu, 27 Jun 2019 10:21:24 -0300 Subject: [PATCH 08/10] Update 1-Draft/RFCNNNN-Concurrent-Collections.md Co-Authored-By: Ilya --- 1-Draft/RFCNNNN-Concurrent-Collections.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/1-Draft/RFCNNNN-Concurrent-Collections.md b/1-Draft/RFCNNNN-Concurrent-Collections.md index e821ab4c..5bb5b731 100644 --- a/1-Draft/RFCNNNN-Concurrent-Collections.md +++ b/1-Draft/RFCNNNN-Concurrent-Collections.md @@ -44,7 +44,7 @@ $dictionary = ~@{} $computers = 's1','s2','s3' $computers | Start-ThreadJob { # Retrieve the data and store it in a thread-safe collection - $collection += Get-LogData -ComputerName $_ + $using:collection += Get-LogData -ComputerName $_ # Retrieve the data and store it in a thread-safe dictionary $dictionary[$_] = Get-LogData } From 68c5f65206a0cbdd7287d7f8c8e282005d319a79 Mon Sep 17 00:00:00 2001 From: Kirk Munro Date: Thu, 27 Jun 2019 11:51:55 -0300 Subject: [PATCH 09/10] add alternate considerations --- 1-Draft/RFCNNNN-Concurrent-Collections.md | 16 ++++++++++++++-- 1 file changed, 14 insertions(+), 2 deletions(-) diff --git a/1-Draft/RFCNNNN-Concurrent-Collections.md b/1-Draft/RFCNNNN-Concurrent-Collections.md index e5ee7e24..257901a2 100644 --- a/1-Draft/RFCNNNN-Concurrent-Collections.md +++ b/1-Draft/RFCNNNN-Concurrent-Collections.md @@ -94,7 +94,7 @@ I believe both of these are low-risk breaking changes that are worth making so that we can have easier concurrent collection support in PowerShell moving forward. -### Alternative syntax +### Alternative syntaxes To simplify the syntax somewhat, we could implement `~()` for `ConcurrentBag` and `~{}` for `ConcurrentDictionary`. @@ -107,4 +107,16 @@ removing the `@` character from `~@()`, it becomes a little less like an array On the downside, the potential for the second breaking change increases because someone may have created an alias for `~` that means something to them. That would only break, however, if they were invoking that command with round -brackets with at least one value inside (e.g. `~('something')`). +brackets with at least one value inside (e.g. `~('something')` + +We could also flip the enclosures, defining `@~()` as a `ConcurrentBag` and +`@~{}` as a `ConcurrentDictionary`. The advantage here is that they result in +parser errors, so wouldn't be a breaking change. The disadvantage is that the +syntax is obscure and really doesn't present itself very well at all. + +### Related discussion + +There is a long running thread where some community members have been +[discussing a new enclosure for `List`](https://github.com/PowerShell/PowerShell/issues/5643). Those thoughts should be +reviewed and considered when looking at this proposal so that we come up with +a unified strategy to move forward with. From 9942cd862d7a5c3b4868adfd4dca25890466ab6e Mon Sep 17 00:00:00 2001 From: Kirk Munro Date: Thu, 27 Jun 2019 12:37:32 -0300 Subject: [PATCH 10/10] add alternate proposal --- 1-Draft/RFCNNNN-Concurrent-Collections.md | 33 +++++++++++++++++++++++ 1 file changed, 33 insertions(+) diff --git a/1-Draft/RFCNNNN-Concurrent-Collections.md b/1-Draft/RFCNNNN-Concurrent-Collections.md index 4ad69c03..cbf96ebb 100644 --- a/1-Draft/RFCNNNN-Concurrent-Collections.md +++ b/1-Draft/RFCNNNN-Concurrent-Collections.md @@ -117,6 +117,39 @@ We could also flip the enclosures, defining `@~()` as a `ConcurrentBag` and parser errors, so wouldn't be a breaking change. The disadvantage is that the syntax is obscure and really doesn't present itself very well at all. +### Don't add the enclosures because we have using + +Since scripters can use the `using` statement to make it easier to work with +namespaces, they could do the following: + +```PowerShell +using namespace System.Collections.Concurrent + +$x = [ConcurrentBag[PSObject]]::new() +``` + +That works, but it lacks discovery of enclosures that should be used when +working with threads. One of the advantages of new enclosures is not simply +syntactic sugar. It's also beneficial for discovery. Users could learn about +different types of collections in one place, and learn when it is appropriate +to use one type over another. + +The other key advantage to using enclosures, particularly for +`ConcurrentDictionary`, is ease of use in pre-population of the data. This +syntax is familiar and very straightforward: + +```PowerShell +$x = ~@{ + PropOne = 1 + PropTwo = 2 +} +``` + +The alternative is having to use the `Add` method on the collection to populate +it. You could use the index operator to create keys and assign values, but when +you are assigning multiple values at once, that is much less PowerShell-like +than simply defining the dictionary and assigning values to it in one step. + ### Related discussion There is a long running thread where some community members have been