diff --git a/exercises/practice/reverse-string/.approaches/additional-approaches/content.md b/exercises/practice/reverse-string/.approaches/additional-approaches/content.md new file mode 100644 index 0000000000..1ff735608c --- /dev/null +++ b/exercises/practice/reverse-string/.approaches/additional-approaches/content.md @@ -0,0 +1,159 @@ +# Additional Approaches that are Further Afield + +Below are some interesting strategies that are distinct from the canonical approaches that have already been discussed. +While they do not offer particular performance boosts over the canonical approaches (_and some offer very large penalties_), they do explore interesting corners of Python. + + +## Convert the Input to a UTF-8 bytearray and use a Sliding Window to Reverse + + +```python +def reverse(text): + + # Create bytearrays for input and output. + given, output = bytearray(text.encode("utf-8")), bytearray(len(text)) + index = 0 + LENGTH_MASK = 0xE0 # this is 0b11110000 (binary) or 224 (decimal) + + # Loop through the input bytearray. + while index < len(given): + + #Either the len is 1 or it is calculated by counting the bits after masking. + seq_len = (not(given[index] >> 7) or + (given[index] & LENGTH_MASK).bit_count()) + + #Calculate the index start. + location = index + seq_len +1 + + #Prepend the byte segment to the output bytearray + output[-location:-index or None] = given[index:index + seq_len] + + #Increment the index count or slide the 'window'. + index += seq_len + + #Decode output to UTF-8 string and return. + return output.decode("utf-8") + +``` + +This strategy encodes the string into a UTF-8 [`bytearray`][bytearray]. +It then uses a `while` loop to iterate through the text, calculating the length of a sequence (or 'window') to slice from 'given' and prepend to 'output'. + The 'index' counter is then incremented by the length of the 'window'. + Once the 'index' is greater than the length of 'given', the 'output' bytearray is decoded into a UTF-8 string and returned. + This is (_almost_) the same set of operations as described in the code below, but operating on bytes in a bytearray, as opposed to text/codepoints in a `list` - although this strategy does not use `list.pop()` (_bytearray objects do not have a pop method_). + + This uses `O(n)` space for the output array. +It incurs additional runtime overhead by _prepending_ to the output array, which is an expensive operation that forces many repeated shifts. +Encoding to bytes and decoding to codepoints further slow this approach. + + +## Convert the Input to a list and use a While Loop to Pop and Append to a Second List + + +```python +def reverse(text): + codepoints, stniopedoc = list(text), [] + + while codepoints: + stniopedoc.append(codepoints.pop()) + + return ''.join(stniopedoc) +``` + +This strategy uses two lists. +One `list` for the codepoints in the text, and one to hold the codepoints in reverse order. +First, the input text is turned into a the 'codepoints' `list`, and iterated over. +Each codepoint is `pop()`ped from 'codepoints' and appended to the 'stniopedoc' `list`. +Finally, 'stniopedoc' is joined via `str.join()` to create the reversed string. + +While this is a straightforward and readable approach, it creates both memory and performance overhead, due to the creation of the lists and the use of `join()`. +This is much faster than the bytearray strategy or using string concatenation, but is still almost slower than the slicing strategy. +It also takes up `O(n)` auxiliary space with the stniopedoc list. + + + +## Using Recursion Instead of a Loop + + +```python +def reverse(text): + if len(text) == 0: + return text + else: + return reverse(text[1:]) + text[0] +``` + +This strategy uses a slice to copy all but the leftmost part of the string, concatenating the codepoint at the first index to the end. +The function then calls itself with the (now shorter) text slice. +This slice + concatenation process continues until the `len()` is 0, and the reversed text is returned up the call stack. +This is the same as iterating over the string backward in a `loop`, appending each codepoint to a new string, and has identical time complexity. +It also uses O(n) space, with the space being successive calls on the call stack. + +Because each recursive call is placed on the stack and Python limits recursive calls to a max of 1000, this code produces a `maximum recursion depth exceeded` error for any string longer than ~999 characters. + + +## Using `map()` and `lambbda` with `Join()` Instead of a Loop + +```python +def reverse(text): + return "".join(list(map(lambda x: text[(-x-1)], range(len(text))))) +``` + +This variation uses the built-in `map()` and a `lambda` to iterate over the string backward, constructing a `list`. +The `list` is then fed to `str.join()`, which unpacks it and turns it into a string. +This is a very non-performant way to walk the string backwards, and also incurs extra overhead due to the unneeded construction of an intermediary `list`. + +`map()` can instead be directly fed to `join()`, which improves performance to `O(n)`: + +```python +def reverse(text): + return "".join(map(lambda x: text[(-x-1)], range(len(text)))) +``` + + +## Using a `lambda` that returns a Reverse Sequence Slice + + +```python +reverse = lambda text: text[::-1] +``` + + +This strategy assigns the name "reverse" to a `lambda` that produces a reverse slice of the string. +This looks quite clever and is shorter than a "traditional" function, but it is far from obvious that this line defines a callable named "reverse" that returns a reversed string. +While this code compiles to the same function definition as the first approach article, it is not clear to many programmers who might read through this code that they could call `reverse('some_string')` the way they could call other functions. + + +This has the added disadvantage of creating troubleshooting issues since any errors will be attributed to `lambda` in the stack trace and not associated with an explicit function named `reverse`. +Help calls and `__repr__` calls are similarly affected. +This is not the intended use of `lambdas` (_which are for unnamed or anonymous functions_), nor does it confer any sort of performance boost over other methods, but _does_ create readability issues with anyone unfamiliar with `lambda` syntax and compilation. + + +## Timings vs Reverse Slice + + +As a (very) rough comparison, below is a timing table for these functions vs the canonical reverse slice: + + +| **string lengths >>>>** | Str Len: 5 | Str Len: 11 | Str Len: 22 | Str Len: 52 | Str Len: 68 | Str Len: 86 | Str Len: 142 | Str Len: 1420 | Str Len: 14200 | Str Len: 142000 | +|------------------------- |------------ |------------- |------------- |------------- |------------- |------------- |-------------- |--------------- |---------------- |----------------- | +| reverse slice | 1.66e-07 | 1.75e-07 | 1.79e-07 | 2.03e-07 | 2.22e-07 | 2.38e-07 | 3.63e-07 | 1.44e-06 | 1.17e-05 | 1.16e-04 | +| reverse lambda | 1.68e-07 | 1.72e-07 | 1.85e-07 | 2.03e-07 | 2.44e-07 | 2.35e-07 | 3.65e-07 | 1.47e-06 | 1.25e-05 | 1.18e-04 | +| reverse dual lists | 9.17e-07 | 1.56e-06 | 2.70e-06 | 5.69e-06 | 8.30e-06 | 1.07e-05 | 1.80e-05 | 1.48e-04 | 1.50e-03 | 1.53e-02 | +| reverse recursive | 8.74e-07 | 1.90e-06 | 4.02e-06 | 8.97e-06 | 1.24e-05 | 1.47e-05 | 3.34e-05 | --- | --- | --- | +| reverse bytes | 1.92e-06 | 3.82e-06 | 7.36e-06 | 1.65e-05 | 2.17e-05 | 2.71e-05 | 4.47e-05 | 5.17e-04 | 6.10e-03 | 2.16e-01 | + + +As you can see, the reverse using two lists and the reverse using a bytearray are orders of magnitude slower than using a reverse slice. +For the largest inputs measured, the dual list solution was almost 55x slower, and the bytearray solution was almost 1800x slower. +Timings for strings over 142 characters could not be run for the recursive strategy, due to Python's 1000 call recursion limit. + + +Measurements were taken on a 3.1 GHz Quad-Core Intel Core i7 Mac running MacOS Ventura. +Tests used `timeit.Timer.autorange()`, repeated 3 times. +Time is reported in seconds taken per string after calculating the 'best of' time. +The [`timeit`][timeit] module docs have more details, and [note.nkmk.me][note_nkmk_me] has a nice summary of methods. + +[bytearray]: https://docs.python.org/3/library/stdtypes.html#bytearray +[timeit]: https://docs.python.org/3/library/timeit.html#python-interface +[note_nkmk_me]: https://note.nkmk.me/en/python-timeit-measure/ diff --git a/exercises/practice/reverse-string/.approaches/additional-approaches/snippet.txt b/exercises/practice/reverse-string/.approaches/additional-approaches/snippet.txt new file mode 100644 index 0000000000..9bb10135a0 --- /dev/null +++ b/exercises/practice/reverse-string/.approaches/additional-approaches/snippet.txt @@ -0,0 +1,8 @@ + given, output = bytearray(text.encode("utf-8")), bytearray(len(given)) + index, LENGTH_MASK = 0, 0xE0 # 0b11110000 or 224 + while index < len(given): + seq_len = not(given[index] >> 7) or (given[index] & LENGTH_MASK).bit_count() + location = index + seq_len +1 + output[-location:-index or None] = given[index:index + seq_len] + index += seq_len + return output.decode("utf-8") \ No newline at end of file diff --git a/exercises/practice/reverse-string/.approaches/backward-iteration-with-range/content.md b/exercises/practice/reverse-string/.approaches/backward-iteration-with-range/content.md new file mode 100644 index 0000000000..7b1ddd5b77 --- /dev/null +++ b/exercises/practice/reverse-string/.approaches/backward-iteration-with-range/content.md @@ -0,0 +1,78 @@ +## Backward Iteration with Range + + +```python +def reverse(text): + output = "" + for index in range(len(text) - 1, -1, -1): #For 'Robot', this is 4 (start) 0 (stop), iterating (4,3,2,1,0) + output += text[index] + return output +``` + + +These variations all use the built-in [`range()`][range] object to iterate over the input text from right --> left, adding each codepoint to the output string. +This is the same as iterating over the text backward using one or more `index` variables, but incurs slightly less overhead by substituting `range()` for them. +Note that the code above also avoids _prepending_ to the output string. + +For very long strings, this code will still degrade to `O(n**2)` performance, due to the use of string concatenation. +Using `''.join()` here can avoid heavy concatenation penalty as strings grow longer and the CPython string append optimization mentioned in the [iteration and concatenation][approach-iteration-and-concatenation] approach breaks down. + + +## Variation #1: Forward Iteration in Range, Negative Index + + +```python +def reverse(text): + output = '' + + for index in range(1, len(text) + 1): + output += text[-index] + return output +``` + + +This version iterates left --> right using a positive `range()` and then _prepends to the string_ by using a negative index for the codepoint. +This has the same faults as variation #1, with the added cost of prepending via concatenation. + + +## Variation #2: Feed Range and the Index into Join() + +```python +def reverse(text): + return "".join(text[index] for index in range(len(text)-1, -1, -1)) + ``` + + This version omits the intermediary output string, and uses `"".join()` directly in the return. + Within the `join()` call, `range()` is used with a negative step to iterate over the input text backward. + + This strategy avoids the penalties of string concatenation with an intermediary string. + It is still `O(n)` in time complexity, and is slower than reverse indexing due to the calls to `join()`, `len()` and `range()`, and the creation of the generator expression. + + Because of the aforementioned string append optimization in CPython, this approach will benchmark slower for strings under length 1000, but becomes more and more efficient as the length of the string grows. + Since the CPython optimization is not stable nor transferable to other versions of Python, using `join()` by default is recommended in any situation where the string concatenation is not strictly repetition and length constrained. + + +## Timings vs Reverse Slice + + + As a (very) rough comparison, below is a timing table for these functions vs the canonical reverse slice: + + + +| **string length >>>>** | 5 | 11 | 22 | 52 | 66 | 86 | 142 | 1420 | 14200 | 142000 | +|------------------------ |:--------: |:--------: |:--------: |:--------: |:--------: |:--------: |:--------: |:--------: |:--------: |:--------: | +| reverse slice | 1.68e-07 | 1.74e-07 | 1.83e-07 | 2.07e-07 | 2.14e-07 | 2.29e-07 | 3.51e-07 | 1.50e-06 | 1.19e-05 | 1.17e-04 | +| reverse negative range | 5.89e-07 | 9.93e-07 | 1.78e-06 | 3.69e-06 | 4.71e-06 | 5.83e-06 | 9.61e-06 | 1.39e-04 | 1.46e-03 | 1.81e-02 | +| reverse positive range | 6.20e-07 | 1.14e-06 | 2.23e-06 | 4.54e-06 | 5.74e-06 | 7.38e-06 | 1.20e-05 | 1.70e-04 | 1.75e-03 | 2.07e-02 | +| reverse range and join | 8.90e-07 | 1.31e-06 | 2.14e-06 | 4.15e-06 | 5.22e-06 | 6.57e-06 | 1.06e-05 | 1.05e-04 | 1.04e-03 | 1.07e-02 | + + +Measurements were taken on a 3.1 GHz Quad-Core Intel Core i7 Mac running MacOS Ventura. +Tests used `timeit.Timer.autorange()`, repeated 3 times. +Time is reported in seconds taken per string after calculating the 'best of' time. +The [`timeit`][timeit] module docs have more details, and [note.nkmk.me][note_nkmk_me] has a nice summary of methods. + +[approach-iteration-and-concatenation]: https://exercism.org/tracks/python/exercises/reverse-string/.approaches/iteration-and-concatenation +[note_nkmk_me]: https://note.nkmk.me/en/python-timeit-measure/ +[range]: https://docs.python.org/3/library/stdtypes.html#range +[timeit]: https://docs.python.org/3/library/timeit.html#python-interface diff --git a/exercises/practice/reverse-string/.approaches/backward-iteration-with-range/snippet.txt b/exercises/practice/reverse-string/.approaches/backward-iteration-with-range/snippet.txt new file mode 100644 index 0000000000..cdb261d85a --- /dev/null +++ b/exercises/practice/reverse-string/.approaches/backward-iteration-with-range/snippet.txt @@ -0,0 +1,5 @@ +def reverse(text): + new_word = "" + for index in range(len(text) - 1, -1, -1): + new_word += text[index] + return new_word diff --git a/exercises/practice/reverse-string/.approaches/built-in-list-reverse/content.md b/exercises/practice/reverse-string/.approaches/built-in-list-reverse/content.md new file mode 100644 index 0000000000..b195d099a5 --- /dev/null +++ b/exercises/practice/reverse-string/.approaches/built-in-list-reverse/content.md @@ -0,0 +1,83 @@ +# Make the Input Text a List and Use list.reverse() to Reverse In-Place + + +```python +def reverse(text): + output = list(text) + output.reverse() + + return ''.join(output) +``` + +These approaches start with turning the text into a `list` of codepoints. +Rather than use a loop + append to then reverse the text, the [`list.reverse()`][list-reverse-method] method is used to perform an in-place reversal. +`join()` is then used to turn the list into a string. + +This takes `O(n)` time complexity because `list.reverse()` & `join()` iterate through the entire `list`. +It uses `O(n)` space for the output `list`. + +`list.reverse()` cannot be fed to `join()` here because it returns `None` as opposed to returning the `list`. +Because `list.reverse()` **mutates the list**, it is not advisable in situations where you want to preserve the original `list` of codepoints. + + +## Variation #1: Keep a Copy of the Original Ordering of Codepoints + + +```python +def reverse(text): + codepoints, output = list(text), list(text) + output.reverse() + return ''.join(output) +``` + +This variation is essentially the same as the solution above, but makes a codepoints list to keep the original codepoint ordering of the input text. +This does add some time and space overhead. + + +## Variation #2: Iterate Through the String and Append to Create List Before Reversing + + +```python +def reverse(text): + output = [] + + for item in text: + output.append(item) + + output.reverse() + + return ''.join(output) +``` + +This variation declares output as an empty literal, then loops through the codepoints of the input text and appends them to output. +`list.reverse()` is then called to reverse output in place. + Finally, output is joined into a string via `str.join()`. +Using this method is the same as calling the `list` constructor directly on the input text (_`list(text)`_), which will iterate through it automatically. + Calling the constructor is also quite a bit faster than using a "written out" `for-loop`. + + +## Timings vs Reverse Slice + + +As a (very) rough comparison, below is a timing table for these functions vs the canonical reverse slice: + +As you can see, using `list.reverse()` after converting the input text to a list is much slower than using a reverse slice. +Iterating in a loop to create the output list also adds even more time. + + +| **string lengths >>>>** | 5 | 11 | 22 | 52 | 66 | 86 | 142 | 1420 | 14200 | 142000 | +|------------------------- |:--------: |:--------: |:--------: |:--------: |:--------: |:--------: |:--------: |:--------: |:--------: |:--------: | +| reverse slice | 1.70e-07 | 1.74e-07 | 1.00e-07 | 2.06e-07 | 2.20e-07 | 2.39e-07 | 3.59e-07 | 1.47e-06 | 1.22e-05 | 1.20e-04 | +| reverse reverse method | 3.28e-07 | 2.00e-07 | 5.39e-07 | 8.96e-07 | 1.35e-06 | 1.55e-06 | 2.31e-06 | 2.01e-05 | 1.93e-04 | 1.94e-03 | +| reverse iterate list | 4.74e-07 | 7.60e-07 | 1.25e-06 | 2.75e-06 | 3.53e-06 | 4.52e-06 | 7.22e-06 | 6.07e-05 | 5.84e-04 | 6.28e-03 | + + +Measurements were taken on a 3.1 GHz Quad-Core Intel Core i7 Mac running MacOS Ventura. +Tests used `timeit.Timer.autorange()`, repeated 3 times. +Time is reported in seconds taken per string after calculating the 'best of' time. +The [`timeit`][timeit] module docs have more details, and [note.nkmk.me][note_nkmk_me] has a nice summary of methods. + + +[list-reverse-method]: https://docs.python.org/3/library/stdtypes.html#mutable-sequence-types +[note_nkmk_me]: https://note.nkmk.me/en/python-timeit-measure/ +[timeit]: https://docs.python.org/3/library/timeit.html#python-interface diff --git a/exercises/practice/reverse-string/.approaches/built-in-list-reverse/snippet.txt b/exercises/practice/reverse-string/.approaches/built-in-list-reverse/snippet.txt new file mode 100644 index 0000000000..8a999c3831 --- /dev/null +++ b/exercises/practice/reverse-string/.approaches/built-in-list-reverse/snippet.txt @@ -0,0 +1,5 @@ +def reverse(text): + output = list(text) + output.reverse() + + return ''.join(output) \ No newline at end of file diff --git a/exercises/practice/reverse-string/.approaches/built-in-reversed/content.md b/exercises/practice/reverse-string/.approaches/built-in-reversed/content.md new file mode 100644 index 0000000000..6205038262 --- /dev/null +++ b/exercises/practice/reverse-string/.approaches/built-in-reversed/content.md @@ -0,0 +1,54 @@ +# Use the built-in reversed() and Unpack with join() + + +```python +def reverse(text): + return (''.join(reversed(text))) +``` + +This approach calls the built-in `reversed()` function to return a [reverse iterator](https://docs.python.org/3/library/functions.html#reversed) that is then unpacked by `str.join()`. +This is equivalent to using a reverse slice, but incurs a bit of extra overhead due to the unpacking/iteration needed by `str.join()`. +This takes `O(n)` time and `O(n)` space for the reversed copy. + + +```python +def reverse(text): + output = '' + for index in reversed(range(len(text))): + output += text[index] + return output +``` + +This version uses `reversed()` to reverse a `range()` object rather than feed a start/stop/step to `range()` itself. +It then uses the reverse range to iterate over the input string and concatenate each code point to a new 'output' string. +This has over-complicated `reversed()`, as it can be called directly on the input string with almost no overhead. +This has also incurs the performance hit of repeated concatenation to the 'output' string. + +While this approach _looks_ as if it would be similar to the first, it is actually `O(n**2)` in time complexity due to string concatenation. +It was also the slowest in benchmarks. + + +## Timings vs Reverse Slice + + +As a (very) rough comparison, below is a timing table for these functions vs the canonical reverse slice: + +While `reversed()` is very fast, the call to `join()` to unpack slows things down compared to using a reverse slice. +For long strings, this slight overhead starts to become significant. +Using `reversed()` but concatenating to a string is non-performant in this context. + + +| **string length >>>>** | 5 | 11 | 22 | 52 | 66 | 86 | S142 | 1420 | 14200 | 142000 | +|------------------------ |:--------: |:--------: |:--------: |:--------: |:--------: |:--------: |:--------: |:--------: |:--------: |:--------: | +| reverse slice | 1.70e-07 | 1.78e-07 | 1.89e-07 | 2.10e-07 | 2.25e-07 | 2.40e-07 | 3.56e-07 | 1.52e-06 | 1.22e-05 | 1.20e-04 | +| reverse reversed | 3.71e-07 | 4.77e-07 | 6.78e-07 | 1.20e-06 | 1.63e-06 | 1.01e-06 | 2.78e-06 | 2.47e-05 | 2.44e-04 | 2.40e-03 | +| reverse reversed range | 6.34e-07 | 1.05e-06 | 1.85e-06 | 3.85e-06 | 4.73e-06 | 6.10e-06 | 9.77e-06 | 1.44e-04 | 1.53e-03 | 1.89e-02 | + + +Measurements were taken on a 3.1 GHz Quad-Core Intel Core i7 Mac running MacOS Ventura. +Tests used `timeit.Timer.autorange()`, repeated 3 times. +Time is reported in seconds taken per string after calculating the 'best of' time. +The [`timeit`][timeit] module docs have more details, and [note.nkmk.me][note_nkmk_me] has a nice summary of methods. + +[note_nkmk_me]: https://note.nkmk.me/en/python-timeit-measure/ +[timeit]: https://docs.python.org/3/library/timeit.html#python-interface diff --git a/exercises/practice/reverse-string/.approaches/built-in-reversed/snippet.txt b/exercises/practice/reverse-string/.approaches/built-in-reversed/snippet.txt new file mode 100644 index 0000000000..a45b911005 --- /dev/null +++ b/exercises/practice/reverse-string/.approaches/built-in-reversed/snippet.txt @@ -0,0 +1,2 @@ +def reverse(text): + return (''.join(reversed(text))) diff --git a/exercises/practice/reverse-string/.approaches/config.json b/exercises/practice/reverse-string/.approaches/config.json new file mode 100644 index 0000000000..24fddb8f3e --- /dev/null +++ b/exercises/practice/reverse-string/.approaches/config.json @@ -0,0 +1,57 @@ +{ + "introduction": { + "authors": ["BethanyG", "colinleach"], + "contributors": [] + }, + "approaches": [ + { + "uuid": "e124fe69-dbef-4aaf-8910-706b5e3ce6bd", + "slug": "sequence-slicing", + "title": "Sequence Slicing", + "blurb": "Use a slice with a negative step to reverse the string.", + "authors": ["BethanyG"] + }, + { + "uuid": "cbe2766f-e02f-4160-8227-eead7b4ca9fb", + "slug": "iteration-and-concatenation", + "title": "Iteration and Concatenation", + "blurb": "Iterate through the codepoints and concatenate them to a new string.", + "authors": ["BethanyG"] + }, + { + "uuid": "894b1c9b-e256-471e-96f6-02453476ccc4", + "slug": "backward-iteration-with-range", + "title": "Backward iteration with Range", + "blurb": "Use a negative step with range() to iterate backward and append to a new string.", + "authors": ["BethanyG"] + }, + { + "uuid": "722e8d0e-a8d1-49a7-9b6f-38da0f7380e6", + "slug": "list-and-join", + "title": "Make a list and use join()", + "blurb": "Create a list from the string and use join to make a new string.", + "authors": ["BethanyG"] + }, + { + "uuid": "b2c8e7fa-8265-4221-b0be-c1cd13166925", + "slug": "built-in-list-reverse", + "title": "Use the built-in list.reverse() function.", + "blurb": "Create a list of codepoints, use list.reverse() to reverse in place, and join() to make a new string.", + "authors": ["BethanyG"] + }, + { + "uuid": "cbb4411a-4652-45d7-b73c-ca116ccd4f02", + "slug": "built-in-reversed", + "title": "Use the built-in reversed() function.", + "blurb": "Use reversed() and unpack it with join() to make a new string.", + "authors": ["BethanyG"] + }, + { + "uuid": "1267e48f-edda-44a7-a441-a36155a8fba2", + "slug": "additional-approaches", + "title": "Additional approaches that are further afield", + "blurb": "Additional interesting approaches.", + "authors": ["BethanyG"] + } + ] +} \ No newline at end of file diff --git a/exercises/practice/reverse-string/.approaches/introduction.md b/exercises/practice/reverse-string/.approaches/introduction.md new file mode 100644 index 0000000000..264f76b589 --- /dev/null +++ b/exercises/practice/reverse-string/.approaches/introduction.md @@ -0,0 +1,173 @@ +# Introduction + + +The goal of the Reverse String exercise is to output a given string in reverse order. +It can be solved in a lot of different ways in Python, with a near-endless amount of variation. + +However, not all strategies are efficient, concise, or small in memory. +Care must be taken to not inadvertently slow down the code by using methods that don't scale well. + +Additionally, most 'canonical' solutions for reversing a string using the Python standard library do not account for Unicode text beyond the ASCII (0-127) range. + + +In this introduction, we cover six general approaches and an additional group of 'interesting' takes, but there are many more techniques that could be used. + +1. Sequence Slice with Negative Step +2. Iteration with String Concatenation +3. Reverse Iteration with Range() +4. Make a list and Use str.join() +5. Make a list and use list.reverse() +6. Use the built-in reversed() +7. Other [interesting approaches][approach-additional-approaches] + +We encourage you to experiment and get creative with the techniques you use, and see how it changes the way you think about the problem and think about Python. + + +And while Unicode text is outside the core tests for this exercise (_there are optional tests in the test file you can enable for Unicode_), we encourage you to give reversing strings that have non ASCII text a try. +We talk more about the considerations and strategies for Unicode text reversal (_and third-party libraries that can help_) in the [Working with Unicode Strings][article-working-with-unicode-strings] article. + + +## Approach: Sequence Slice with a Negative Step + +```python +def reverse(text): + return text[::-1] +``` + +This is "THE" canonical solution, _provided_ you know what encoding and character sets you are dealing with. +For example, if you know all of your text is **always** going to be within the ASCII space, this is by far the most succinct and performant way to reverse a string in Python. + +For more details, see the [sequence slicing approach][approach-sequence-slicing] + + +## Approach: Iterate over the String; Concatenate to a New String + + +```python +def reverse(text): + output = '' + for codepoint in text: + output = codepoint + output + return output +``` + +This approach iterates over the string, concatenating each codepoint to a new string. +This approach and its variants avoid all use of built-ins such as `range()`, `reversed()`, and `list.reverse()`. +But for very long strings, this approach can degrade performance toward O(n**2). + +For more information and relative performance timings for this group, check out the [iteration and concatenation][approach-iteration-and-concatenation] approach. + + +## Approach: Use range() to Iterate Backwards over the String, Append to New String + + +```python +def reverse(text): + new_word = "" + + for index in range(len(text) - 1, -1, -1): #For 'Robot', this is 4 (start) 0 (stop), iterating (4,3,2,1,0) + new_word += text[index] + return new_word +``` + +This method uses the built-in [`range()`][range] object to iterate over text right-to-left, adding each codepoint to the 'new_word' string. +This is essentially the same technique as the approach above, but incurs slightly less overhead by avoiding the potential performance hit of _prepending_ to the 'new_word' string, or creating index or tracking variables. + +For very long strings, this approach will still degrade to `O(n**2)` performance, due to the use of string concatenation. +Using `''.join()` here can avoid the concatenation penalty. +For more information and relative performance timings for this group, check out the [backwards iteration with range][approach-backward-iteration-with-range] approach. + + +## Approach: Create a List and Use str.join() to make new String. + + +```python +def reverse(text): + output = [] + + for codepoint in text: + output.insert(0,codepoint) + return "".join(output) +``` + +This approach either breaks the string up into a list of codepoints to swap or creates an empty list as a "parking place" to insert or append codepoints. +It then iterates over the text, swapping, inserting, or appending each codepoint to the output list. +Finally, `str.join()` is used to re-assemble the `list` into a string. + +For more variations and relative performance timings for this group, check out the [list and join][approach-list-and-join] approach. + + +## Approach: Make the Input Text a List & Use list.reverse() to Reverse in Place + + +```python +def reverse(text): + output = list(text) + output.reverse() + + return ''.join(output) +``` + +This approach turns the string into a list of codepoints and then uses the `list.reverse()` method to re-arrange the list _in place_. +After the reversal of the list, `str.join()` is used to create the reversed string. + +For more details, see the [built in list.reverse()][approach-built-in-list-reverse] approach. + + +## Approach: Use the built-in reversed() Function & join() to Unpack + + +```python +def reverse(text): + return (''.join(reversed(text))) +``` + +This approach calls the built-in `reversed()` function to return a [reverse iterator](https://docs.python.org/3/library/functions.html#reversed) that is then unpacked by `str.join()`. +This is equivalent to using a reverse slice, but incurs a bit of extra overhead due to the unpacking/iteration needed by `str.join()`. + +For more details, see the [built-in reversed()][approach-built-in-reversed] approach. + + +```python +def reverse(text): + output = '' + for index in reversed(range(len(text))): + output += text[index] + return output +``` + +This version uses `reversed()` to reverse a `range()` object rather than feed a start/stop/step to `range()` itself. +It then uses the reverse range to iterate over the input string and concatenate each code point to a new 'output' string. +This has over-complicated `reversed()` a bit, as it can be called directly on the input string with almost no overhead. +This has also incurred the performance hit of repeated concatenation to the 'output' string. + +## Other Interesting Approaches + +These range from using recursion to converting text to bytes before processing. +Some even use `map()` and or a `lambda` + +Take a look at the [additional approaches][approach-additional-approaches] 'approach' for more details and timings. + + +## Which Approach to Use? + +The fastest and most canonical by far is the reverse slice. +Unless you are in an interview situation where you need to "show your work", or working with varied Unicode outside the ASCII range, a reverse slice is the easiest and most direct method of reversal. + +A reverse slice will also work well for varied Unicode that has been pre-processed to ensure that multibyte characters and combined letters with diacritical and accent marks ('extended graphemes') remain grouped. + + +For other scenarios, converting the intput text to a `list`, swapping or iterating, and then using `join()` is recommended. + +To compare performance of these approach groups, see the [Performance article][article-performance]. + +[approach-additional-approaches]: https://exercism.org/tracks/python/exercises/reverse-string/.approaches/additional-approaches +[approach-backward-iteration-with-range]: https://exercism.org/tracks/python/exercises/reverse-string/.approaches/backward-iteration-with-range +[approach-built-in-list-reverse]: https://exercism.org/tracks/python/exercises/reverse-string/.approaches/built-in-list-reverse +[approach-built-in-reversed]: https://exercism.org/tracks/python/exercises/reverse-string/.approaches/built-in-reversed +[approach-iteration-and-concatenation]: https://exercism.org/tracks/python/exercises/reverse-string/.approaches/iteration-and-concatenation +[approach-list-and-join]: https://exercism.org/tracks/python/exercises/reverse-string/.approaches/list-and-join +[approach-sequence-slicing]: https://exercism.org/tracks/python/exercises/reverse-string/.approaches/sequence-slicing +[article-performance]: https://exercism.org/tracks/python/exercises/reverse-string/articles/performance +[article-working-with-unicode-strings]: https://exercism.org/tracks/python/exercises/reverse-string/.articles/working-with-unicode-strings +[range]: https://docs.python.org/3/library/stdtypes.html#range diff --git a/exercises/practice/reverse-string/.approaches/iteration-and-concatenation/content.md b/exercises/practice/reverse-string/.approaches/iteration-and-concatenation/content.md new file mode 100644 index 0000000000..7acc4d7c66 --- /dev/null +++ b/exercises/practice/reverse-string/.approaches/iteration-and-concatenation/content.md @@ -0,0 +1,110 @@ +# Iteration and Concatenation + + +```python +def reverse(text): + output = '' + for codepoint in text: + output = codepoint + output + return output +``` + +The variations here all iterate over the string, concatenating each codepoint to a new string. +While this avoids all use of built-ins such as `range()`, `reversed()`, and `list.reverse()`, it incurs both a memory and speed penalty over using a reverse slice. + +Strings are immutable in Python. +Using concatenation via `+` or `+=` forces the re-creation of the 'output' string for _every codepoint added from the input string._ +That means the code has a minimum time complexity of `O(m + n)`, where `n` is the length of the text being iterated over, and `m` is the number of concatenations to the 'output' string. +For some more detail on `O(n + m)` vs `O(n)`, see this [Stack Overflow post][time-complexity-omn-vs-on]. +The code also uses `O(n)` space to store 'output'. + +As input strings grow longer, concatenation can become even more problematic, and performance can degrade to `O(n**2)`, as longer and longer shifts and reallocations occur in memory. +In fact, the "standard" way to describe the time complexity of this code is to say that is O(n**2), or quadratic. + +Interestingly, CPython includes an optimization that attempts to avoid the worst of the shift and reallocation behavior by reusing memory when it detects that a string append is happening. +Because the code above _prepends_ the codepoint to the left-hand side of 'output', this optimization cannot be used. +Even in cases where strings are appended to, this optimization cannot be relied upon to be stable and is not transferable to other implementations of Python. + +For some interesting reading on this topic, see these Stack Overflow posts: +- [Time Complexity of String Concatenation in Python][time-complexity-of-string-concatenation-in-python], +- [Time Complexity of Iterative String Append][time-complexity-of-iterative-string-append], +- [Most efficient String Concatenation Method in Python][most-efficient-string-concatenation-method-in-Python], +- [join() is faster than +, but what is wrong here?][join() is faster than +, but what is wrong here?], and +- [Is += bad practice in Python?][is += bad practice in Python?] + +To see the difference between reverse slicing and looping in terms of steps, check out [slicing verses iterating+concatenation][python-tutor] at the PythonTutor site. + + +## Variation #1: Using a While Loop and a Negative Index + + +```python +def reverse(text): + output = '' + index = -1 + + while index >= -len(text): + output += text[index] + index -= 1 + return output +``` + +This solution uses a while loop to "count down" the length of the string using a negative index. +Each number is used to index into the input string and concatenate the resulting codepoint to a new string. +Because each index is further from zero than the last, this has the effect of "iterating backward" over the input string. + +This approach incurs additional overhead for length checking the input string repeatedly in the loop, and setting/decrementing the index variable, both of which can be avoided by using the built-in `range()` object. +Overall, this was the slowest of the three variations when timed. + + +## Variation #2: Using a While Loop with a Positive Index + + +```python +def reverse(text): + result ='' + index = len(text)-1 + + while index >= 0: + result += text[index] + index -= 1 + return result +``` + +This solution uses a while loop to "count down" the length of the string until it reaches zero using a positive index. +Each number is used to index into the input string and concatenate the resulting codepoint to a new string. +Because each index is closer to zero than the last, this has the effect of also "iterating backward" over the input string. +Algorithmically, this takes as much tine and space as the code samples above, since it uses an intermediate string for the reversal and must loop through every codepoint in the input. + + +## Timings vs Reverse Slice + + +As seen in the table below, all of these approaches are slower than using a reverse slice. +Interestingly, iteration + prepending to the string is fastest in this group for strings under length 1420. +But keep in mind that in general, string concatenation and prepending should be avoided for any 'industrial strength' use cases. + + +| **string length >>>>** | 5 | 11 | 22 | 52 | 66 | 86 | 142 | 1420 | 14200 | 142000 | +|------------------------ |---------- |---------- |---------- |---------- |---------- |---------- |---------- |---------- |---------- |---------- | +| reverse slice | 1.66e-07 | 1.73e-07 | 1.88e-07 | 1.12e-07 | 2.15e-07 | 2.32e-07 | 3.46e-07 | 1.42e-06 | 1.18e-05 | 1.15e-04 | +| reverse string prepend | 4.28e-07 | 8.05e-07 | 1.52e-06 | 3.45e-06 | 4.82e-06 | 5.55e-06 | 9.83e-06 | 2.23e-04 | 2.96e-03 | 5.17e-01 | +| reverse positive index | 4.65e-07 | 8.85e-07 | 1.73e-06 | 3.70e-06 | 4.83e-06 | 6.55e-06 | 1.01e-05 | 1.54e-04 | 1.60e-03 | 2.61e-02 | +| reverse negative index | 5.65e-07 | 1.32e-06 | 2.61e-06 | 5.91e-06 | 7.62e-06 | 4.00e-06 | 1.62e-05 | 2.16e-04 | 2.19e-03 | 2.48e-02 | + + +Measurements were taken on a 3.1 GHz Quad-Core Intel Core i7 Mac running MacOS Ventura. +Tests used `timeit.Timer.autorange()`, repeated 3 times. +Time is reported in seconds taken per string after calculating the 'best of' time. +The [`timeit`][timeit] module docs have more details, and [note.nkmk.me][note_nkmk_me] has a nice summary of methods. + +[python-tutor]: https://pythontutor.com/render.html#code=def%20reverse_loop%28text%29%3A%0A%20%20%20%20output%20%3D%20''%0A%20%20%20%20for%20letter%20in%20text%3A%0A%20%20%20%20%20%20%20%20output%20%3D%20letter%20%2B%20output%0A%20%20%20%20return%20output%0A%20%20%20%20%0Adef%20reverse_slice%28text%29%3A%0A%20%20%20%20return%20text%5B%3A%3A-1%5D%0A%20%20%20%20%0A%0Aprint%28reverse_loop%28'Robot'%29%29%0Aprint%28reverse_slice%28'Robot'%29%29&cumulative=false&curInstr=0&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=311&rawInputLstJSON=%5B%5D&textReferences=false + +[is += bad practice in Python?]: https://stackoverflow.com/questions/39675898/is-python-string-concatenation-bad-practice +[join() is faster than +, but what is wrong here?]: https://stackoverflow.com/a/1350289 +[most-efficient-string-concatenation-method-in-Python]: https://stackoverflow.com/questions/1316887/what-is-the-most-efficient-string-concatenation-method-in-python +[note_nkmk_me]: https://note.nkmk.me/en/python-timeit-measure/ +[time-complexity-of-iterative-string-append]: https://stackoverflow.com/questions/34008010/is-the-time-complexity-of-iterative-string-append-actually-on2-or-on +[time-complexity-of-string-concatenation-in-python]: https://stackoverflow.com/questions/37133547/time-complexity-of-string-concatenation-in-python +[time-complexity-omn-vs-on]: https://cs.stackexchange.com/questions/139307/time-complexity-omn-vs-on +[timeit]: https://docs.python.org/3/library/timeit.html#python-interface diff --git a/exercises/practice/reverse-string/.approaches/iteration-and-concatenation/snippet.txt b/exercises/practice/reverse-string/.approaches/iteration-and-concatenation/snippet.txt new file mode 100644 index 0000000000..d4758b0601 --- /dev/null +++ b/exercises/practice/reverse-string/.approaches/iteration-and-concatenation/snippet.txt @@ -0,0 +1,5 @@ +def reverse(text): + output = '' + for codepoint in text: + output = codepoint + output + return output diff --git a/exercises/practice/reverse-string/.approaches/list-and-join/content.md b/exercises/practice/reverse-string/.approaches/list-and-join/content.md new file mode 100644 index 0000000000..07f7daa03f --- /dev/null +++ b/exercises/practice/reverse-string/.approaches/list-and-join/content.md @@ -0,0 +1,148 @@ +# Create a List and Use str.join() to Make A New String + + +To avoid performance issues with concatenating to a string, this group of approaches uses one or more `list`s to perform swaps or reversals before joining the codepoints back into a string. +This avoids the `O(n**2)` danger of repeated shifting/reallocation when concatenating long strings. +However, the use of `join()` and other techniques still make all of these solutions `O(n)` - `O(n+m)` in time complexity. + + +```python +def reverse(text): + output = [] + + for codepoint in text: + output.insert(0,codepoint) + return "".join(output) +``` + +The code above iterates over the codepoints in the input text and uses `list.insert()` to insert each one into the output list. +Note that `list.insert(0, codepoint)` _prepends_, which is very inefficient for `lists`, while appending takes place in (amortized) O(1) time. +So this code incurs a time penalty because it forces repeated shifts of every element in the list with every insertion. +A small re-write using `range()` to change the iteration direction will boost performance: + + +## Variation #1: Use Range to Iterate Over the String Backward and list.append() to Output + + +```python +def reverse(text): + output = [] + length = len(text)-1 + + for index in range(length, -1, -1): + output.append(text[index]) + return "".join(output) +``` + +This code iterates backward over the string using `range()`, and can therefore use `list.append()` to append to the output list in (amortized) constant time. +However, the use of `join()` to unpack the list and create a string still makes this `O(n)`. +This also takes `O(n)` space for the output `list`. + + +## Variation #2: Convert Text to List and Use range() to Iterate over 1/2 the String, Swapping Values + + +```python +def reverse(text): + output = list(text) + length = len(text) // 2 #Cut the amount of iteration in half + + for index in range(length): + + #Swap values at given indexes + output[index], output[length - index - 1] = output[length - index - 1], output[index] + return ''.join(output) +``` + + +This variation calculates a median which is then used with `range()` in a `for loop` to iterate over _half_ the indexes in the 'output' list, swapping values into their reversed places. +`str.join()` is then used to create a new string. +This technique is quite speedy, and re-arranges the list of codepoints 'in place', avoiding expensive string concatenation. +It is still `O(n)` time complexity because `list()` and `join()` each force iteration over the entire length of the input string. + + +## Variation #3: Convert Text to List, Use Start and End Variables to Iterate and Swap Values + + +```python +def reverse(text): + output = list(text) + start = 0 + end = len(text) - 1 + + while start < end: + #Swap values in output until the indexes meet at the 'center' + output[start], output[end] = output[end], output[start] + start += 1 + end -= 1 + return "".join(output) +``` + + +This variation 'automatically' finds the midpoint by incrementing and decrementing 'start' and 'end' variables. +Otherwise, it is identical to variation 2. + + +## Variation #4: Convert Text to Bytearray, Iterate and Swap + + +```python +def reverse(text): + output = bytearray(text.encode("utf-8")) + length = len(output) + + for index in range(length//2): + output[index], output[length-1-index] = output[length-1-index], output[index] + return output.decode("utf-8") +``` + + +This variation is operationally the same as variations #2 & #3 above, except that it encodes the string to a `utf-8` [bytearray](https://docs.python.org/3/library/stdtypes.html#bytearray). + It then iterates over the bytearray to perform the swaps. +Finally, the bytearray is decoded into a `utf-8` string to return the reversed word. +This incurs overhead when encoding/decoding to and from the `bytearray`. +This also throws an ` UnicodeDecodeError: invalid start byte` when working with any multi-byte codepoints because no check was conducted to keep multibyte codepoints grouped together during the reversal. + +Because of this issue, no timings are available for this variation. +For code that keeps bytes together correctly, see the bytearray variation in the [additional approaches][approach-additional-approaches] approach. + + +## Variation #5: Use Generator Expression with Join to Iterate Backwards Over Codepoints List + +```python +def reverse(text): + codepoints = list(text) + length = len(text) - 1 + return "".join(codepoints[index] for index in range(length, -1, -1)) +``` + +This variation puts the for/while loop used in other strategies directly into `join()` using a generator expression. +The text is first converted to a list and the generator-expression "swaps" the codepoints over the whole `list`, using `range()` for the indexes. +Interestingly, because of the work to create and manage the generator, this variation is actually _slower_ than using an auxiliary `list` and `loop` to manage codepoints and then calling `join()` separately. + + +## Timings vs Reverse Slice + + +As a (very) rough comparison, below is a timing table for these functions vs the canonical reverse slice: + + +| **string lengths >>>>** | 5 | 11 | 22 | 52 | 66 | 86 | 142 | 1420 | 14200 | 142000 | +|------------------------- |:--------: |:--------: |:--------: |:--------: |:--------: |:--------: |:--------: |:--------: |:--------: |:--------: | +| reverse slice | 1.67e-07 | 1.76e-07 | 1.85e-07 | 2.03e-07 | 2.12e-07 | 2.32e-07 | 3.52e-07 | 1.47e-06 | 1.20e-05 | 1.17e-04 | +| reverse auto half swap | 4.59e-07 | 7.53e-07 | 1.16e-06 | 2.25e-06 | 3.08e-06 | 3.80e-06 | 5.97e-06 | 7.08e-05 | 7.21e-04 | 7.18e-03 | +| reverse half swap | 6.34e-07 | 9.24e-07 | 1.51e-06 | 2.91e-06 | 3.71e-06 | 4.53e-06 | 7.52e-06 | 2.52e-04 | 1.01e-03 | 1.05e-02 | +| reverse append | 6.44e-07 | 1.00e-06 | 1.56e-06 | 3.28e-06 | 4.48e-06 | 5.54e-06 | 8.89e-06 | 2.20e-04 | 8.73e-04 | 9.10e-03 | +| reverse generator join | 1.02e-06 | 1.39e-06 | 2.16e-06 | 4.13e-06 | 5.31e-06 | 6.79e-06 | 1.11e-05 | 1.07e-04 | 1.07e-03 | 1.05e-02 | +| reverse insert | 5.29e-07 | 9.10e-07 | 1.64e-06 | 3.77e-06 | 4.90e-06 | 6.86e-06 | 1.14e-05 | 2.70e-04 | 2.35e-02 | 2.74e+00 | + + + +Measurements were taken on a 3.1 GHz Quad-Core Intel Core i7 Mac running MacOS Ventura. +Tests used `timeit.Timer.autorange()`, repeated 3 times. +Time is reported in seconds taken per string after calculating the 'best of' time. +The [`timeit`][timeit] module docs have more details, and [note.nkmk.me][note_nkmk_me] has a nice summary of methods. + +[note_nkmk_me]: https://note.nkmk.me/en/python-timeit-measure/ +[timeit]: https://docs.python.org/3/library/timeit.html#python-interface +[approach-additional-approaches]: https://exercism.org/tracks/python/exercises/reverse-string/.approaches/additional-approaches diff --git a/exercises/practice/reverse-string/.approaches/list-and-join/snippet.txt b/exercises/practice/reverse-string/.approaches/list-and-join/snippet.txt new file mode 100644 index 0000000000..4ecd5cb181 --- /dev/null +++ b/exercises/practice/reverse-string/.approaches/list-and-join/snippet.txt @@ -0,0 +1,8 @@ +def reverse(text): + output = list(text) + start, end = 0, len(text) - 1 + while start < end: + output[start], output[end] = output[end], output[start] + start += 1 + end -= 1 + return "".join(output) \ No newline at end of file diff --git a/exercises/practice/reverse-string/.approaches/sequence-slicing/content.md b/exercises/practice/reverse-string/.approaches/sequence-slicing/content.md new file mode 100644 index 0000000000..2c85dbf19c --- /dev/null +++ b/exercises/practice/reverse-string/.approaches/sequence-slicing/content.md @@ -0,0 +1,48 @@ +# Sequence Slice with Negative Step + + +```python +def reverse(text): + return text[::-1] +``` + +This approach uses Python's negative indexes and _[sequence slices][sequence slicing]_ to iterate over the string in reverse order, returning a reversed copy. + + +
index from left ⟹ |
+
+| 0 👇🏾 | 1 👇🏾 | 2 👇🏾 | 3 👇🏾 | 4 👇🏾 | 5 👇🏾 | +|:--------: |:--------: |:--------: |:--------: |:--------: |:--------: | +| P | y | t | h | o | n | +| 👆🏾 -6 | 👆🏾 -5 | 👆🏾 -4 | 👆🏾 -3 | 👆🏾 -2 | 👆🏾 -1 | + | ⟸ index from right |
+