Releases: wasilibs/go-re2
v1.8.0
This release finally upgrades the bundled Wasm version of re2 to the latest 2024-07-02. Their switch to having a dependency on abseil meant reworking the build somewhat but we are now able to build it and should be able to keep up with any updates going forward through the new automated update workflow. Note that there haven't otherwise been big changes or security fixes in upstream re2 so we expect this to not have a significant effect on users, but it should feel good to be on the latest release.
Unfortunately, abseil has a hard requirement on threads support, meaning we cannot build it for TinyGo. As such, this release also removes official TinyGo support, notably removing the prebuilt Wasm artifacts. Some build tags are maintained so it is possible to use the library with TinyGo if the project brings their own Wasm archives by setting the cgo LDFLAGS variable correctly. One example of this is in coraza-wasilibs.
v1.7.0
v1.6.0
This is a relatively large release. While working with @itaischwartz to debug some issues with the cgo backend under extreme memory pressure, we found a need to rework how we execute Replace* methods by moving more logic from C++ into Go. The end result should not only help with the cgo issue, but also perform a little better by avoiding an extra allocation.
More importantly though, it allowed us to implement ReplaceFunc, which had been a missing API since launching the project. With that, the only methods not supported are *Reader, which are expected to be uncommonly used. Hopefully this makes it even easier to just "drop-in" go-re2 into an existing project.
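As a rough illustration of the drop-in idea, the sketch below uses the standard library's regexp package so that it runs without extra dependencies. The assumption, not verified here, is that swapping the import for github.com/wasilibs/go-re2 (aliased as regexp) leaves the rest of the code unchanged, and that ReplaceAllStringFunc is among the function-based replace methods this release refers to. The upperWords helper is hypothetical.

```go
package main

import (
	"fmt"
	"regexp" // assumption: could be swapped for: regexp "github.com/wasilibs/go-re2"
	"strings"
)

// wordRE matches runs of lowercase ASCII letters.
var wordRE = regexp.MustCompile(`[a-z]+`)

// upperWords is a hypothetical helper that upper-cases every match
// using the function-based replace API.
func upperWords(s string) string {
	return wordRE.ReplaceAllStringFunc(s, strings.ToUpper)
}

func main() {
	fmt.Println(upperWords("go-re2 v1.6.0")) // prints "GO-RE2 V1.6.0"
}
```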
In addition, a custom allocator that was used on unix is now also implemented for Windows, which should improve compatibility when running on systems with less available RAM. A fix was also made so the unix allocator can work on FreeBSD.
v1.5.3
This release fixes additional issues related to memory allocation.
- A custom allocator is used on unix. Previously, systems with strict vm.overcommit_memory settings could fail to load go-re2 due to limited total virtual address space.
- 32-bit systems are detected and have lower limits on allocation of virtual memory space. Note that on 32-bit systems, go-re2 will use a Wasm interpreter and will likely perform similarly to or worse than the standard library.
Full Changelog: v1.5.2...v1.5.3
v1.5.2
v1.5.1
This fixes an issue with the recently introduced shared memory on machines with lower total memory. Previously, we always attempted to allocate a static 4GB of virtual memory (this is virtual memory, not the physical memory that reflects actual usage). On systems with 4GB or less, this can fail since the OS does not allow virtual memory that cannot be served by RAM or swap. Now, we check total memory at runtime and reduce the virtual memory allocation if less than 4GB is available.
mmap behavior is platform specific so there could still be some issues with this elastic approach. Please let us know if you see any issues related to memory allocation to help us understand platform behavior better.
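The sizing rule described above can be sketched roughly as follows. This is not the library's actual code: the reservationSize helper is hypothetical, and capping the reservation at exactly the total memory is an assumption, since the notes only say the allocation is reduced when less than 4GB is available. How total memory is queried is platform specific and is left out.

```go
package main

import "fmt"

// maxReservation is the static 4GB reservation mentioned in the notes.
const maxReservation uint64 = 4 << 30

// reservationSize is a hypothetical helper. Given the total memory reported
// by the system, it returns how much virtual memory to reserve.
// Assumption: the reservation is capped at the total memory.
func reservationSize(totalMemory uint64) uint64 {
	if totalMemory < maxReservation {
		return totalMemory
	}
	return maxReservation
}

func main() {
	fmt.Println(reservationSize(2 << 30))  // 2GB machine: reserve 2GB
	fmt.Println(reservationSize(16 << 30)) // 16GB machine: reserve the full 4GB
}
```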
v1.5.0
This release is a huge one for the Wasm backend. tl;dr, memory usage should be dramatically reduced with a small improvement in CPU performance. This means that now, for any application with relatively complex regular expressions, there is expected to be no tradeoff for using this library.
There are two notable changes in this release: using shared memory among compiled regexes, and updating to wazero's new optimizing compiler.
Previously, WebAssembly only supported single-threaded execution within an instance of the Wasm module. To allow some concurrency within applications using go-re2's Wasm backend, we instantiated the Wasm module once per regex. This means that an individual regex still cannot be accessed concurrently, but multiple ones can. This in practice results in decent concurrency. However, the memory overhead of a fully instantiated Wasm module is relatively high - imagine all the error messages possible for invalid regular expressions, and them being copied into every single instantiated module. That is just one example of significant waste with this model.
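The earlier per-regex model can be pictured with the following sketch. It does not use go-re2 or wazero: the instance type is a hypothetical stand-in for one instantiated Wasm module, and a standard library regexp stands in for the module's private state. It only shows the locking shape, where calls on the same regex are serialized while different regexes run concurrently.

```go
package main

import (
	"fmt"
	"regexp"
	"sync"
)

// instance is a hypothetical stand-in for one single-threaded Wasm module.
// Every call must hold its lock, so one regex is never used concurrently.
type instance struct {
	mu sync.Mutex
	re *regexp.Regexp // stand-in for module-private state
}

func newInstance(pattern string) *instance {
	return &instance{re: regexp.MustCompile(pattern)}
}

// match serializes callers of the same instance.
func (i *instance) match(s string) bool {
	i.mu.Lock()
	defer i.mu.Unlock()
	return i.re.MatchString(s)
}

// countMatches evaluates every instance in its own goroutine, so different
// regexes proceed concurrently, and returns how many of them matched s.
func countMatches(insts []*instance, s string) int {
	var wg sync.WaitGroup
	var mu sync.Mutex
	n := 0
	for _, in := range insts {
		wg.Add(1)
		go func(in *instance) {
			defer wg.Done()
			if in.match(s) {
				mu.Lock()
				n++
				mu.Unlock()
			}
		}(in)
	}
	wg.Wait()
	return n
}

func main() {
	insts := []*instance{newInstance(`\d+`), newInstance(`[a-z]+`), newInstance(`^$`)}
	fmt.Println(countMatches(insts, "abc123")) // prints 2
}
```

The cost of this shape is that every instance carries its own copy of everything the module needs, which is the overhead the shared-memory approach removes.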
With WebAssembly threads support for shared memory, we can now instead instantiate the module once and rely on atomic operations within the Wasm to allow safe access to memory in the context of concurrent access. This removes wasted copies of static resources and minimizes the overhead of the Wasm backend. In some cases, we see an improvement of several hundred megabytes in overhead.
Unfortunately, switching to shared memory meant losing some optimizations such as buffer reuse that assumed non-concurrent access to memory. The overall performance impact from this seemed to be about 10-15% - it would probably still be worth it for the memory savings, but it's never fun to introduce performance regressions. Luckily, our friends at wazero launched a completely new optimizing compiler backend with a large improvement in performance. For go-re2, we saw around 20% improvement from switching to the new compiler. This means that overall, with this release, performance should be mostly the same with some possible single-digit gains, but with the memory usage tradeoff of the Wasm backend eliminated. The optimizing compiler was a huge effort and special thanks to @mathetake for driving that through to completion.
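The "single-digit gains" estimate follows from combining the two figures. The sketch below assumes both percentages describe multiplicative changes in throughput, which is one reading of the numbers rather than something the notes state. The netFactor helper is hypothetical.

```go
package main

import "fmt"

// netFactor combines a fractional slowdown with a fractional speedup,
// treating both as multiplicative changes in throughput (an assumption).
func netFactor(regression, improvement float64) float64 {
	return (1 - regression) * (1 + improvement)
}

func main() {
	fmt.Printf("%.2f\n", netFactor(0.10, 0.20)) // prints 1.08
	fmt.Printf("%.2f\n", netFactor(0.15, 0.20)) // prints 1.02
}
```

Under that reading, a 10-15% regression followed by a 20% improvement nets out at roughly 2-8% faster than before.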
v1.5.0-pre.1
This release significantly reworks the WebAssembly machinery by enabling threads support. Before it, we needed to make sure memory within regex processing was not concurrently accessed by multiple threads, and the mechanism we chose was to have Wasm memory per-regex and a per-regex lock. This allowed concurrency among different regexes being evaluated, but it also meant a large amount of extra memory used since the Wasm memory is monolithic for the entire re2 library.
Now, we use shared memory and regex locking is handled within re2 itself. This means that memory usage goes down dramatically - applications that use the cgo backend due to memory issues may want to consider Wasm again.
As logic has changed substantially and is using new features like Wasm threads, we will allow some baking time with pre.* releases.
Full Changelog: v1.4.1...v1.5.0-pre.1
v1.4.1
This release fixes concurrency issues related to getting subgroup names and cgo result pointers. Notably, the race detector is now enabled in CI tests.
Full Changelog: v1.4.0...v1.4.1
v1.4.0
1.4.0 fixes a memory leak when using the ReplaceAll family of functions where there are no matches to replace. It also updates wazero to 1.3, which includes various performance improvements.
Full Changelog: v1.3.0...v1.4.0