diff --git a/.gitignore b/.gitignore
index 9ce9f61..2ae49ce 100644
--- a/.gitignore
+++ b/.gitignore
@@ -1,6 +1,7 @@
 .DS_Store
 /.build
 /Packages
+.vscode/
 xcuserdata/
 DerivedData/
 .swiftpm/configuration/registries.json
@@ -56,8 +57,11 @@ fastlane/report.xml
 fastlane/Preview.html
 fastlane/screenshots
 fastlane/test_output
+fastlane/benchmark_data
+fastlane/upload_folder
 
 ### Xcode Patch ###
+**/*.xcconfig
 *.xcodeproj/*
 !*.xcodeproj/project.pbxproj
 !*.xcodeproj/xcshareddata/
diff --git a/BENCHMARKS.md b/BENCHMARKS.md
new file mode 100644
index 0000000..d5c0808
--- /dev/null
+++ b/BENCHMARKS.md
@@ -0,0 +1,120 @@
+# WhisperKit Benchmarks
+
+This document describes how to run the benchmarks for WhisperKit. The benchmarks can be run on a specific device or all connected devices. The results are saved in JSON files and can be uploaded to the [argmaxinc/whisperkit-evals-dataset](https://huggingface.co/datasets/argmaxinc/whisperkit-evals-dataset) dataset on HuggingFace as a pull request. Below are the steps to run the benchmarks locally in order to reproduce the results shown in our [WhisperKit Benchmarks](https://huggingface.co/spaces/argmaxinc/whisperkit-benchmarks) space.
+
+## Download the Source
+
+To download the code to run the test suite, run:
+
+```sh
+git clone git@github.com:argmaxinc/WhisperKit.git
+```
+
+## Local Environment
+
+Before running the benchmarks, you'll need to set up your local environment with the necessary dependencies. To do this, run:
+
+```sh
+make setup
+```
+
+See [Contributing](CONTRIBUTING.md) for more information.
+
+
+## Xcode Environment
+
+When running the tests, the model to test is provided to Xcode from Fastlane as an environment variable:
+
+1. Open the example project:
+
+```sh
+xed Examples/WhisperAX
+```
+
+2. At the top, you will see the app icon and `WhisperAX` written next to it. Click on `WhisperAX` and select `Edit Scheme` at the bottom.
+
+3. 
Under `Environment Variables`, you will see an entry with `MODEL_NAME` as the name and `$(MODEL_NAME)` as the value.
+
+## Devices
+
+> [!IMPORTANT]
+> An active developer account is required to run the tests on physical devices.
+
+Before running tests, all external devices need to be connected and paired to your Mac, as well as registered with your developer account. Ensure the devices are in Developer Mode. If nothing appears after connecting the devices via cable, press `Command + Shift + 2` to open the list of devices and track their progress.
+
+## Datasets
+
+The datasets for the test suite can be set in a global array called `datasets` in the file [`Tests/WhisperKitTests/RegressionTests.swift`](Tests/WhisperKitTests/RegressionTests.swift). It is prefilled with the datasets that are currently available.
+
+## Models
+
+The models for the test suite can be set in the [`Fastfile`](fastlane/Fastfile). Simply find `BENCHMARK_CONFIGS` and modify the `models` array under the benchmark you want to run.
+
+## Makefile and Fastlane
+
+The tests are run using [Fastlane](fastlane/Fastfile), which is controlled by a [Makefile](Makefile). The Makefile contains the following commands:
+
+### List Connected Devices
+
+Before running the tests it might be a good idea to list the connected devices to resolve any connection issues. Simply run:
+
+```sh
+make list-devices
+```
+
+The output will be a list with entries that look something like this:
+
+```ruby
+{
+  :name=>"My Mac",
+  :type=>"Apple M2 Pro",
+  :platform=>"macOS",
+  :os_version=>"15.0.1",
+  :product=>"Mac14,12",
+  :id=>"XXXXXXXX-1234-5678-9012-XXXXXXXXXXXX",
+  :state=>"connected"
+}
+```
+
+Verify that the devices are connected and the state is `connected`.
+
+### Running Benchmarks
+
+After completing the above steps, you can run the tests. Note that there are two different test configurations: one named `full` and the other named `debug`. 
To check for potential errors, run the `debug` tests:
+
+```sh
+make benchmark-devices DEBUG=true
+```
+
+Otherwise run the `full` tests:
+
+```sh
+make benchmark-devices
+```
+
+Optionally, for both tests, you can specify the list of devices for the tests using the `DEVICES` option:
+
+```sh
+make benchmark-devices DEVICES="iPhone 15 Pro Max,My Mac"
+```
+
+The `DEVICES` option is a comma-separated list of device names. The device names can be found by running `make list-devices` and using the value for the `:name` key.
+
+### Results
+
+After the tests are run, the generated results can be found under `fastlane/benchmark_data`, including the `.xcresult` file with logs and attachments for each device. There will also be a folder called `fastlane/upload_folder/benchmark_data` that contains only the JSON results from `fastlane/benchmark_data`, which can be used for further analysis.
+
+We will periodically run these tests on a range of devices and upload the results to the [argmaxinc/whisperkit-evals-dataset](https://huggingface.co/datasets/argmaxinc/whisperkit-evals-dataset), which will propagate to the [WhisperKit Benchmarks](https://huggingface.co/spaces/argmaxinc/whisperkit-benchmarks) space and be available for comparison.
+
+
+# Troubleshooting
+
+
+If you encounter issues while running the tests, here are a few things to try:
+
+1. Open the project in Xcode and run the tests directly from there.
+    1. To do this, open the example app (from command line type: `xed Examples/WhisperAX`) and run the test named `RegressionTests/testModelPerformanceWithDebugConfig` from the test navigator.
+    2. If the tests run successfully, you can rule out any issues with the device or the models.
+    3. If they don't run successfully, Xcode will provide more detailed error messages.
+2. Try specifying a single device to run the tests on. This can be done by running `make list-devices` and then running the tests with the `DEVICES` option set to the name of the device you want to test on. 
For example, `make benchmark-devices DEVICES="My Mac"`. This will also enable you to see the logs for that specific device. +3. If you are still encountering issues, please reach out to us on the [Discord](https://discord.gg/G5F5GZGecC) or create an [issue](https://github.com/argmaxinc/WhisperKit/issues) on GitHub. diff --git a/Examples/WhisperAX/Debug.xcconfig b/Examples/WhisperAX/Debug.xcconfig new file mode 100644 index 0000000..bd29b90 --- /dev/null +++ b/Examples/WhisperAX/Debug.xcconfig @@ -0,0 +1,8 @@ +// For licensing see accompanying LICENSE.md file. +// Copyright © 2024 Argmax, Inc. All rights reserved. + +// Configuration settings file format documentation can be found at: +// https://help.apple.com/xcode/#/dev745c5c974 + +CODE_SIGN_STYLE=Automatic +DEVELOPMENT_TEAM= diff --git a/Examples/WhisperAX/WhisperAX.xcodeproj/project.pbxproj b/Examples/WhisperAX/WhisperAX.xcodeproj/project.pbxproj index bfb9069..c70edeb 100644 --- a/Examples/WhisperAX/WhisperAX.xcodeproj/project.pbxproj +++ b/Examples/WhisperAX/WhisperAX.xcodeproj/project.pbxproj @@ -8,6 +8,18 @@ /* Begin PBXBuildFile section */ 161136102B3F6C68003C20F6 /* WhisperKit in Frameworks */ = {isa = PBXBuildFile; productRef = 1611360F2B3F6C68003C20F6 /* WhisperKit */; }; + 164F15AF2C91A51D00C715CB /* DistanceCalculation.swift in Sources */ = {isa = PBXBuildFile; fileRef = 164F15AE2C91A51D00C715CB /* DistanceCalculation.swift */; }; + 16548D3A2C7BB90B002BAE86 /* NormalizeEn.swift in Sources */ = {isa = PBXBuildFile; fileRef = 16548D292C7BB90B002BAE86 /* NormalizeEn.swift */; }; + 16548D3B2C7BB90B002BAE86 /* SpellingMapping.swift in Sources */ = {isa = PBXBuildFile; fileRef = 16548D2A2C7BB90B002BAE86 /* SpellingMapping.swift */; }; + 16548D3C2C7BB90B002BAE86 /* WERUtils.swift in Sources */ = {isa = PBXBuildFile; fileRef = 16548D2B2C7BB90B002BAE86 /* WERUtils.swift */; }; + 16548D3D2C7BB90B002BAE86 /* es_test_clip.wav in Resources */ = {isa = PBXBuildFile; fileRef = 16548D2D2C7BB90B002BAE86 /* 
es_test_clip.wav */; }; + 16548D3E2C7BB90B002BAE86 /* ja_test_clip.wav in Resources */ = {isa = PBXBuildFile; fileRef = 16548D2E2C7BB90B002BAE86 /* ja_test_clip.wav */; }; + 16548D3F2C7BB90B002BAE86 /* jfk_441khz.m4a in Resources */ = {isa = PBXBuildFile; fileRef = 16548D2F2C7BB90B002BAE86 /* jfk_441khz.m4a */; }; + 16548D402C7BB90B002BAE86 /* jfk.wav in Resources */ = {isa = PBXBuildFile; fileRef = 16548D302C7BB90B002BAE86 /* jfk.wav */; }; + 16548D412C7BB90B002BAE86 /* ted_60.m4a in Resources */ = {isa = PBXBuildFile; fileRef = 16548D312C7BB90B002BAE86 /* ted_60.m4a */; }; + 16548D432C7BB90B002BAE86 /* RegressionTestUtils.swift in Sources */ = {isa = PBXBuildFile; fileRef = 16548D342C7BB90B002BAE86 /* RegressionTestUtils.swift */; }; + 16548D442C7BB90B002BAE86 /* RegressionTests.swift in Sources */ = {isa = PBXBuildFile; fileRef = 16548D352C7BB90B002BAE86 /* RegressionTests.swift */; }; + 16548D452C7BB90B002BAE86 /* TestUtils.swift in Sources */ = {isa = PBXBuildFile; fileRef = 16548D362C7BB90B002BAE86 /* TestUtils.swift */; }; 1677AFC22B57618A008C61C0 /* WhisperAXApp.swift in Sources */ = {isa = PBXBuildFile; fileRef = 1677AFAB2B57618A008C61C0 /* WhisperAXApp.swift */; }; 1677AFC42B57618A008C61C0 /* Preview Assets.xcassets in Resources */ = {isa = PBXBuildFile; fileRef = 1677AFAE2B57618A008C61C0 /* Preview Assets.xcassets */; }; 1677AFC92B57618A008C61C0 /* Assets.xcassets in Resources */ = {isa = PBXBuildFile; fileRef = 1677AFB42B57618A008C61C0 /* Assets.xcassets */; }; @@ -85,6 +97,18 @@ 161135F02B3F66DC003C20F6 /* WhisperAX Watch AppTests.xctest */ = {isa = PBXFileReference; explicitFileType = wrapper.cfbundle; includeInIndex = 0; path = "WhisperAX Watch AppTests.xctest"; sourceTree = BUILT_PRODUCTS_DIR; }; 161135FA2B3F66DC003C20F6 /* WhisperAX Watch AppUITests.xctest */ = {isa = PBXFileReference; explicitFileType = wrapper.cfbundle; includeInIndex = 0; path = "WhisperAX Watch AppUITests.xctest"; sourceTree = BUILT_PRODUCTS_DIR; }; 1626683A2BB90CC9008F950A /* 
Info.plist */ = {isa = PBXFileReference; lastKnownFileType = text.plist; path = Info.plist; sourceTree = ""; }; + 164F15AE2C91A51D00C715CB /* DistanceCalculation.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = DistanceCalculation.swift; sourceTree = ""; }; + 16548D292C7BB90B002BAE86 /* NormalizeEn.swift */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.swift; path = NormalizeEn.swift; sourceTree = ""; }; + 16548D2A2C7BB90B002BAE86 /* SpellingMapping.swift */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.swift; path = SpellingMapping.swift; sourceTree = ""; }; + 16548D2B2C7BB90B002BAE86 /* WERUtils.swift */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.swift; path = WERUtils.swift; sourceTree = ""; }; + 16548D2D2C7BB90B002BAE86 /* es_test_clip.wav */ = {isa = PBXFileReference; lastKnownFileType = audio.wav; path = es_test_clip.wav; sourceTree = ""; }; + 16548D2E2C7BB90B002BAE86 /* ja_test_clip.wav */ = {isa = PBXFileReference; lastKnownFileType = audio.wav; path = ja_test_clip.wav; sourceTree = ""; }; + 16548D2F2C7BB90B002BAE86 /* jfk_441khz.m4a */ = {isa = PBXFileReference; lastKnownFileType = file; path = jfk_441khz.m4a; sourceTree = ""; }; + 16548D302C7BB90B002BAE86 /* jfk.wav */ = {isa = PBXFileReference; lastKnownFileType = audio.wav; path = jfk.wav; sourceTree = ""; }; + 16548D312C7BB90B002BAE86 /* ted_60.m4a */ = {isa = PBXFileReference; lastKnownFileType = file; path = ted_60.m4a; sourceTree = ""; }; + 16548D342C7BB90B002BAE86 /* RegressionTestUtils.swift */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.swift; path = RegressionTestUtils.swift; sourceTree = ""; }; + 16548D352C7BB90B002BAE86 /* RegressionTests.swift */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.swift; path = RegressionTests.swift; sourceTree = ""; }; + 16548D362C7BB90B002BAE86 /* TestUtils.swift */ = {isa = 
PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.swift; path = TestUtils.swift; sourceTree = ""; }; 1677AFA62B57618A008C61C0 /* WhisperAX_Watch_AppUITests.swift */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.swift; path = WhisperAX_Watch_AppUITests.swift; sourceTree = ""; }; 1677AFA72B57618A008C61C0 /* WhisperAX_Watch_AppUITestsLaunchTests.swift */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.swift; path = WhisperAX_Watch_AppUITestsLaunchTests.swift; sourceTree = ""; }; 1677AFA92B57618A008C61C0 /* WhisperAX.entitlements */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = text.plist.entitlements; path = WhisperAX.entitlements; sourceTree = ""; }; @@ -103,6 +127,7 @@ 167B345E2B05431E0076F261 /* WhisperAX.app */ = {isa = PBXFileReference; explicitFileType = wrapper.application; includeInIndex = 0; path = WhisperAX.app; sourceTree = BUILT_PRODUCTS_DIR; }; 167B34712B05431F0076F261 /* WhisperAXTests.xctest */ = {isa = PBXFileReference; explicitFileType = wrapper.cfbundle; includeInIndex = 0; path = WhisperAXTests.xctest; sourceTree = BUILT_PRODUCTS_DIR; }; 167B347B2B05431F0076F261 /* WhisperAXUITests.xctest */ = {isa = PBXFileReference; explicitFileType = wrapper.cfbundle; includeInIndex = 0; path = WhisperAXUITests.xctest; sourceTree = BUILT_PRODUCTS_DIR; }; + 169144182CCEEE87009903CA /* Debug.xcconfig */ = {isa = PBXFileReference; lastKnownFileType = text.xcconfig; path = Debug.xcconfig; sourceTree = ""; }; /* End PBXFileReference section */ /* Begin PBXFrameworksBuildPhase section */ @@ -153,6 +178,41 @@ /* End PBXFrameworksBuildPhase section */ /* Begin PBXGroup section */ + 16548D2C2C7BB90B002BAE86 /* Evaluate */ = { + isa = PBXGroup; + children = ( + 164F15AE2C91A51D00C715CB /* DistanceCalculation.swift */, + 16548D292C7BB90B002BAE86 /* NormalizeEn.swift */, + 16548D2A2C7BB90B002BAE86 /* SpellingMapping.swift */, + 16548D2B2C7BB90B002BAE86 /* WERUtils.swift */, + ); + 
path = Evaluate; + sourceTree = ""; + }; + 16548D322C7BB90B002BAE86 /* Resources */ = { + isa = PBXGroup; + children = ( + 16548D2D2C7BB90B002BAE86 /* es_test_clip.wav */, + 16548D2E2C7BB90B002BAE86 /* ja_test_clip.wav */, + 16548D2F2C7BB90B002BAE86 /* jfk_441khz.m4a */, + 16548D302C7BB90B002BAE86 /* jfk.wav */, + 16548D312C7BB90B002BAE86 /* ted_60.m4a */, + ); + path = Resources; + sourceTree = ""; + }; + 16548D382C7BB90B002BAE86 /* WhisperKitTests */ = { + isa = PBXGroup; + children = ( + 16548D2C2C7BB90B002BAE86 /* Evaluate */, + 16548D322C7BB90B002BAE86 /* Resources */, + 16548D352C7BB90B002BAE86 /* RegressionTests.swift */, + 16548D342C7BB90B002BAE86 /* RegressionTestUtils.swift */, + 16548D362C7BB90B002BAE86 /* TestUtils.swift */, + ); + path = WhisperKitTests; + sourceTree = ""; + }; 1677AFA52B57618A008C61C0 /* WhisperAXWatchAppUITests */ = { isa = PBXGroup; children = ( @@ -214,6 +274,7 @@ isa = PBXGroup; children = ( 1677AFBC2B57618A008C61C0 /* WhisperAXTests.swift */, + 16548D382C7BB90B002BAE86 /* WhisperKitTests */, ); path = WhisperAXTests; sourceTree = ""; @@ -246,6 +307,7 @@ 167B34552B05431E0076F261 = { isa = PBXGroup; children = ( + 169144182CCEEE87009903CA /* Debug.xcconfig */, 1677AFA82B57618A008C61C0 /* WhisperAX */, 1677AFBB2B57618A008C61C0 /* WhisperAXTests */, 1677AFB82B57618A008C61C0 /* WhisperAXUITests */, @@ -402,7 +464,7 @@ attributes = { BuildIndependentTargetsInParallel = 1; LastSwiftUpdateCheck = 1520; - LastUpgradeCheck = 1520; + LastUpgradeCheck = 1600; TargetAttributes = { 161135DD2B3F66DA003C20F6 = { CreatedOnToolsVersion = 15.1; @@ -439,7 +501,6 @@ mainGroup = 167B34552B05431E0076F261; packageReferences = ( 161135D62B3F66A6003C20F6 /* XCLocalSwiftPackageReference "../.." 
*/, - 16D581062B4F7DCE000C0AB0 /* XCRemoteSwiftPackageReference "swift-markdown-ui" */, ); productRefGroup = 167B345F2B05431E0076F261 /* Products */; projectDirPath = ""; @@ -494,6 +555,11 @@ isa = PBXResourcesBuildPhase; buildActionMask = 2147483647; files = ( + 16548D3D2C7BB90B002BAE86 /* es_test_clip.wav in Resources */, + 16548D3E2C7BB90B002BAE86 /* ja_test_clip.wav in Resources */, + 16548D402C7BB90B002BAE86 /* jfk.wav in Resources */, + 16548D3F2C7BB90B002BAE86 /* jfk_441khz.m4a in Resources */, + 16548D412C7BB90B002BAE86 /* ted_60.m4a in Resources */, ); runOnlyForDeploymentPostprocessing = 0; }; @@ -546,8 +612,15 @@ isa = PBXSourcesBuildPhase; buildActionMask = 2147483647; files = ( + 16548D3A2C7BB90B002BAE86 /* NormalizeEn.swift in Sources */, + 16548D3B2C7BB90B002BAE86 /* SpellingMapping.swift in Sources */, + 16548D432C7BB90B002BAE86 /* RegressionTestUtils.swift in Sources */, + 16548D452C7BB90B002BAE86 /* TestUtils.swift in Sources */, 1677AFDA2B5763BA008C61C0 /* WhisperAXUITestsLaunchTests.swift in Sources */, 1677AFDC2B5763C0008C61C0 /* WhisperAXTests.swift in Sources */, + 164F15AF2C91A51D00C715CB /* DistanceCalculation.swift in Sources */, + 16548D3C2C7BB90B002BAE86 /* WERUtils.swift in Sources */, + 16548D442C7BB90B002BAE86 /* RegressionTests.swift in Sources */, ); runOnlyForDeploymentPostprocessing = 0; }; @@ -593,6 +666,7 @@ /* Begin XCBuildConfiguration section */ 161136032B3F66DC003C20F6 /* Debug */ = { isa = XCBuildConfiguration; + baseConfigurationReference = 169144182CCEEE87009903CA /* Debug.xcconfig */; buildSettings = { ASSETCATALOG_COMPILER_APPICON_NAME = AppIcon; ASSETCATALOG_COMPILER_GLOBAL_ACCENT_COLOR_NAME = AccentColor; @@ -600,7 +674,7 @@ CODE_SIGN_STYLE = Automatic; CURRENT_PROJECT_VERSION = 1; DEVELOPMENT_ASSET_PATHS = "\"WhisperAXWatchApp/Preview Content\""; - DEVELOPMENT_TEAM = PP83DTRKSA; + DEVELOPMENT_TEAM = "${DEVELOPMENT_TEAM}"; ENABLE_PREVIEWS = YES; GENERATE_INFOPLIST_FILE = YES; INFOPLIST_KEY_NSMicrophoneUsageDescription 
= "Required to record audio from the microphone for transcription."; @@ -626,6 +700,7 @@ }; 161136042B3F66DC003C20F6 /* Release */ = { isa = XCBuildConfiguration; + baseConfigurationReference = 169144182CCEEE87009903CA /* Debug.xcconfig */; buildSettings = { ASSETCATALOG_COMPILER_APPICON_NAME = AppIcon; ASSETCATALOG_COMPILER_GLOBAL_ACCENT_COLOR_NAME = AccentColor; @@ -633,19 +708,19 @@ CODE_SIGN_STYLE = Automatic; CURRENT_PROJECT_VERSION = 1; DEVELOPMENT_ASSET_PATHS = "\"WhisperAXWatchApp/Preview Content\""; - DEVELOPMENT_TEAM = PP83DTRKSA; + DEVELOPMENT_TEAM = "${DEVELOPMENT_TEAM}"; ENABLE_PREVIEWS = YES; GENERATE_INFOPLIST_FILE = YES; INFOPLIST_KEY_NSMicrophoneUsageDescription = "Required to record audio from the microphone for transcription."; INFOPLIST_KEY_UISupportedInterfaceOrientations = "UIInterfaceOrientationPortrait UIInterfaceOrientationPortraitUpsideDown"; - INFOPLIST_KEY_WKCompanionAppBundleIdentifier = com.argmax.whisperkit.WhisperAX; + INFOPLIST_KEY_WKCompanionAppBundleIdentifier = "com.argmax.whisperkit.WhisperAX${DEVELOPMENT_TEAM}"; INFOPLIST_KEY_WKRunsIndependentlyOfCompanionApp = YES; LD_RUNPATH_SEARCH_PATHS = ( "$(inherited)", "@executable_path/Frameworks", ); MARKETING_VERSION = 0.1.2; - PRODUCT_BUNDLE_IDENTIFIER = com.argmax.whisperkit.WhisperAX.watchapp; + PRODUCT_BUNDLE_IDENTIFIER = "com.argmax.whisperkit.WhisperAX${DEVELOPMENT_TEAM}.watchapp"; PRODUCT_NAME = "WhisperAX Watch App"; PROVISIONING_PROFILE_SPECIFIER = ""; SDKROOT = watchos; @@ -660,15 +735,15 @@ }; 1611360A2B3F66DC003C20F6 /* Debug */ = { isa = XCBuildConfiguration; + baseConfigurationReference = 169144182CCEEE87009903CA /* Debug.xcconfig */; buildSettings = { - ALWAYS_EMBED_SWIFT_STANDARD_LIBRARIES = YES; BUNDLE_LOADER = "$(TEST_HOST)"; CODE_SIGN_STYLE = Automatic; CURRENT_PROJECT_VERSION = 1; - DEVELOPMENT_TEAM = JSGZDY54HP; + DEVELOPMENT_TEAM = "${DEVELOPMENT_TEAM}"; GENERATE_INFOPLIST_FILE = YES; MARKETING_VERSION = 1.0; - PRODUCT_BUNDLE_IDENTIFIER = 
com.argmax.whisperkit.WhisperAX.watchkitapp.tests; + PRODUCT_BUNDLE_IDENTIFIER = "com.argmax.whisperkit.WhisperAX${DEVELOPMENT_TEAM}.watchkitapp.tests"; PRODUCT_NAME = "$(TARGET_NAME)"; SDKROOT = watchos; SWIFT_EMIT_LOC_STRINGS = NO; @@ -681,15 +756,15 @@ }; 1611360B2B3F66DC003C20F6 /* Release */ = { isa = XCBuildConfiguration; + baseConfigurationReference = 169144182CCEEE87009903CA /* Debug.xcconfig */; buildSettings = { - ALWAYS_EMBED_SWIFT_STANDARD_LIBRARIES = YES; BUNDLE_LOADER = "$(TEST_HOST)"; CODE_SIGN_STYLE = Automatic; CURRENT_PROJECT_VERSION = 1; - DEVELOPMENT_TEAM = JSGZDY54HP; + DEVELOPMENT_TEAM = "${DEVELOPMENT_TEAM}"; GENERATE_INFOPLIST_FILE = YES; MARKETING_VERSION = 1.0; - PRODUCT_BUNDLE_IDENTIFIER = com.argmax.whisperkit.WhisperAX.watchkitapp.tests; + PRODUCT_BUNDLE_IDENTIFIER = "com.argmax.whisperkit.WhisperAX${DEVELOPMENT_TEAM}.watchkitapp.tests"; PRODUCT_NAME = "$(TARGET_NAME)"; SDKROOT = watchos; SWIFT_EMIT_LOC_STRINGS = NO; @@ -703,14 +778,14 @@ }; 1611360D2B3F66DC003C20F6 /* Debug */ = { isa = XCBuildConfiguration; + baseConfigurationReference = 169144182CCEEE87009903CA /* Debug.xcconfig */; buildSettings = { - ALWAYS_EMBED_SWIFT_STANDARD_LIBRARIES = YES; CODE_SIGN_STYLE = Automatic; CURRENT_PROJECT_VERSION = 1; - DEVELOPMENT_TEAM = JSGZDY54HP; + DEVELOPMENT_TEAM = "${DEVELOPMENT_TEAM}"; GENERATE_INFOPLIST_FILE = YES; MARKETING_VERSION = 1.0; - PRODUCT_BUNDLE_IDENTIFIER = com.argmax.whisperkit.WhisperAX.watchkitappUITests; + PRODUCT_BUNDLE_IDENTIFIER = "com.argmax.whisperkit.WhisperAX${DEVELOPMENT_TEAM}.watchkitappUITests"; PRODUCT_NAME = "$(TARGET_NAME)"; SDKROOT = watchos; SWIFT_EMIT_LOC_STRINGS = NO; @@ -723,14 +798,14 @@ }; 1611360E2B3F66DC003C20F6 /* Release */ = { isa = XCBuildConfiguration; + baseConfigurationReference = 169144182CCEEE87009903CA /* Debug.xcconfig */; buildSettings = { - ALWAYS_EMBED_SWIFT_STANDARD_LIBRARIES = YES; CODE_SIGN_STYLE = Automatic; CURRENT_PROJECT_VERSION = 1; - DEVELOPMENT_TEAM = JSGZDY54HP; + 
DEVELOPMENT_TEAM = "${DEVELOPMENT_TEAM}"; GENERATE_INFOPLIST_FILE = YES; MARKETING_VERSION = 1.0; - PRODUCT_BUNDLE_IDENTIFIER = com.argmax.whisperkit.WhisperAX.watchkitappUITests; + PRODUCT_BUNDLE_IDENTIFIER = "com.argmax.whisperkit.WhisperAX${DEVELOPMENT_TEAM}.watchkitappUITests"; PRODUCT_NAME = "$(TARGET_NAME)"; SDKROOT = watchos; SWIFT_EMIT_LOC_STRINGS = NO; @@ -744,6 +819,7 @@ }; 167B34832B05431F0076F261 /* Debug */ = { isa = XCBuildConfiguration; + baseConfigurationReference = 169144182CCEEE87009903CA /* Debug.xcconfig */; buildSettings = { ALWAYS_SEARCH_USER_PATHS = NO; ASSETCATALOG_COMPILER_GENERATE_SWIFT_ASSET_SYMBOL_EXTENSIONS = YES; @@ -806,6 +882,7 @@ }; 167B34842B05431F0076F261 /* Release */ = { isa = XCBuildConfiguration; + baseConfigurationReference = 169144182CCEEE87009903CA /* Debug.xcconfig */; buildSettings = { ALWAYS_SEARCH_USER_PATHS = NO; ASSETCATALOG_COMPILER_GENERATE_SWIFT_ASSET_SYMBOL_EXTENSIONS = YES; @@ -860,16 +937,17 @@ }; 167B34862B05431F0076F261 /* Debug */ = { isa = XCBuildConfiguration; + baseConfigurationReference = 169144182CCEEE87009903CA /* Debug.xcconfig */; buildSettings = { - ALWAYS_EMBED_SWIFT_STANDARD_LIBRARIES = YES; ASSETCATALOG_COMPILER_APPICON_NAME = AppIcon; ASSETCATALOG_COMPILER_GLOBAL_ACCENT_COLOR_NAME = AccentColor; CODE_SIGN_ENTITLEMENTS = WhisperAX/Resources/WhisperAX.entitlements; + CODE_SIGN_IDENTITY = "Apple Development"; CODE_SIGN_STYLE = Automatic; CURRENT_PROJECT_VERSION = 1; DEAD_CODE_STRIPPING = YES; DEVELOPMENT_ASSET_PATHS = "\"WhisperAX/Preview Content\""; - DEVELOPMENT_TEAM = PP83DTRKSA; + DEVELOPMENT_TEAM = "${DEVELOPMENT_TEAM}"; ENABLE_HARDENED_RUNTIME = YES; ENABLE_PREVIEWS = YES; GENERATE_INFOPLIST_FILE = YES; @@ -890,9 +968,10 @@ LD_RUNPATH_SEARCH_PATHS = "@executable_path/Frameworks"; "LD_RUNPATH_SEARCH_PATHS[sdk=macosx*]" = "@executable_path/../Frameworks"; MACOSX_DEPLOYMENT_TARGET = 14.0; - MARKETING_VERSION = 0.3.2; + MARKETING_VERSION = 0.4.0; PRODUCT_BUNDLE_IDENTIFIER = 
"com.argmax.whisperkit.WhisperAX${DEVELOPMENT_TEAM}"; PRODUCT_NAME = "$(TARGET_NAME)"; + PROVISIONING_PROFILE_SPECIFIER = ""; SDKROOT = auto; SUPPORTED_PLATFORMS = "iphoneos iphonesimulator macosx"; SUPPORTS_MACCATALYST = NO; @@ -906,8 +985,8 @@ }; 167B34872B05431F0076F261 /* Release */ = { isa = XCBuildConfiguration; + baseConfigurationReference = 169144182CCEEE87009903CA /* Debug.xcconfig */; buildSettings = { - ALWAYS_EMBED_SWIFT_STANDARD_LIBRARIES = YES; ASSETCATALOG_COMPILER_APPICON_NAME = AppIcon; ASSETCATALOG_COMPILER_GLOBAL_ACCENT_COLOR_NAME = AccentColor; CODE_SIGN_ENTITLEMENTS = WhisperAX/Resources/WhisperAX.entitlements; @@ -915,7 +994,7 @@ CURRENT_PROJECT_VERSION = 1; DEAD_CODE_STRIPPING = YES; DEVELOPMENT_ASSET_PATHS = "\"WhisperAX/Preview Content\""; - DEVELOPMENT_TEAM = PP83DTRKSA; + DEVELOPMENT_TEAM = "${DEVELOPMENT_TEAM}"; ENABLE_HARDENED_RUNTIME = YES; ENABLE_PREVIEWS = YES; GENERATE_INFOPLIST_FILE = YES; @@ -936,8 +1015,8 @@ LD_RUNPATH_SEARCH_PATHS = "@executable_path/Frameworks"; "LD_RUNPATH_SEARCH_PATHS[sdk=macosx*]" = "@executable_path/../Frameworks"; MACOSX_DEPLOYMENT_TARGET = 14.0; - MARKETING_VERSION = 0.3.2; - PRODUCT_BUNDLE_IDENTIFIER = com.argmax.whisperkit.WhisperAX; + MARKETING_VERSION = 0.4.0; + PRODUCT_BUNDLE_IDENTIFIER = "com.argmax.whisperkit.WhisperAX${DEVELOPMENT_TEAM}"; PRODUCT_NAME = "$(TARGET_NAME)"; SDKROOT = auto; SUPPORTED_PLATFORMS = "iphoneos iphonesimulator macosx"; @@ -951,18 +1030,18 @@ }; 167B34892B05431F0076F261 /* Debug */ = { isa = XCBuildConfiguration; + baseConfigurationReference = 169144182CCEEE87009903CA /* Debug.xcconfig */; buildSettings = { - ALWAYS_EMBED_SWIFT_STANDARD_LIBRARIES = YES; BUNDLE_LOADER = "$(TEST_HOST)"; CODE_SIGN_STYLE = Automatic; CURRENT_PROJECT_VERSION = 1; DEAD_CODE_STRIPPING = YES; - DEVELOPMENT_TEAM = JSGZDY54HP; + DEVELOPMENT_TEAM = "${DEVELOPMENT_TEAM}"; GENERATE_INFOPLIST_FILE = YES; IPHONEOS_DEPLOYMENT_TARGET = 17.0; MACOSX_DEPLOYMENT_TARGET = 14.0; MARKETING_VERSION = 1.0; - 
PRODUCT_BUNDLE_IDENTIFIER = com.argmax.whisperkit.WhisperAXTests; + PRODUCT_BUNDLE_IDENTIFIER = "com.argmax.whisperkit.WhisperAXTests${DEVELOPMENT_TEAM}"; PRODUCT_NAME = "$(TARGET_NAME)"; SDKROOT = auto; SUPPORTED_PLATFORMS = "iphoneos iphonesimulator macosx"; @@ -975,18 +1054,18 @@ }; 167B348A2B05431F0076F261 /* Release */ = { isa = XCBuildConfiguration; + baseConfigurationReference = 169144182CCEEE87009903CA /* Debug.xcconfig */; buildSettings = { - ALWAYS_EMBED_SWIFT_STANDARD_LIBRARIES = YES; BUNDLE_LOADER = "$(TEST_HOST)"; CODE_SIGN_STYLE = Automatic; CURRENT_PROJECT_VERSION = 1; DEAD_CODE_STRIPPING = YES; - DEVELOPMENT_TEAM = JSGZDY54HP; + DEVELOPMENT_TEAM = "${DEVELOPMENT_TEAM}"; GENERATE_INFOPLIST_FILE = YES; IPHONEOS_DEPLOYMENT_TARGET = 17.0; MACOSX_DEPLOYMENT_TARGET = 14.0; MARKETING_VERSION = 1.0; - PRODUCT_BUNDLE_IDENTIFIER = com.argmax.whisperkit.WhisperAXTests; + PRODUCT_BUNDLE_IDENTIFIER = "com.argmax.whisperkit.WhisperAXTests${DEVELOPMENT_TEAM}"; PRODUCT_NAME = "$(TARGET_NAME)"; SDKROOT = auto; SUPPORTED_PLATFORMS = "iphoneos iphonesimulator macosx"; @@ -999,17 +1078,17 @@ }; 167B348C2B05431F0076F261 /* Debug */ = { isa = XCBuildConfiguration; + baseConfigurationReference = 169144182CCEEE87009903CA /* Debug.xcconfig */; buildSettings = { - ALWAYS_EMBED_SWIFT_STANDARD_LIBRARIES = YES; CODE_SIGN_STYLE = Automatic; CURRENT_PROJECT_VERSION = 1; DEAD_CODE_STRIPPING = YES; - DEVELOPMENT_TEAM = JSGZDY54HP; + DEVELOPMENT_TEAM = "${DEVELOPMENT_TEAM}"; GENERATE_INFOPLIST_FILE = YES; IPHONEOS_DEPLOYMENT_TARGET = 17.0; MACOSX_DEPLOYMENT_TARGET = 14.0; MARKETING_VERSION = 1.0; - PRODUCT_BUNDLE_IDENTIFIER = com.argmax.whisperkit.WhisperAXUITests; + PRODUCT_BUNDLE_IDENTIFIER = "com.argmax.whisperkit.WhisperAXUITests${DEVELOPMENT_TEAM}"; PRODUCT_NAME = "$(TARGET_NAME)"; SDKROOT = auto; SUPPORTED_PLATFORMS = "iphoneos iphonesimulator macosx"; @@ -1022,17 +1101,17 @@ }; 167B348D2B05431F0076F261 /* Release */ = { isa = XCBuildConfiguration; + baseConfigurationReference 
= 169144182CCEEE87009903CA /* Debug.xcconfig */; buildSettings = { - ALWAYS_EMBED_SWIFT_STANDARD_LIBRARIES = YES; CODE_SIGN_STYLE = Automatic; CURRENT_PROJECT_VERSION = 1; DEAD_CODE_STRIPPING = YES; - DEVELOPMENT_TEAM = JSGZDY54HP; + DEVELOPMENT_TEAM = "${DEVELOPMENT_TEAM}"; GENERATE_INFOPLIST_FILE = YES; IPHONEOS_DEPLOYMENT_TARGET = 17.0; MACOSX_DEPLOYMENT_TARGET = 14.0; MARKETING_VERSION = 1.0; - PRODUCT_BUNDLE_IDENTIFIER = com.argmax.whisperkit.WhisperAXUITests; + PRODUCT_BUNDLE_IDENTIFIER = "com.argmax.whisperkit.WhisperAXUITests${DEVELOPMENT_TEAM}"; PRODUCT_NAME = "$(TARGET_NAME)"; SDKROOT = auto; SUPPORTED_PLATFORMS = "iphoneos iphonesimulator macosx"; @@ -1118,17 +1197,6 @@ }; /* End XCLocalSwiftPackageReference section */ -/* Begin XCRemoteSwiftPackageReference section */ - 16D581062B4F7DCE000C0AB0 /* XCRemoteSwiftPackageReference "swift-markdown-ui" */ = { - isa = XCRemoteSwiftPackageReference; - repositoryURL = "https://github.com/gonzalezreal/swift-markdown-ui.git"; - requirement = { - kind = upToNextMajorVersion; - minimumVersion = 2.3.0; - }; - }; -/* End XCRemoteSwiftPackageReference section */ - /* Begin XCSwiftPackageProductDependency section */ 1611360F2B3F6C68003C20F6 /* WhisperKit */ = { isa = XCSwiftPackageProductDependency; diff --git a/Examples/WhisperAX/WhisperAX.xcodeproj/project.xcworkspace/xcshareddata/swiftpm/Package.resolved b/Examples/WhisperAX/WhisperAX.xcodeproj/project.xcworkspace/xcshareddata/swiftpm/Package.resolved index 8c640f8..95f8598 100644 --- a/Examples/WhisperAX/WhisperAX.xcodeproj/project.xcworkspace/xcshareddata/swiftpm/Package.resolved +++ b/Examples/WhisperAX/WhisperAX.xcodeproj/project.xcworkspace/xcshareddata/swiftpm/Package.resolved @@ -1,15 +1,6 @@ { - "originHash" : "cd17206b47bb810af9459722192530e3838d8e6629a970988e32a432aaa05f6e", + "originHash" : "831ad63194a5262b2549d58e383a520f9cbbc80b4a75660fbbcc56d65edfdab4", "pins" : [ - { - "identity" : "networkimage", - "kind" : "remoteSourceControl", - "location" : 
"https://github.com/gonzalezreal/NetworkImage", - "state" : { - "revision" : "7aff8d1b31148d32c5933d75557d42f6323ee3d1", - "version" : "6.0.0" - } - }, { "identity" : "swift-argument-parser", "kind" : "remoteSourceControl", @@ -19,15 +10,6 @@ "version" : "1.3.0" } }, - { - "identity" : "swift-markdown-ui", - "kind" : "remoteSourceControl", - "location" : "https://github.com/gonzalezreal/swift-markdown-ui.git", - "state" : { - "revision" : "ae799d015a5374708f7b4c85f3294c05f2a564e2", - "version" : "2.3.0" - } - }, { "identity" : "swift-transformers", "kind" : "remoteSourceControl", diff --git a/Examples/WhisperAX/WhisperAX.xcodeproj/xcshareddata/xcschemes/WhisperAX.xcscheme b/Examples/WhisperAX/WhisperAX.xcodeproj/xcshareddata/xcschemes/WhisperAX.xcscheme new file mode 100644 index 0000000..236ed0e --- /dev/null +++ b/Examples/WhisperAX/WhisperAX.xcodeproj/xcshareddata/xcschemes/WhisperAX.xcscheme @@ -0,0 +1,108 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/Examples/WhisperAX/WhisperAX/Views/ContentView.swift b/Examples/WhisperAX/WhisperAX/Views/ContentView.swift index 664703b..48c4d57 100644 --- a/Examples/WhisperAX/WhisperAX/Views/ContentView.swift +++ b/Examples/WhisperAX/WhisperAX/Views/ContentView.swift @@ -12,16 +12,17 @@ import AVFoundation import CoreML struct ContentView: View { - @State var whisperKit: WhisperKit? = nil + @State private var whisperKit: WhisperKit? #if os(macOS) - @State var audioDevices: [AudioDevice]? = nil + @State private var audioDevices: [AudioDevice]? 
#endif - @State var isRecording: Bool = false - @State var isTranscribing: Bool = false - @State var currentText: String = "" - @State var currentChunks: [Int: (chunkText: [String], fallbacks: Int)] = [:] + @State private var isRecording: Bool = false + @State private var isTranscribing: Bool = false + @State private var currentText: String = "" + @State private var currentChunks: [Int: (chunkText: [String], fallbacks: Int)] = [:] // TODO: Make this configurable in the UI - @State var modelStorage: String = "huggingface/models/argmaxinc/whisperkit-coreml" + @State private var modelStorage: String = "huggingface/models/argmaxinc/whisperkit-coreml" + @State private var appStartTime = Date() // MARK: Model management @@ -49,9 +50,10 @@ struct ContentView: View { @AppStorage("compressionCheckWindow") private var compressionCheckWindow: Double = 60 @AppStorage("sampleLength") private var sampleLength: Double = 224 @AppStorage("silenceThreshold") private var silenceThreshold: Double = 0.3 + @AppStorage("realtimeDelayInterval") private var realtimeDelayInterval: Double = 1 @AppStorage("useVAD") private var useVAD: Bool = true @AppStorage("tokenConfirmationsNeeded") private var tokenConfirmationsNeeded: Double = 2 - @AppStorage("concurrentWorkerCount") private var concurrentWorkerCount: Int = 4 + @AppStorage("concurrentWorkerCount") private var concurrentWorkerCount: Double = 4 @AppStorage("chunkingStrategy") private var chunkingStrategy: ChunkingStrategy = .vad @AppStorage("encoderComputeUnits") private var encoderComputeUnits: MLComputeUnits = .cpuAndNeuralEngine @AppStorage("decoderComputeUnits") private var decoderComputeUnits: MLComputeUnits = .cpuAndNeuralEngine @@ -61,6 +63,7 @@ struct ContentView: View { @State private var loadingProgressValue: Float = 0.0 @State private var specializationProgressRatio: Float = 0.7 @State private var isFilePickerPresented = false + @State private var modelLoadingTime: TimeInterval = 0 @State private var firstTokenTime: TimeInterval 
= 0 @State private var pipelineStart: TimeInterval = 0 @State private var effectiveRealTimeFactor: TimeInterval = 0 @@ -96,9 +99,9 @@ struct ContentView: View { @State private var columnVisibility: NavigationSplitViewVisibility = .all @State private var showComputeUnits: Bool = true @State private var showAdvancedOptions: Bool = false - @State private var transcriptionTask: Task? = nil + @State private var transcriptionTask: Task? @State private var selectedCategoryId: MenuItem.ID? - @State private var transcribeTask: Task? = nil + @State private var transcribeTask: Task? struct MenuItem: Identifiable, Hashable { var id = UUID() @@ -112,7 +115,7 @@ struct ContentView: View { ] private var isStreamMode: Bool { - self.selectedCategoryId == menu.first(where: { $0.name == "Stream" })?.id + selectedCategoryId == menu.first(where: { $0.name == "Stream" })?.id } func getComputeOptions() -> ModelComputeOptions { @@ -180,6 +183,25 @@ struct ContentView: View { } .disabled(modelState != .loaded) .foregroundColor(modelState != .loaded ? .secondary : .primary) + + Spacer() + + // New section for app and device info + VStack(alignment: .leading, spacing: 4) { + let version = Bundle.main.infoDictionary?["CFBundleShortVersionString"] as? String ?? "Unknown" + let build = Bundle.main.infoDictionary?["CFBundleVersion"] as? String ?? 
"Unknown" + Text("App Version: \(version) (\(build))") + #if os(iOS) + Text("Device Model: \(WhisperKit.deviceName())") + Text("OS Version: \(UIDevice.current.systemVersion)") + #elseif os(macOS) + Text("Device Model: \(WhisperKit.deviceName())") + Text("OS Version: \(ProcessInfo.processInfo.operatingSystemVersionString)") + #endif + } + .font(.system(.caption, design: .monospaced)) + .foregroundColor(.secondary) + .padding(.vertical) } .navigationTitle("WhisperAX") .navigationSplitViewColumnWidth(min: 300, ideal: 350) @@ -230,6 +252,10 @@ struct ContentView: View { .onAppear { #if os(macOS) selectedCategoryId = menu.first(where: { $0.name == selectedTab })?.id + #else + if UIDevice.current.userInterfaceIdiom == .pad { + selectedCategoryId = menu.first(where: { $0.name == selectedTab })?.id + } #endif fetchModels() } @@ -349,7 +375,7 @@ struct ContentView: View { Spacer() - if availableModels.count > 0 { + if !availableModels.isEmpty { Picker("", selection: $selectedModel) { ForEach(availableModels, id: \.self) { model in HStack { @@ -375,7 +401,7 @@ struct ContentView: View { }) .help("Delete model") .buttonStyle(BorderlessButtonStyle()) - .disabled(localModels.count == 0) + .disabled(localModels.isEmpty) .disabled(!localModels.contains(selectedModel)) #if os(macOS) @@ -493,7 +519,7 @@ struct ContentView: View { Group { #if os(macOS) HStack { - if let audioDevices = audioDevices, audioDevices.count > 0 { + if let audioDevices = audioDevices, !audioDevices.isEmpty { Picker("", selection: $selectedAudioInput) { ForEach(audioDevices, id: \.self) { device in Text(device.name).tag(device.name) @@ -806,21 +832,29 @@ struct ContentView: View { } .padding(.horizontal) - HStack { - Text("Chunking Strategy") - InfoButton("Select the strategy to use for chunking audio data. 
If VAD is selected, the audio will be chunked based on voice activity (split on silent portions).") - Spacer() - Picker("", selection: $chunkingStrategy) { - Text("None").tag(ChunkingStrategy.none) - Text("VAD").tag(ChunkingStrategy.vad) + VStack { + HStack { + Text("Chunking Strategy") + InfoButton("Select the strategy to use for chunking audio data. If VAD is selected, the audio will be chunked based on voice activity (split on silent portions).") + Spacer() + Picker("", selection: $chunkingStrategy) { + Text("None").tag(ChunkingStrategy.none) + Text("VAD").tag(ChunkingStrategy.vad) + } + .pickerStyle(SegmentedPickerStyle()) + } + HStack { + Text("Workers:") + Slider(value: $concurrentWorkerCount, in: 0...32, step: 1) + Text(concurrentWorkerCount.formatted(.number)) + InfoButton("How many workers to run transcription concurrently. Higher values increase memory usage but saturate the selected compute unit more, resulting in faster transcriptions. A value of 0 will use unlimited workers.") } - .pickerStyle(SegmentedPickerStyle()) } .padding(.horizontal) .padding(.bottom) VStack { - Text("Starting Temperature:") + Text("Starting Temperature") HStack { Slider(value: $temperatureStart, in: 0...1, step: 0.1) Text(temperatureStart.formatted(.number)) @@ -830,7 +864,7 @@ struct ContentView: View { .padding(.horizontal) VStack { - Text("Max Fallback Count:") + Text("Max Fallback Count") HStack { Slider(value: $fallbackCount, in: 0...5, step: 1) Text(fallbackCount.formatted(.number)) @@ -873,6 +907,17 @@ struct ContentView: View { } .padding(.horizontal) + VStack { + Text("Realtime Delay Interval") + HStack { + Slider(value: $realtimeDelayInterval, in: 0...30, step: 1) + Text(realtimeDelayInterval.formatted(.number)) + .frame(width: 30) + InfoButton("Controls how long to wait for audio buffer to fill before running successive loops in streaming mode.\nHigher values will reduce the number of loops run per second, saving battery at the cost of higher latency.") + } + } + 
.padding(.horizontal) + Section(header: Text("Experimental")) { HStack { Text("Eager Streaming Mode") @@ -918,7 +963,7 @@ struct ContentView: View { var body: some View { Button(action: { - self.showInfo = true + showInfo = true }) { Image(systemName: "info.circle") .foregroundColor(.blue) @@ -1163,7 +1208,7 @@ struct ContentView: View { func transcribeFile(path: String) { resetState() whisperKit?.audioProcessor = AudioProcessor() - self.transcribeTask = Task { + transcribeTask = Task { isTranscribing = true do { try await transcribeCurrentFile(path: path) @@ -1195,8 +1240,8 @@ struct ContentView: View { var deviceId: DeviceID? #if os(macOS) - if self.selectedAudioInput != "No Audio Input", - let devices = self.audioDevices, + if selectedAudioInput != "No Audio Input", + let devices = audioDevices, let device = devices.first(where: { $0.name == selectedAudioInput }) { deviceId = device.id @@ -1234,7 +1279,7 @@ struct ContentView: View { // If not looping, transcribe the full buffer if !loop { - self.transcribeTask = Task { + transcribeTask = Task { isTranscribing = true do { try await transcribeCurrentBuffer() @@ -1258,7 +1303,7 @@ struct ContentView: View { hypothesisText = "" } - if unconfirmedSegments.count > 0 { + if !unconfirmedSegments.isEmpty { confirmedSegments.append(contentsOf: unconfirmedSegments) unconfirmedSegments = [] } @@ -1274,12 +1319,11 @@ struct ContentView: View { let loadingStart = Date() let audioFileSamples = try await Task { try autoreleasepool { - return try AudioProcessor.loadAudioAsFloatArray(fromPath: path) + try AudioProcessor.loadAudioAsFloatArray(fromPath: path) } }.value Logging.debug("Loaded audio file in \(Date().timeIntervalSince(loadingStart)) seconds") - let transcription = try await transcribeAudioSamples(audioFileSamples) await MainActor.run { @@ -1288,15 +1332,16 @@ struct ContentView: View { return } - self.tokensPerSecond = transcription?.timings.tokensPerSecond ?? 
0 - self.effectiveRealTimeFactor = transcription?.timings.realTimeFactor ?? 0 - self.effectiveSpeedFactor = transcription?.timings.speedFactor ?? 0 - self.currentEncodingLoops = Int(transcription?.timings.totalEncodingRuns ?? 0) - self.firstTokenTime = transcription?.timings.firstTokenTime ?? 0 - self.pipelineStart = transcription?.timings.pipelineStart ?? 0 - self.currentLag = transcription?.timings.decodingLoop ?? 0 + tokensPerSecond = transcription?.timings.tokensPerSecond ?? 0 + effectiveRealTimeFactor = transcription?.timings.realTimeFactor ?? 0 + effectiveSpeedFactor = transcription?.timings.speedFactor ?? 0 + currentEncodingLoops = Int(transcription?.timings.totalEncodingRuns ?? 0) + firstTokenTime = transcription?.timings.firstTokenTime ?? 0 + modelLoadingTime = transcription?.timings.modelLoading ?? 0 + pipelineStart = transcription?.timings.pipelineStart ?? 0 + currentLag = transcription?.timings.decodingLoop ?? 0 - self.confirmedSegments = segments + confirmedSegments = segments } } @@ -1320,7 +1365,7 @@ struct ContentView: View { withoutTimestamps: !enableTimestamps, wordTimestamps: true, clipTimestamps: seekClip, - concurrentWorkerCount: concurrentWorkerCount, + concurrentWorkerCount: Int(concurrentWorkerCount), chunkingStrategy: chunkingStrategy ) @@ -1332,7 +1377,7 @@ struct ContentView: View { // First check if this is a new window for the same chunk, append if so var updatedChunk = (chunkText: [progress.text], fallbacks: fallbacks) - if var currentChunk = self.currentChunks[chunkId], let previousChunkText = currentChunk.chunkText.last { + if var currentChunk = currentChunks[chunkId], let previousChunkText = currentChunk.chunkText.last { if progress.text.count >= previousChunkText.count { // This is the same window of an existing chunk, so we just update the last value currentChunk.chunkText[currentChunk.chunkText.endIndex - 1] = progress.text @@ -1352,12 +1397,12 @@ struct ContentView: View { } // Set the new text for the chunk - 
self.currentChunks[chunkId] = updatedChunk - let joinedChunks = self.currentChunks.sorted { $0.key < $1.key }.flatMap { $0.value.chunkText }.joined(separator: "\n") + currentChunks[chunkId] = updatedChunk + let joinedChunks = currentChunks.sorted { $0.key < $1.key }.flatMap { $0.value.chunkText }.joined(separator: "\n") - self.currentText = joinedChunks - self.currentFallbacks = fallbacks - self.currentDecodingLoops += 1 + currentText = joinedChunks + currentFallbacks = fallbacks + currentDecodingLoops += 1 } // Check early stopping @@ -1395,7 +1440,7 @@ struct ContentView: View { transcriptionTask = Task { while isRecording && isTranscribing { do { - try await transcribeCurrentBuffer() + try await transcribeCurrentBuffer(delayInterval: Float(realtimeDelayInterval)) } catch { print("Error: \(error.localizedDescription)") break @@ -1409,7 +1454,7 @@ struct ContentView: View { transcriptionTask?.cancel() } - func transcribeCurrentBuffer() async throws { + func transcribeCurrentBuffer(delayInterval: Float = 1.0) async throws { guard let whisperKit = whisperKit else { return } // Retrieve the current audio buffer from the audio processor @@ -1419,8 +1464,8 @@ struct ContentView: View { let nextBufferSize = currentBuffer.count - lastBufferSize let nextBufferSeconds = Float(nextBufferSize) / Float(WhisperKit.sampleRate) - // Only run the transcribe if the next buffer has at least 1 second of audio - guard nextBufferSeconds > 1 else { + // Only run the transcribe if the next buffer has at least `delayInterval` seconds of audio + guard nextBufferSeconds > delayInterval else { await MainActor.run { if currentText == "" { currentText = "Waiting for speech..." @@ -1469,16 +1514,17 @@ struct ContentView: View { let transcription = try await transcribeEagerMode(Array(currentBuffer)) await MainActor.run { currentText = "" - self.tokensPerSecond = transcription?.timings.tokensPerSecond ?? 0 - self.firstTokenTime = transcription?.timings.firstTokenTime ?? 
0 - self.pipelineStart = transcription?.timings.pipelineStart ?? 0 - self.currentLag = transcription?.timings.decodingLoop ?? 0 - self.currentEncodingLoops = Int(transcription?.timings.totalEncodingRuns ?? 0) + tokensPerSecond = transcription?.timings.tokensPerSecond ?? 0 + firstTokenTime = transcription?.timings.firstTokenTime ?? 0 + modelLoadingTime = transcription?.timings.modelLoading ?? 0 + pipelineStart = transcription?.timings.pipelineStart ?? 0 + currentLag = transcription?.timings.decodingLoop ?? 0 + currentEncodingLoops = Int(transcription?.timings.totalEncodingRuns ?? 0) let totalAudio = Double(currentBuffer.count) / Double(WhisperKit.sampleRate) - self.totalInferenceTime = transcription?.timings.fullPipeline ?? 0 - self.effectiveRealTimeFactor = Double(totalInferenceTime) / totalAudio - self.effectiveSpeedFactor = totalAudio / Double(totalInferenceTime) + totalInferenceTime = transcription?.timings.fullPipeline ?? 0 + effectiveRealTimeFactor = Double(totalInferenceTime) / totalAudio + effectiveSpeedFactor = totalAudio / Double(totalInferenceTime) } } else { // Run realtime transcribe using timestamp tokens directly @@ -1491,16 +1537,17 @@ struct ContentView: View { return } - self.tokensPerSecond = transcription?.timings.tokensPerSecond ?? 0 - self.firstTokenTime = transcription?.timings.firstTokenTime ?? 0 - self.pipelineStart = transcription?.timings.pipelineStart ?? 0 - self.currentLag = transcription?.timings.decodingLoop ?? 0 - self.currentEncodingLoops += Int(transcription?.timings.totalEncodingRuns ?? 0) + tokensPerSecond = transcription?.timings.tokensPerSecond ?? 0 + firstTokenTime = transcription?.timings.firstTokenTime ?? 0 + modelLoadingTime = transcription?.timings.modelLoading ?? 0 + pipelineStart = transcription?.timings.pipelineStart ?? 0 + currentLag = transcription?.timings.decodingLoop ?? 0 + currentEncodingLoops += Int(transcription?.timings.totalEncodingRuns ?? 
0) let totalAudio = Double(currentBuffer.count) / Double(WhisperKit.sampleRate) - self.totalInferenceTime += transcription?.timings.fullPipeline ?? 0 - self.effectiveRealTimeFactor = Double(totalInferenceTime) / totalAudio - self.effectiveSpeedFactor = totalAudio / Double(totalInferenceTime) + totalInferenceTime += transcription?.timings.fullPipeline ?? 0 + effectiveRealTimeFactor = Double(totalInferenceTime) / totalAudio + effectiveSpeedFactor = totalAudio / Double(totalInferenceTime) // Logic for moving segments to confirmedSegments if segments.count > requiredSegmentsForConfirmation { @@ -1518,17 +1565,17 @@ struct ContentView: View { // Add confirmed segments to the confirmedSegments array for segment in confirmedSegmentsArray { - if !self.confirmedSegments.contains(segment: segment) { - self.confirmedSegments.append(segment) + if !confirmedSegments.contains(segment: segment) { + confirmedSegments.append(segment) } } } // Update transcriptions to reflect the remaining segments - self.unconfirmedSegments = remainingSegments + unconfirmedSegments = remainingSegments } else { // Handle the case where segments are fewer or equal to required - self.unconfirmedSegments = segments + unconfirmedSegments = segments } } } @@ -1559,7 +1606,8 @@ struct ContentView: View { skipSpecialTokens: !enableSpecialCharacters, withoutTimestamps: !enableTimestamps, wordTimestamps: true, // required for eager mode - firstTokenLogProbThreshold: -1.5 // higher threshold to prevent fallbacks from running to often + firstTokenLogProbThreshold: -1.5, // higher threshold to prevent fallbacks from running too often + chunkingStrategy: ChunkingStrategy.none ) // Early stopping checks @@ -1567,15 +1615,15 @@ struct ContentView: View { DispatchQueue.main.async { let fallbacks = Int(progress.timings.totalDecodingFallbacks) if progress.text.count < currentText.count { - if fallbacks == self.currentFallbacks { + if fallbacks == currentFallbacks { // self.unconfirmedText.append(currentText) } else {
print("Fallback occured: \(fallbacks)") } } - self.currentText = progress.text - self.currentFallbacks = fallbacks - self.currentDecodingLoops += 1 + currentText = progress.text + currentFallbacks = fallbacks + currentDecodingLoops += 1 } // Check early stopping let currentTokens = progress.tokens diff --git a/Examples/WhisperAX/WhisperAXTests/WhisperKitTests b/Examples/WhisperAX/WhisperAXTests/WhisperKitTests new file mode 120000 index 0000000..63d1379 --- /dev/null +++ b/Examples/WhisperAX/WhisperAXTests/WhisperKitTests @@ -0,0 +1 @@ +../../../Tests/WhisperKitTests \ No newline at end of file diff --git a/Makefile b/Makefile index 8b5af9e..d19a736 100644 --- a/Makefile +++ b/Makefile @@ -1,4 +1,4 @@ -.PHONY: setup setup-huggingface-cli setup-model-repo download-models download-model build build-cli test clean-package-caches +.PHONY: setup setup-huggingface-cli setup-model-repo download-models download-model build build-cli test clean-package-caches list-devices benchmark-connected-devices benchmark-device benchmark-devices extract-xcresult PIP_COMMAND := pip3 PYTHON_COMMAND := python3 @@ -8,13 +8,14 @@ MODEL_REPO := argmaxinc/whisperkit-coreml MODEL_REPO_DIR := ./Models/whisperkit-coreml BASE_COMPILED_DIR := ./Models +GIT_HASH := $(shell git rev-parse --short HEAD) setup: @echo "Setting up environment..." @which $(PIP_COMMAND) @which $(PYTHON_COMMAND) @echo "Checking for Homebrew..." - @which brew > /dev/null || (echo "Error: Homebrew is not installed. Install it form here https://brew.sh and try again" && exit 1) + @which brew > /dev/null || (echo "Error: Homebrew is not installed. Install it from https://brew.sh and try again" && exit 1) @echo "Homebrew is installed." @echo "Checking for huggingface-cli..." @which huggingface-cli > /dev/null || (echo "Installing huggingface-cli..." && brew install huggingface-cli) @@ -25,9 +26,28 @@ setup: @echo "Checking for trash..." @which trash > /dev/null || (echo "Installing trash..." 
&& brew install trash) @echo "trash is installed." + @echo "Checking for fastlane" + @which fastlane > /dev/null || (echo "Installing fastlane..." && brew install fastlane) + @echo "fastlane is installed." + @$(MAKE) generate-whisperax-xcconfig @echo "Done 🚀" +generate-whisperax-xcconfig: + @echo "Updating DEVELOPMENT_TEAM in Examples/WhisperAX/Debug.xcconfig..." + @TEAM_ID=$$(defaults read com.apple.dt.Xcode IDEProvisioningTeamManagerLastSelectedTeamID 2>/dev/null); \ + if [ -z "$$TEAM_ID" ]; then \ + echo "Error: No Development Team ID found. Please log into Xcode with your Apple ID and select a team."; \ + else \ + if grep -q '^DEVELOPMENT_TEAM' Examples/WhisperAX/Debug.xcconfig; then \ + sed -i '' "s/^\(DEVELOPMENT_TEAM *= *\).*/\1$$TEAM_ID/" Examples/WhisperAX/Debug.xcconfig; \ + else \ + echo "DEVELOPMENT_TEAM=$$TEAM_ID" >> Examples/WhisperAX/Debug.xcconfig; \ + fi; \ + echo "DEVELOPMENT_TEAM has been updated in Examples/WhisperAX/Debug.xcconfig with your Development Team ID: $$TEAM_ID"; \ + fi + + setup-huggingface-cli: @if huggingface-cli whoami; then \ echo "Already logged in to Hugging Face."; \ @@ -56,12 +76,14 @@ setup-model-repo: git clone https://huggingface.co/$(MODEL_REPO) $(MODEL_REPO_DIR); \ fi + # Download all models download-models: setup-model-repo @echo "Downloading all models..." @cd $(MODEL_REPO_DIR) && \ git lfs pull + # Download a specific model download-model: @if [ -z "$(MODEL)" ]; then \ @@ -88,6 +110,31 @@ test: @echo "Running tests..." 
@swift test -v + +list-devices: + fastlane ios list_devices + + +# Usage: +# make benchmark-devices # Benchmark all connected devices +# make benchmark-devices DEBUG=true # Benchmark all connected devices with small test matrix +# make benchmark-devices DEVICES="iPhone 15 Pro Max,My Mac" # Benchmark specific device names from `make list-devices` +DEVICES ?= +DEBUG ?= false +benchmark-devices: + @if [ -n "$(DEVICES)" ]; then \ + echo "Benchmarking specific devices: $(DEVICES)"; \ + fastlane benchmark devices:"$(DEVICES)" debug:$(DEBUG); \ + else \ + echo "Benchmarking all connected devices"; \ + fastlane benchmark debug:$(DEBUG); \ + fi + +upload-benchmark-results: + @echo "Uploading benchmark results..." + @fastlane upload_results + clean-package-caches: - @trash ~/Library/Caches/org.swift.swiftpm/repositories - @trash ~/Library/Developer/Xcode/DerivedData + @trash ~/Library/Developer/Xcode/DerivedData/WhisperKit* + @swift package purge-cache + @swift package reset \ No newline at end of file diff --git a/Sources/WhisperKit/Core/Configurations.swift b/Sources/WhisperKit/Core/Configurations.swift index 7ff1a9e..1547b1f 100644 --- a/Sources/WhisperKit/Core/Configurations.swift +++ b/Sources/WhisperKit/Core/Configurations.swift @@ -28,6 +28,7 @@ open class WhisperKitConfig { public var textDecoder: (any TextDecoding)? public var logitsFilters: [any LogitsFiltering]? public var segmentSeeker: (any SegmentSeeking)? + public var voiceActivityDetector: VoiceActivityDetector? /// Enable extra verbosity for logging public var verbose: Bool @@ -55,6 +56,7 @@ open class WhisperKitConfig { textDecoder: (any TextDecoding)? = nil, logitsFilters: [any LogitsFiltering]? = nil, segmentSeeker: (any SegmentSeeking)? = nil, + voiceActivityDetector: VoiceActivityDetector? = nil, verbose: Bool = true, logLevel: Logging.LogLevel = .info, prewarm: Bool? 
= nil, @@ -74,6 +76,7 @@ open class WhisperKitConfig { self.textDecoder = textDecoder self.logitsFilters = logitsFilters self.segmentSeeker = segmentSeeker + self.voiceActivityDetector = voiceActivityDetector self.verbose = verbose self.logLevel = logLevel self.prewarm = prewarm @@ -115,7 +118,7 @@ open class WhisperKitConfig { /// - noSpeechThreshold: If the no speech probability is higher than this value AND the average log /// probability over sampled tokens is below `logProbThreshold`, consider the segment as silent. @available(macOS 13, iOS 16, watchOS 10, visionOS 1, *) -public struct DecodingOptions { +public struct DecodingOptions: Codable { public var verbose: Bool public var task: DecodingTask public var language: String? @@ -142,7 +145,6 @@ public struct DecodingOptions { public var noSpeechThreshold: Float? public var concurrentWorkerCount: Int public var chunkingStrategy: ChunkingStrategy? - public var voiceActivityDetector: VoiceActivityDetector? public init( verbose: Bool = false, @@ -170,8 +172,7 @@ public struct DecodingOptions { firstTokenLogProbThreshold: Float? = -1.5, noSpeechThreshold: Float? = 0.6, concurrentWorkerCount: Int = 16, - chunkingStrategy: ChunkingStrategy? = nil, - voiceActivityDetector: VoiceActivityDetector? = nil + chunkingStrategy: ChunkingStrategy? 
= nil ) { self.verbose = verbose self.task = task @@ -199,6 +200,5 @@ public struct DecodingOptions { self.noSpeechThreshold = noSpeechThreshold self.concurrentWorkerCount = concurrentWorkerCount self.chunkingStrategy = chunkingStrategy - self.voiceActivityDetector = voiceActivityDetector } } diff --git a/Sources/WhisperKit/Core/Models.swift b/Sources/WhisperKit/Core/Models.swift index 0417adf..5ca6995 100644 --- a/Sources/WhisperKit/Core/Models.swift +++ b/Sources/WhisperKit/Core/Models.swift @@ -282,7 +282,7 @@ public struct AudioChunk { // MARK: - Decoding -public enum DecodingTask: CustomStringConvertible, CaseIterable { +public enum DecodingTask: Codable, CustomStringConvertible, CaseIterable { case transcribe case translate @@ -355,7 +355,7 @@ public struct DecodingCache { } } -public enum ChunkingStrategy: String, CaseIterable { +public enum ChunkingStrategy: String, Codable, CaseIterable { case none case vad } @@ -649,6 +649,8 @@ public struct TranscriptionTimings: Codable { public var prewarmLoadTime: TimeInterval public var encoderLoadTime: TimeInterval public var decoderLoadTime: TimeInterval + public var encoderSpecializationTime: TimeInterval + public var decoderSpecializationTime: TimeInterval public var tokenizerLoadTime: TimeInterval public var audioLoading: TimeInterval public var audioProcessing: TimeInterval @@ -693,6 +695,8 @@ public struct TranscriptionTimings: Codable { prewarmLoadTime: TimeInterval = 0, encoderLoadTime: TimeInterval = 0, decoderLoadTime: TimeInterval = 0, + encoderSpecializationTime: TimeInterval = 0, + decoderSpecializationTime: TimeInterval = 0, tokenizerLoadTime: TimeInterval = 0, audioLoading: TimeInterval = 0, audioProcessing: TimeInterval = 0, @@ -726,6 +730,8 @@ public struct TranscriptionTimings: Codable { self.prewarmLoadTime = prewarmLoadTime self.encoderLoadTime = encoderLoadTime self.decoderLoadTime = decoderLoadTime + self.encoderSpecializationTime = encoderSpecializationTime + self.decoderSpecializationTime = 
decoderSpecializationTime self.tokenizerLoadTime = tokenizerLoadTime self.audioLoading = audioLoading self.audioProcessing = audioProcessing diff --git a/Sources/WhisperKit/Core/WhisperKit.swift b/Sources/WhisperKit/Core/WhisperKit.swift index b37ba5d..88a665f 100644 --- a/Sources/WhisperKit/Core/WhisperKit.swift +++ b/Sources/WhisperKit/Core/WhisperKit.swift @@ -24,6 +24,7 @@ open class WhisperKit { public var textDecoder: any TextDecoding public var logitsFilters: [any LogitsFiltering] public var segmentSeeker: any SegmentSeeking + public var voiceActivityDetector: VoiceActivityDetector? /// Shapes public static let sampleRate: Int = 16000 @@ -49,6 +50,7 @@ open class WhisperKit { textDecoder = config.textDecoder ?? TextDecoder() logitsFilters = config.logitsFilters ?? [] segmentSeeker = config.segmentSeeker ?? SegmentSeeker() + voiceActivityDetector = config.voiceActivityDetector tokenizerFolder = config.tokenizerFolder useBackgroundDownloadSession = config.useBackgroundDownloadSession currentTimings = TranscriptionTimings() @@ -357,8 +359,13 @@ open class WhisperKit { computeUnits: modelCompute.textDecoderCompute, prewarmMode: prewarmMode ) - currentTimings.decoderLoadTime = CFAbsoluteTimeGetCurrent() - decoderLoadStart + if prewarmMode { + currentTimings.decoderSpecializationTime = CFAbsoluteTimeGetCurrent() - decoderLoadStart + } else { + currentTimings.decoderLoadTime = CFAbsoluteTimeGetCurrent() - decoderLoadStart + } + Logging.debug("Loaded text decoder in \(String(format: "%.2f", currentTimings.decoderLoadTime))s") } @@ -371,8 +378,13 @@ open class WhisperKit { computeUnits: modelCompute.audioEncoderCompute, prewarmMode: prewarmMode ) - currentTimings.encoderLoadTime = CFAbsoluteTimeGetCurrent() - encoderLoadStart - + + if prewarmMode { + currentTimings.encoderSpecializationTime = CFAbsoluteTimeGetCurrent() - encoderLoadStart + } else { + currentTimings.encoderLoadTime = CFAbsoluteTimeGetCurrent() - encoderLoadStart + } + Logging.debug("Loaded audio 
encoder in \(String(format: "%.2f", currentTimings.encoderLoadTime))s") } @@ -776,7 +788,7 @@ open class WhisperKit { switch (isChunkable, decodeOptions?.chunkingStrategy) { case (true, .vad): // We have some audio that will require multiple windows and a strategy to chunk them - let vad = decodeOptions?.voiceActivityDetector ?? EnergyVAD() + let vad = voiceActivityDetector ?? EnergyVAD() let chunker = VADAudioChunker(vad: vad) let audioChunks: [AudioChunk] = try await chunker.chunkAll( audioArray: audioArray, diff --git a/Tests/WhisperKitTests/Evaluate/DistanceCalculation.swift b/Tests/WhisperKitTests/Evaluate/DistanceCalculation.swift new file mode 100644 index 0000000..94e2fdd --- /dev/null +++ b/Tests/WhisperKitTests/Evaluate/DistanceCalculation.swift @@ -0,0 +1,218 @@ +// For licensing see accompanying LICENSE.md file. +// Copyright © 2024 Argmax, Inc. All rights reserved. + +import Foundation + +/// Compute the last row of the edit distance dynamic programming matrix +/// between s1 and s2. +func computeLastRow(_ s1Chars: [Unicode.Scalar], _ s2Chars: [Unicode.Scalar]) -> [Int] { + var prevRow = Array(0...s2Chars.endIndex) + + for i in 1...s1Chars.endIndex { + var currentRow = [Int](repeating: 0, count: s2Chars.endIndex + 1) + currentRow[0] = i + + for j in 1...s2Chars.endIndex { + let cost = s1Chars[i - 1] == s2Chars[j - 1] ? 0 : 1 + currentRow[j] = min( + prevRow[j] + 1, // Deletion + currentRow[j - 1] + 1, // Insertion + prevRow[j - 1] + cost // Substitution + ) + } + prevRow = currentRow + } + + return prevRow +} + +func needlemanWunsch(_ xArray: [Unicode.Scalar], _ yArray: [Unicode.Scalar]) -> [EditOp] { + let m = xArray.count + let n = yArray.count + + var dp = [[Int]](repeating: [Int](repeating: 0, count: n + 1), count: m + 1) + for i in 1...m { + dp[i][0] = i + } + for j in 1...n { + dp[0][j] = j + } + + for i in 1...m { + for j in 1...n { + let cost = xArray[i - 1] == yArray[j - 1] ? 
0 : 1 + dp[i][j] = min( + dp[i - 1][j] + 1, // Deletion + dp[i][j - 1] + 1, // Insertion + dp[i - 1][j - 1] + cost // Substitution + ) + } + } + + var i = m + var j = n + var ops = [EditOp]() + + while i > 0, j > 0 { + if dp[i][j] == dp[i - 1][j - 1], xArray[i - 1] == yArray[j - 1] { + // Match operation is omitted + i -= 1 + j -= 1 + } else if dp[i][j] == dp[i - 1][j - 1] + 1 { + ops.append(EditOp.replace) // Substitution + i -= 1 + j -= 1 + } else if dp[i][j] == dp[i][j - 1] + 1 { + ops.append(EditOp.insert) // Insertion + j -= 1 + } else { + ops.append(EditOp.delete) // Deletion + i -= 1 + } + } + + while i > 0 { + ops.append(EditOp.delete) + i -= 1 + } + while j > 0 { + ops.append(EditOp.insert) + j -= 1 + } + + return ops.reversed() +} + +func hirschberg(_ reference: [Unicode.Scalar], _ s2: [Unicode.Scalar]) -> [EditOp] { + func hirschbergRec(_ x: [Unicode.Scalar], _ y: [Unicode.Scalar]) -> [EditOp] { + let m = x.endIndex + let n = y.endIndex + + if m == 0 { + let result = y.map { _ in EditOp.insert } + return result + } + if n == 0 { + let result = x.map { _ in EditOp.delete } + return result + } + if m == 1 || n == 1 { + let result = needlemanWunsch(x, y) + return result + } + + let i = m / 2 + let xPrefix = Array(x[x.startIndex.. 
[EditOp] { + let n = sourceText.count + let m = targetText.count + let maxD = n + m + let vSize = 2 * maxD + 1 + var v = [Int](repeating: 0, count: vSize) + var trace = [[Int]]() + + let offset = maxD + + for d in 0...maxD { + let vSnapshot = v + for k in stride(from: -d, through: d, by: 2) { + let kIndex = k + offset + var x: Int + if k == -d || (k != d && v[kIndex - 1] < v[kIndex + 1]) { + x = v[kIndex + 1] + } else { + x = v[kIndex - 1] + 1 + } + var y = x - k + while x < n, y < m, sourceText[x] == targetText[y] { + x += 1 + y += 1 + } + v[kIndex] = x + if x >= n, y >= m { + trace.append(vSnapshot) + return backtrack(trace: trace, sourceText: sourceText, targetText: targetText) + } + } + trace.append(vSnapshot) + } + return [] +} + +func backtrack(trace: [[Int]], sourceText: [Unicode.Scalar], targetText: [Unicode.Scalar]) -> [EditOp] { + var editOps = [EditOp]() + let n = sourceText.count + let m = targetText.count + let offset = trace[0].count / 2 + var x = n + var y = m + + for d in stride(from: trace.count - 1, through: 0, by: -1) { + let v = trace[d] + let k = x - y + let kIndex = k + offset + + var prevK: Int + if k == -d || (k != d && v[kIndex - 1] < v[kIndex + 1]) { + prevK = k + 1 + } else { + prevK = k - 1 + } + let prevX = v[prevK + offset] + let prevY = prevX - prevK + + while x > prevX, y > prevY { + // Match or Replace + if sourceText[x - 1] == targetText[y - 1] { + editOps.append(.blank) + } else { + editOps.append(.replace) + } + x -= 1 + y -= 1 + } + + if d > 0 { + if x == prevX { + // Insertion + editOps.append(.insert) + y -= 1 + } else { + // Deletion + editOps.append(.delete) + x -= 1 + } + } + } + + return editOps.reversed() +} diff --git a/Tests/WhisperKitTests/Evaluate/NormalizeEn.swift b/Tests/WhisperKitTests/Evaluate/NormalizeEn.swift new file mode 100644 index 0000000..6ac2f74 --- /dev/null +++ b/Tests/WhisperKitTests/Evaluate/NormalizeEn.swift @@ -0,0 +1,892 @@ +// For licensing see accompanying LICENSE.md file. 
+// Copyright © 2024 Argmax, Inc. All rights reserved. + +import Foundation + +enum NormalizationUtils { + /// sentences = ["this is an example ", " hello goodbye ", " "] + /// ['this is an example ', " hello goodbye ", " "] + static func removeMultipleSpaces(sentences: [String]) -> [String] { + var replacedSentences = [String]() + for sentence in sentences { + // Define the pattern you want to replace + let pattern = "\\s\\s+" + + do { + let regex = try NSRegularExpression(pattern: pattern, options: []) + let replacedString = regex.stringByReplacingMatches( + in: sentence, + options: [], + range: NSRange(location: 0, length: sentence.utf16.count), + withTemplate: " " + ) + replacedSentences.append(replacedString) + } catch { + print("Error while creating regex: \(error)") + } + } + return replacedSentences + } + + /// [" this is an example ", " hello goodbye ", " "] + /// ['this is an example', "hello goodbye", ""] + static func strip(sentences: [String]) -> [String] { + var replacedSentences = [String]() + for sentence in sentences { + let replacedString = sentence.trimmingCharacters(in: .whitespaces) + replacedSentences.append(replacedString) + } + return replacedSentences + } + + /// ["hi", "this is an example"] + /// [['hi'], ['this', ' ', 'is', ' ', 'an', ' ', 'example']] + static func reduceToListOfListOfWordsWithSpaces(sentences: [String], wordDelimiter: String = " ") -> [[String]] { + func processString(_ sentence: String) -> [String] { + let processedString = sentence.map { String($0) }.reduce(into: [String]()) { result, char in + if char == wordDelimiter { + result.append(char) + } else if result.last == wordDelimiter || result.isEmpty { + result.append(char) + } else { + result[result.count - 1].append(char) + } + } + + return processedString + } + + return sentences.map { processString($0) } + } + + /// ["hi", "this is an example"] + /// [['hi'], ['this', 'is', 'an', 'example']] + static func reduceToListOfListOfWords(sentences: [String],
word_delimiter: String = " ") -> [[String]] { + func processString(sentence: String) -> [[String]] { + return [sentence.components(separatedBy: word_delimiter).filter { !$0.isEmpty }] + } + + func processList(sentences: [String]) -> [[String]] { + var sentenceCollection = [[String]]() + for sentence in sentences { + let list_of_words = processString(sentence: sentence)[0] + if !list_of_words.isEmpty { + sentenceCollection.append(list_of_words) + } + } + return sentenceCollection + } + return processList(sentences: sentences) + } +} + +class EnglishNumberNormalizer { + /// Convert any spelled-out numbers into arabic numbers, while handling: + /// + /// - remove any commas + /// - keep the suffixes such as: `1960s`, `274th`, `32nd`, etc. + /// - spell out currency symbols after the number. e.g. `$20 million` -> `20000000 dollars` + /// - spell out `one` and `ones` + /// - interpret successive single-digit numbers as nominal: `one oh one` -> `101` + let zeros: Set<String> + + let ones: [String: Int] + let onesPlural: [String: (Int, String)] + let onesOrdinal: [String: (Int, String)] + let onesSuffixed: [String: (Int, String)] + + let tens: [String: Int] + let tensPlural: [String: (Int, String)] + let tensOrdinal: [String: (Int, String)] + let tensSuffixed: [String: (Int, String)] + + let multipliers: [String: Int64] + let multipliersPlural: [String: (Int64, String)] + let multipliersOrdinal: [String: (Int64, String)] + let multipliersSuffixed: [String: (Int64, String)] + + let decimals: Set<String> + let precedingPrefixers: [String: String] + let followingPrefixers: [String: String] + + let prefixes: Set<String> + let suffixers: [String: Any] + let specials: Set<String> + let words: Set<String> + let literalWords: Set<String> + + init() { + let zeros: Set<String> = ["o", "oh", "zero"] + + let ones = Dictionary(uniqueKeysWithValues: [ + "one", "two", "three", "four", "five", "six", "seven", "eight", "nine", "ten", + "eleven", "twelve", "thirteen", "fourteen", "fifteen", "sixteen", "seventeen", + "eighteen", "nineteen",
].enumerated().map { ($0.element, $0.offset + 1) }) + let onesPlural = Dictionary(uniqueKeysWithValues: + ones.map { name, value in + (name == "six" ? "sixes" : name + "s", (value, "s")) + } + ) + let onesOrdinal = { + var onesDictionary: [String: (Int, String)] = [ + "zeroth": (0, "th"), + "first": (1, "st"), + "second": (2, "nd"), + "third": (3, "rd"), + "fifth": (5, "th"), + "twelfth": (12, "th"), + ] + + let updatedOnes = ones.filter { name, value in + value > 3 && value != 5 && value != 12 + }.map { name, value in + (name + (name.hasSuffix("t") ? "h" : "th"), (value, "th")) + } + + for (key, value) in updatedOnes { + onesDictionary[key] = value + } + + return onesDictionary + }() + let onesSuffixed = onesPlural.merging(onesOrdinal) { $1 } + + let tens = [ + "twenty": 20, + "thirty": 30, + "forty": 40, + "fifty": 50, + "sixty": 60, + "seventy": 70, + "eighty": 80, + "ninety": 90, + ] + let tensPlural = Dictionary(uniqueKeysWithValues: tens.map { name, value in + (name.replacingOccurrences(of: "y", with: "ies"), (value, "s")) + }) + let tensOrdinal = Dictionary(uniqueKeysWithValues: tens.map { name, value in + (name.replacingOccurrences(of: "y", with: "ieth"), (value, "th")) + }) + let tensSuffixed = tensPlural.merging(tensOrdinal) { $1 } + + let multipliers: [String: Int64] = [ + "hundred": 100, + "thousand": 1000, + "million": 1_000_000, + "billion": 1_000_000_000, + "trillion": 1_000_000_000_000, + "quadrillion": 1_000_000_000_000_000, + "quintillion": 1_000_000_000_000_000_000, + ] + let multipliersPlural = Dictionary(uniqueKeysWithValues: multipliers.map { name, value in + (name + "s", (value, "s")) + }) + let multipliersOrdinal = Dictionary(uniqueKeysWithValues: multipliers.map { name, value in + (name + "th", (value, "th")) + }) + let multipliersSuffixed = multipliersPlural.merging(multipliersOrdinal) { $1 } + + let decimals = Set(ones.keys).union(tens.keys).union(zeros) + let precedingPrefixers: [String: String] = [ + "minus": "-", + "negative": "-", + 
"plus": "+", + "positive": "+", + ] + let followingPrefixers: [String: String] = [ + "pound": "£", + "pounds": "£", + "euro": "€", + "euros": "€", + "dollar": "$", + "dollars": "$", + "cent": "¢", + "cents": "¢", + ] + + let prefixes = Set(precedingPrefixers.values) + .union(followingPrefixers.values) + let suffixers: [String: Any] = [ + "per": ["cent": "%"], + "percent": "%", + ] + let specials: Set = ["and", "double", "triple", "point"] + let words = zeros.union(ones.keys) + .union(onesSuffixed.keys) + .union(tens.keys) + .union(tensSuffixed.keys) + .union(multipliers.keys) + .union(multipliersSuffixed.keys) + .union(precedingPrefixers.keys) + .union(followingPrefixers.keys) + .union(suffixers.keys) + .union(specials) + let literalWords: Set = ["one", "ones"] + + self.zeros = zeros + + self.ones = ones + self.onesPlural = onesPlural + self.onesOrdinal = onesOrdinal + self.onesSuffixed = onesSuffixed + + self.tens = tens + self.tensPlural = tensPlural + self.tensOrdinal = tensOrdinal + self.tensSuffixed = tensSuffixed + + self.multipliers = multipliers + self.multipliersPlural = multipliersPlural + self.multipliersOrdinal = multipliersOrdinal + self.multipliersSuffixed = multipliersSuffixed + + self.decimals = decimals + self.precedingPrefixers = precedingPrefixers + self.followingPrefixers = followingPrefixers + + self.prefixes = prefixes + self.suffixers = suffixers + self.specials = specials + self.words = words + self.literalWords = literalWords + } + + func processWords(_ words: [String]) -> [String] { + var prefix: String? // Stores currency/sign prefixes (e.g., "$", "+", "-") + var value: String? 
// Accumulates the current number being processed + var skip = false // Controls skipping next word in special cases + var results: [String] = [] + + func output(_ result: String) -> String { + var result = result + if let prefix = prefix { + // Handles currency symbols and signs + // prefix="$", result="100" -> "$100" + // prefix="-", result="123" -> "-123" + result = prefix + result + } + value = nil + prefix = nil + return result + } + + for idx in 0.. value="1." -> "1.5" + v = v + current + value = v + continue + } else if let v = value { + results.append(output(v)) + } + prefix = hasPrefix ? String(current.first!) : prefix + value = f.isInteger ? f.floored().toString() : currentWithoutPrefix + } else { + fatalError("Converting the fraction failed") + } + } else if !self.words.contains(current) { + // Non-numeric words that aren't in our dictionary + // "hello", "world", "apple" + if let v = value { + value = v + results.append(output(v)) + } + results.append(output(current)) + } else if self.zeros.contains(current) { + // Handles zero words: "zero", "oh", "o" + // Especially important for sequences like "one oh one" -> "101" + value = (value ?? "") + "0" + } else if let ones = self.ones[current] { + // Handles single digits (1-9) and teens (11-19) + // "one" -> "1", "fifteen" -> "15" + if value == nil { + value = String(ones) + } else if let v = value, let prev = prev, self.ones[prev] != nil { + if ones < 10 { + // Handles consecutive digits + // "twenty one" -> "21" + value = String(v.dropLast()) + String(ones) + } else { + // Handles teen numbers after another number + // "one fifteen" -> "115" + value = v + String(ones) + } + } else if ones < 10 { + if let v = value, let f = Decimal(string: v), f.remainder(dividingBy: 10) == 0 { + // Handles single digit after tens + // "twenty one" -> "21" + value = (f + Decimal(ones)).integerPart() + } else { + value = value! 
+ String(ones) + } + } else { + if let v = value, let f = Decimal(string: v), f.remainder(dividingBy: 100) == 0 { + // Handles teens after hundreds + // "one hundred fifteen" -> "115" + value = (f + Decimal(ones)).integerPart() + } else { + value = value! + String(ones) + } + } + } else if let (ones, suffix) = self.onesSuffixed[current] { + // Handles ordinal numbers and plurals + // "first" -> "1st", "second" -> "2nd", "thirds" -> "3rds" + if value == nil { + results.append(output("\(ones)\(suffix)")) + } else if let v = value, let prev = prev, self.ones[prev] != nil { + if ones < 10 { + // "twenty first" -> "21st" + results.append(output("\(v.dropLast())\(ones)\(suffix)")) + } else { + results.append(output("\(v)\(ones)\(suffix)")) + } + } else if ones < 10 { + if let v = value, let f = Decimal(string: v), f.remainder(dividingBy: 10) == 0 { + // "twenty first" -> "21st" + results.append(output("\((f + Decimal(ones)).integerPart())\(suffix)")) + } else { + results.append(output("\(value!)\(ones)\(suffix)")) + } + } else { + if let v = value, let f = Decimal(string: v), f.remainder(dividingBy: 100) == 0 { + results.append(output("\((f + Decimal(ones)).integerPart())\(suffix)")) + } else { + results.append(output("\(value!)\(ones)\(suffix)")) + } + } + value = nil + } else if let tens = self.tens[current] { + // Handles multiples of ten (20-90) + // "twenty" -> "20", "fifty" -> "50" + if value == nil { + value = String(tens) + } else if let v = value, !v.isEmpty { + value = v + String(tens) + } else { + if let v = value, let f = Decimal(string: v), f.remainder(dividingBy: 100) == 0 { + // Handles tens after hundreds + // "one hundred twenty" -> "120" + value = (f + Decimal(tens)).integerPart() + } else { + value = value! 
+ String(tens) + } + } + } else if let (tens, suffix) = self.tensSuffixed[current] { + // Handles ordinal and plural forms of tens + // "twentieth" -> "20th", "sixties" -> "60s" + if value == nil { + results.append(output("\(tens)\(suffix)")) + } else if let v = value, !v.isEmpty, let f = Decimal(string: v) { + if f.remainder(dividingBy: 100) == 0 { + results.append(output("\((f + Decimal(tens)).integerPart())\(suffix)")) + } else { + results.append(output("\(value!)\(tens)\(suffix)")) + } + } else { + value = nil + } + } else if let multiplier = self.multipliers[current] { + // Handles number multipliers (hundred, thousand, million, etc.) + // "hundred" -> "100", "million" -> "1000000" + if value == nil { + value = String(multiplier) + } else if let v = value, let f = Decimal(string: v) { + let p = f * Decimal(multiplier) + if p.isInteger { + // "one hundred" -> "100" + value = p.integerPart() + } else { + value = v + results.append(output(v)) + } + } else if let v = value { + // Handles complex cases with multiple multipliers + // "one thousand two hundred" -> "1200" + let before = Decimal(string: v)! / 1000 * 1000 + let residual = Decimal(string: v)!.remainder(dividingBy: 1000) + value = "\(before + residual * Decimal(multiplier))" + } + } else if let (multiplier, suffix) = self.multipliersSuffixed[current] { + // Handles ordinal and plural forms of multipliers + // "hundredth" -> "100th", "thousands" -> "1000s" + if value == nil { + results.append(output("\(multiplier)\(suffix)")) + } else if let v = value, let f = Decimal(string: v), (f * Decimal(multiplier)).isInteger { + let p = f * Decimal(multiplier) + results.append(output("\(p.integerPart())\(suffix)")) + } else if var value { + let before = Decimal(string: value)! 
/ 1000 * 1000 + let residual = Decimal(string: value)!.remainder(dividingBy: 1000) + value = "\(before + residual * Decimal(multiplier))" + results.append(output("\(value)\(suffix)")) + } + value = nil + } else if let prefixValue = self.precedingPrefixers[current] { + // Handles prefixes that come before numbers + // "minus 5" -> "-5", "plus 10" -> "+10" + if value != nil { + results.append(output(value!)) + } + if let next = next, self.words.contains(next) || nextIsNumeric { + prefix = prefixValue + } else { + results.append(output(current)) + } + } else if let prefixValue = self.followingPrefixers[current] { + // Handles currency words that come after numbers + // "5 dollars" -> "$5", "10 euros" -> "€10" + if value != nil { + prefix = prefixValue + results.append(output(value!)) + } else { + results.append(output(current)) + } + } else if let suffixValue = self.suffixers[current] { + // Handles various number suffixes + // "per cent" -> "%", "percent" -> "%" + if value != nil { + if let dictSuffixValue = suffixValue as? 
[String: String] { + if let n = next, let nextSuffix = dictSuffixValue[n] { + // Handles multi-word suffixes + // "5 per cent" -> "5%" + results.append(output("\(value!)\(nextSuffix)")) + skip = true + } else { + results.append(output(value!)) + results.append(output(current)) + } + } else { + results.append(output("\(value!)\(suffixValue)")) + } + } else { + results.append(output(current)) + } + } else if self.specials.contains(current) { + // Handles special cases: "and", "point", "double", "triple" + if let next, !self.words.contains(next) && !nextIsNumeric { + if let v = value { + results.append(output(v)) + } + results.append(output(current)) + } else if current == "and" { + // Handles "and" in number phrases + // "one hundred and twenty" -> "120" + if let prev, !self.multipliers.keys.contains(prev) { + if let v = value { + results.append(output(v)) + } + results.append(output(current)) + } + } else if current == "double" || current == "triple" { + // Handles repeated digits by repeating the digit string itself + // "double five" -> "55", "triple five" -> "555" + if let next, let ones = self.ones[next] { + let repeats = current == "double" ? 2 : 3 + value = (value ?? "") + String(repeating: String(ones), count: repeats) + skip = true + } else { + if let v = value { + results.append(output(v)) + } + results.append(output(current)) + } + } else if current == "point" { + // Handles decimal points in numbers + // "one point five" -> "1.5" + if let next, self.decimals.contains(next) || nextIsNumeric { + value = "\(value ?? "")."
+ } + } else { + fatalError("Unexpected token: \(current)") + } + } else { + fatalError("Unexpected token: \(current)") + } + } + if let v = value { + results.append(output(v)) + } + return results + } + + func preprocess(_ s: String) -> String { + var results = [String]() + + let segments = s.split(separator: "and a half", omittingEmptySubsequences: false) + for (i, segment) in segments.enumerated() { + let trimmedSegment = segment.trimmingCharacters(in: .whitespaces) + if trimmedSegment.isEmpty { + continue + } + + if i == segments.count - 1 { + results.append(String(trimmedSegment)) + } else { + results.append(String(trimmedSegment)) + let lastWord = trimmedSegment.split(separator: " ").last ?? "" + if decimals.contains(String(lastWord)) || multipliers.keys.contains(String(lastWord)) { + results.append("point five") + } else { + results.append("and a half") + } + } + } + + var processedString = results.joined(separator: " ") + + // Put a space at number/letter boundary + processedString = processedString.replacingOccurrences(of: #"([a-z])([0-9])"#, with: "$1 $2", options: .regularExpression) + processedString = processedString.replacingOccurrences(of: #"([0-9])([a-z])"#, with: "$1 $2", options: .regularExpression) + // Remove spaces which could be a suffix + processedString = processedString.replacingOccurrences(of: #"([0-9])\s+(st|nd|rd|th|s)\b"#, with: "$1$2", options: .regularExpression) + + return processedString + } + + func postprocess(_ s: String) -> String { + func combineCents(match: NSTextCheckingResult, in string: String) -> String { + guard let currencyRange = Range(match.range(at: 1), in: string), + let integerRange = Range(match.range(at: 2), in: string), + let centsRange = Range(match.range(at: 3), in: string) + else { + return String(string) + } + let currency = String(string[currencyRange]) + let integer = String(string[integerRange]) + let cents = Int(String(string[centsRange])) ?? 
0 + return "\(currency)\(integer).\(String(format: "%02d", cents))" + } + + func extractCents(match: NSTextCheckingResult, in string: String) -> String { + guard let centsRange = Range(match.range(at: 1), in: string) else { + return String(string) + } + let cents = Int(String(string[centsRange])) ?? 0 + return "¢\(cents)" + } + + var processedString = s + + // apply currency postprocessing; "$2 and ¢7" -> "$2.07" + do { + let regex1 = try NSRegularExpression(pattern: #"([€£$])([0-9]+) (?:and )?¢([0-9]{1,2})\b"#) + let matches1 = regex1.matches(in: processedString, range: NSRange(processedString.startIndex..., in: processedString)) + for match in matches1.reversed() { + let range = Range(match.range, in: processedString)! + let replacement = combineCents(match: match, in: processedString) + processedString.replaceSubrange(range, with: replacement) + } + } catch { + print("Error in regex: \(error)") + } + + do { + let regex2 = try NSRegularExpression(pattern: #"[€£$]0\.([0-9]{1,2})\b"#) + let matches2 = regex2.matches(in: processedString, range: NSRange(processedString.startIndex..., in: processedString)) + for match in matches2.reversed() { + let range = Range(match.range, in: processedString)! + let replacement = extractCents(match: match, in: processedString) + processedString.replaceSubrange(range, with: replacement) + } + } catch { + print("Error in regex: \(error)") + } + + // write "one(s)" instead of "1(s)", just for readability + processedString = processedString.replacingOccurrences(of: #"\b1(s?)\b"#, with: "one$1", options: .regularExpression) + + return processedString + } + + func normalize(_ text: String) -> String { + var s = self.preprocess(text) + let out = self.processWords(s.components(separatedBy: " ").filter { $0 != "" }) + s = out.joined(separator: " ") + s = self.postprocess(s) + return s + } +} + +class EnglishSpellingNormalizer { + // + // Applies British-American spelling mappings as listed in [1].
+ // [1] https://www.tysto.com/uk-us-spelling-list.html + + var mapping: [String: String] = [:] + + init(englishSpellingMapping: [String: String]) { + self.mapping = englishSpellingMapping + } + + func normalize(_ text: String) -> String { + let out = text.components(separatedBy: " ").map { self.mapping[$0] ?? $0 } + return out.joined(separator: " ") + } +} + +class EnglishTextNormalizer { + let numberNormalizer: EnglishNumberNormalizer + let spellingNormalizer: EnglishSpellingNormalizer + let ignorePatterns = #"\b(hmm|mm|mhm|mmm|uh|um)\b"# + let replacers: KeyValuePairs = [ + // common contractions + #"\bwon't\b"#: "will not", + #"\bcan't\b"#: "can not", + #"\blet's\b"#: "let us", + #"\bain't\b"#: "aint", + #"\by'all\b"#: "you all", + #"\bwanna\b"#: "want to", + #"\bgotta\b"#: "got to", + #"\bgonna\b"#: "going to", + #"\bi'ma\b"#: "i am going to", + #"\bimma\b"#: "i am going to", + #"\bwoulda\b"#: "would have", + #"\bcoulda\b"#: "could have", + #"\bshoulda\b"#: "should have", + #"\bma'am\b"#: "madam", + // contractions in titles/prefixes + #"\bmr\b"#: "mister ", + #"\bmrs\b"#: "missus ", + #"\bst\b"#: "saint ", + #"\bdr\b"#: "doctor ", + #"\bprof\b"#: "professor ", + #"\bcapt\b"#: "captain ", + #"\bgov\b"#: "governor ", + #"\bald\b"#: "alderman ", + #"\bgen\b"#: "general ", + #"\bsen\b"#: "senator ", + #"\brep\b"#: "representative ", + #"\bpres\b"#: "president ", + #"\brev\b"#: "reverend ", + #"\bhon\b"#: "honorable ", + #"\basst\b"#: "assistant ", + #"\bassoc\b"#: "associate ", + #"\blt\b"#: "lieutenant ", + #"\bcol\b"#: "colonel ", + #"\bjr\b"#: "junior ", + #"\bsr\b"#: "senior ", + #"\besq\b"#: "esquire ", + // perfect tenses, ideally it should be any past participles, but it's harder..
+ #"'d been\b"#: " had been", + #"'s been\b"#: " has been", + #"'d gone\b"#: " had gone", + #"'s gone\b"#: " has gone", + #"'d done\b"#: " had done", // "'s done" is ambiguous + #"'s got\b"#: " has got", + // general contractions + #"n't\b"#: " not", + #"'re\b"#: " are", + #"'s\b"#: " is", + #"'d\b"#: " would", + #"'ll\b"#: " will", + #"'t\b"#: " not", + #"'ve\b"#: " have", + #"'m\b"#: " am", + ] + /// non-ASCII letters that are not separated by "NFKD" normalization + let ADDITIONAL_DIACRITICS = [ + "œ": "oe", + "Œ": "OE", + "ø": "o", + "Ø": "O", + "æ": "ae", + "Æ": "AE", + "ß": "ss", + "ẞ": "SS", + "đ": "d", + "Đ": "D", + "ð": "d", + "Ð": "D", + "þ": "th", + "Þ": "th", + "ł": "l", + "Ł": "L", + ] + + init() { + self.numberNormalizer = EnglishNumberNormalizer() + self.spellingNormalizer = EnglishSpellingNormalizer(englishSpellingMapping: englishSpellingMappingAbbr) + } + + func normalize(text: String) -> String { + var processedText = text + + // escape unicode from json + processedText = unescapeJSONUnicode(processedText) + + // lowercase + processedText = processedText.lowercased() + + // remove words between brackets + processedText.regReplace(pattern: #"[<\[][^>\]]*[>\]]"#, replaceWith: "") + // remove words between parenthesis + processedText.regReplace(pattern: #"\(([^)]+?)\)"#, replaceWith: "") + processedText.regReplace(pattern: self.ignorePatterns, replaceWith: "") + // standardize when there's a space before an apostrophe + processedText.regReplace(pattern: #"\s+'"#, replaceWith: "'") + + for (pattern, replacement) in self.replacers { + processedText.regReplace(pattern: pattern, replaceWith: replacement) + } + + // remove commas between digits + processedText.regReplace(pattern: #"(\d),(\d)"#, replaceWith: #"$1$2"#) + // remove periods not followed by numbers + processedText.regReplace(pattern: #"\.([^0-9]|$)"#, replaceWith: " $1") + // keep some symbols for numerics + processedText = self.removeSymbolsAndDiacritics(text: processedText, keep: ".%$¢€£") + 
processedText = self.numberNormalizer.normalize(processedText) + processedText = self.spellingNormalizer.normalize(processedText) + + // now remove prefix/suffix symbols that are not preceded/followed by numbers + processedText.regReplace(pattern: #"[.$¢€£]([^0-9])"#, replaceWith: #" $1"#) + processedText.regReplace(pattern: #"([^0-9])%"#, replaceWith: #"$1 "#) + // replace any successive whitespace characters with a space + processedText.regReplace(pattern: #"\s+"#, replaceWith: " ") + + return processedText + } + + func removeSymbolsAndDiacritics(text: String, keep: String = "") -> String { + // Replace any other markers, symbols, and punctuations with a space, and drop any diacritics + // (category 'Mn' and some manual mappings) + let keepSet = Set(keep) + let categoriesToReplaceWithSpace: [Unicode.GeneralCategory] = [ + .nonspacingMark, + .spacingMark, + .enclosingMark, + .mathSymbol, + .otherSymbol, + .currencySymbol, + .modifierSymbol, + .dashPunctuation, + .openPunctuation, + .closePunctuation, + .finalPunctuation, + .otherPunctuation, + .initialPunctuation, + .connectorPunctuation, + ] + + func replaceCharacter(char: Character) -> String { + if keepSet.contains(char) { + return String(char) + } else if self.ADDITIONAL_DIACRITICS.keys.contains(String(char)) { + return self.ADDITIONAL_DIACRITICS[String(char)]! + } else if unicodeCategoryFor(char: char) == Unicode.GeneralCategory.nonspacingMark { + return "" + } else if let category = unicodeCategoryFor(char: char), categoriesToReplaceWithSpace.contains(category) { + return " " + } + return String(char) + } + + func unicodeCategoryFor(char: Character) -> Unicode.GeneralCategory? 
{ + guard let scalar = char.unicodeScalars.first else { return nil } + return scalar.properties.generalCategory + } + + if let normalizedString = text.applyingTransform(StringTransform(rawValue: "NFKD"), reverse: false) { + let out = normalizedString.map { replaceCharacter(char: $0) } + return out.joined(separator: "") + } + return text + } +} + +private extension String { + mutating func regReplace(pattern: String, replaceWith: String = "") { + do { + let regex = try NSRegularExpression(pattern: pattern, options: [.caseInsensitive, .anchorsMatchLines]) + let range = NSRange(self.startIndex..., in: self) + self = regex.stringByReplacingMatches(in: self, options: [], range: range, withTemplate: replaceWith) + } catch { return } + } +} + +private func unescapeJSONUnicode(_ text: String) -> String { + let regex = try! NSRegularExpression(pattern: "\\\\u([0-9A-Fa-f]{4})") + let nsString = text as NSString + let range = NSRange(location: 0, length: nsString.length) + var result = text + + regex.enumerateMatches(in: text, options: [], range: range) { match, _, _ in + if let match = match, let range = Range(match.range(at: 1), in: text) { + let hexString = String(text[range]) + if let unicodeScalar = UnicodeScalar(Int(hexString, radix: 16)!) 
{ + let replacement = String(unicodeScalar) + result = result.replacingOccurrences(of: "\\u\(hexString)", with: replacement) + } + } + } + + return result +} + +private extension Double { + func isDenominatorCloseToOne(tolerance: Double = 1e-9) -> Bool { + let fractionalPart = self - floor(self) + return fractionalPart < tolerance || fractionalPart > (1 - tolerance) + } +} + +private extension Decimal { + var isInteger: Bool { + return self == self.floored() + } + + func floored() -> Decimal { + let nsDecimalNumber = NSDecimalNumber(decimal: self) + let flooredNumber = nsDecimalNumber.rounding( + accordingToBehavior: NSDecimalNumberHandler( + roundingMode: .down, + scale: 0, + raiseOnExactness: false, + raiseOnOverflow: false, + raiseOnUnderflow: false, + raiseOnDivideByZero: false + ) + ) + return flooredNumber.decimalValue + } + + func toString() -> String { + return "\(self)" + } + + func integerPart() -> String { + return String(self.toString().split(separator: ".").first ?? "0") + } + + func remainder(dividingBy divisor: Decimal) -> Decimal { + let decimalNumber = NSDecimalNumber(decimal: self) + let divisorNumber = NSDecimalNumber(decimal: divisor) + + let quotient = decimalNumber.dividing(by: divisorNumber, withBehavior: nil) + let roundedQuotient = quotient.rounding(accordingToBehavior: NSDecimalNumberHandler(roundingMode: .down, scale: 0, raiseOnExactness: false, raiseOnOverflow: false, raiseOnUnderflow: false, raiseOnDivideByZero: false)) + + let product = roundedQuotient.multiplying(by: divisorNumber) + let remainder = decimalNumber.subtracting(product) + + return remainder.decimalValue + } +} diff --git a/Tests/WhisperKitTests/Evaluate/SpellingMapping.swift b/Tests/WhisperKitTests/Evaluate/SpellingMapping.swift new file mode 100644 index 0000000..5a06713 --- /dev/null +++ b/Tests/WhisperKitTests/Evaluate/SpellingMapping.swift @@ -0,0 +1,1746 @@ +// For licensing see accompanying LICENSE.md file. +// Copyright © 2024 Argmax, Inc. All rights reserved. 
+ +/// https://github.com/argmaxinc/whisperkittools/blob/main/whisperkit/evaluate/abbreviations_en.py See abbr +let englishSpellingMappingAbbr = [ + "accessorise": "accessorize", + "accessorised": "accessorized", + "accessorises": "accessorizes", + "accessorising": "accessorizing", + "acclimatisation": "acclimatization", + "acclimatise": "acclimatize", + "acclimatised": "acclimatized", + "acclimatises": "acclimatizes", + "acclimatising": "acclimatizing", + "accoutrements": "accouterments", + "aeon": "eon", + "aeons": "eons", + "aerogramme": "aerogram", + "aerogrammes": "aerograms", + "aeroplane": "airplane", + "aeroplanes": "airplanes", + "aesthete": "esthete", + "aesthetes": "esthetes", + "aesthetic": "esthetic", + "aesthetically": "esthetically", + "aesthetics": "esthetics", + "aetiology": "etiology", + "ageing": "aging", + "aggrandisement": "aggrandizement", + "agonise": "agonize", + "agonised": "agonized", + "agonises": "agonizes", + "agonising": "agonizing", + "agonisingly": "agonizingly", + "almanack": "almanac", + "almanacks": "almanacs", + "aluminium": "aluminum", + "amortisable": "amortizable", + "amortisation": "amortization", + "amortisations": "amortizations", + "amortise": "amortize", + "amortised": "amortized", + "amortises": "amortizes", + "amortising": "amortizing", + "amphitheatre": "amphitheater", + "amphitheatres": "amphitheaters", + "anaemia": "anemia", + "anaemic": "anemic", + "anaesthesia": "anesthesia", + "anaesthetic": "anesthetic", + "anaesthetics": "anesthetics", + "anaesthetise": "anesthetize", + "anaesthetised": "anesthetized", + "anaesthetises": "anesthetizes", + "anaesthetising": "anesthetizing", + "anaesthetist": "anesthetist", + "anaesthetists": "anesthetists", + "anaesthetize": "anesthetize", + "anaesthetized": "anesthetized", + "anaesthetizes": "anesthetizes", + "anaesthetizing": "anesthetizing", + "analogue": "analog", + "analogues": "analogs", + "analyse": "analyze", + "analysed": "analyzed", + "analyses": "analyzes", + 
"analysing": "analyzing", + "anglicise": "anglicize", + "anglicised": "anglicized", + "anglicises": "anglicizes", + "anglicising": "anglicizing", + "annualised": "annualized", + "antagonise": "antagonize", + "antagonised": "antagonized", + "antagonises": "antagonizes", + "antagonising": "antagonizing", + "apologise": "apologize", + "apologised": "apologized", + "apologises": "apologizes", + "apologising": "apologizing", + "appal": "appall", + "appals": "appalls", + "appetiser": "appetizer", + "appetisers": "appetizers", + "appetising": "appetizing", + "appetisingly": "appetizingly", + "arbour": "arbor", + "arbours": "arbors", + "archaeologically": "archeologically", + "archaeologist": "archeologist", + "archaeologists": "archeologists", + "archaeology": "archeology", + "archeological": "archaeological", + "ardour": "ardor", + "armour": "armor", + "armoured": "armored", + "armourer": "armorer", + "armourers": "armorers", + "armouries": "armories", + "armoury": "armory", + "artefact": "artifact", + "artefacts": "artifacts", + "authorise": "authorize", + "authorised": "authorized", + "authorises": "authorizes", + "authorising": "authorizing", + "axe": "ax", + "backpedalled": "backpedaled", + "backpedalling": "backpedaling", + "bannister": "banister", + "bannisters": "banisters", + "baptise": "baptize", + "baptised": "baptized", + "baptises": "baptizes", + "baptising": "baptizing", + "bastardise": "bastardize", + "bastardised": "bastardized", + "bastardises": "bastardizes", + "bastardising": "bastardizing", + "battleax": "battleaxe", + "baulk": "balk", + "baulked": "balked", + "baulking": "balking", + "baulks": "balks", + "bedevilled": "bedeviled", + "bedevilling": "bedeviling", + "behaviour": "behavior", + "behavioural": "behavioral", + "behaviourism": "behaviorism", + "behaviourist": "behaviorist", + "behaviourists": "behaviorists", + "behaviours": "behaviors", + "behove": "behoove", + "behoved": "behooved", + "behoves": "behooves", + "bejewelled": "bejeweled", + 
"belabour": "belabor", + "belaboured": "belabored", + "belabouring": "belaboring", + "belabours": "belabors", + "bevelled": "beveled", + "bevvies": "bevies", + "bevvy": "bevy", + "biassed": "biased", + "biassing": "biasing", + "bingeing": "binging", + "bougainvillaea": "bougainvillea", + "bougainvillaeas": "bougainvilleas", + "bowdlerise": "bowdlerize", + "bowdlerised": "bowdlerized", + "bowdlerises": "bowdlerizes", + "bowdlerising": "bowdlerizing", + "breathalyse": "breathalyze", + "breathalysed": "breathalyzed", + "breathalyser": "breathalyzer", + "breathalysers": "breathalyzers", + "breathalyses": "breathalyzes", + "breathalysing": "breathalyzing", + "brutalise": "brutalize", + "brutalised": "brutalized", + "brutalises": "brutalizes", + "brutalising": "brutalizing", + "busses": "buses", + "bussing": "busing", + "caesarean": "cesarean", + "caesareans": "cesareans", + "calibre": "caliber", + "calibres": "calibers", + "calliper": "caliper", + "callipers": "calipers", + "callisthenics": "calisthenics", + "canalise": "canalize", + "canalised": "canalized", + "canalises": "canalizes", + "canalising": "canalizing", + "cancelation": "cancellation", + "cancelations": "cancellations", + "cancelled": "canceled", + "cancelling": "canceling", + "candour": "candor", + "cannibalise": "cannibalize", + "cannibalised": "cannibalized", + "cannibalises": "cannibalizes", + "cannibalising": "cannibalizing", + "canonise": "canonize", + "canonised": "canonized", + "canonises": "canonizes", + "canonising": "canonizing", + "capitalise": "capitalize", + "capitalised": "capitalized", + "capitalises": "capitalizes", + "capitalising": "capitalizing", + "caramelise": "caramelize", + "caramelised": "caramelized", + "caramelises": "caramelizes", + "caramelising": "caramelizing", + "carbonise": "carbonize", + "carbonised": "carbonized", + "carbonises": "carbonizes", + "carbonising": "carbonizing", + "carolled": "caroled", + "carolling": "caroling", + "catalogue": "catalog", + "catalogued": 
"cataloged", + "catalogues": "catalogs", + "cataloguing": "cataloging", + "catalyse": "catalyze", + "catalysed": "catalyzed", + "catalyses": "catalyzes", + "catalysing": "catalyzing", + "categorise": "categorize", + "categorised": "categorized", + "categorises": "categorizes", + "categorising": "categorizing", + "cauterise": "cauterize", + "cauterised": "cauterized", + "cauterises": "cauterizes", + "cauterising": "cauterizing", + "cavilled": "caviled", + "cavilling": "caviling", + "centigramme": "centigram", + "centigrammes": "centigrams", + "centilitre": "centiliter", + "centilitres": "centiliters", + "centimetre": "centimeter", + "centimetres": "centimeters", + "centralise": "centralize", + "centralised": "centralized", + "centralises": "centralizes", + "centralising": "centralizing", + "centre": "center", + "centred": "centered", + "centrefold": "centerfold", + "centrefolds": "centerfolds", + "centrepiece": "centerpiece", + "centrepieces": "centerpieces", + "centres": "centers", + "channelled": "channeled", + "channelling": "channeling", + "characterise": "characterize", + "characterised": "characterized", + "characterises": "characterizes", + "characterising": "characterizing", + "cheque": "check", + "chequebook": "checkbook", + "chequebooks": "checkbooks", + "chequered": "checkered", + "cheques": "checks", + "chilli": "chili", + "chimaera": "chimera", + "chimaeras": "chimeras", + "chiselled": "chiseled", + "chiselling": "chiseling", + "circularise": "circularize", + "circularised": "circularized", + "circularises": "circularizes", + "circularising": "circularizing", + "civilise": "civilize", + "civilised": "civilized", + "civilises": "civilizes", + "civilising": "civilizing", + "clamour": "clamor", + "clamoured": "clamored", + "clamouring": "clamoring", + "clamours": "clamors", + "clangour": "clangor", + "clarinettist": "clarinetist", + "clarinettists": "clarinetists", + "collectivise": "collectivize", + "collectivised": "collectivized", + "collectivises": 
"collectivizes", + "collectivising": "collectivizing", + "colonisation": "colonization", + "colonise": "colonize", + "colonised": "colonized", + "coloniser": "colonizer", + "colonisers": "colonizers", + "colonises": "colonizes", + "colonising": "colonizing", + "colour": "color", + "colourant": "colorant", + "colourants": "colorants", + "coloured": "colored", + "coloureds": "coloreds", + "colourful": "colorful", + "colourfully": "colorfully", + "colouring": "coloring", + "colourize": "colorize", + "colourized": "colorized", + "colourizes": "colorizes", + "colourizing": "colorizing", + "colourless": "colorless", + "colours": "colors", + "commercialise": "commercialize", + "commercialised": "commercialized", + "commercialises": "commercializes", + "commercialising": "commercializing", + "compartmentalise": "compartmentalize", + "compartmentalised": "compartmentalized", + "compartmentalises": "compartmentalizes", + "compartmentalising": "compartmentalizing", + "computerise": "computerize", + "computerised": "computerized", + "computerises": "computerizes", + "computerising": "computerizing", + "conceptualise": "conceptualize", + "conceptualised": "conceptualized", + "conceptualises": "conceptualizes", + "conceptualising": "conceptualizing", + "connexion": "connection", + "connexions": "connections", + "contextualise": "contextualize", + "contextualised": "contextualized", + "contextualises": "contextualizes", + "contextualising": "contextualizing", + "cosier": "cozier", + "cosies": "cozies", + "cosiest": "coziest", + "cosily": "cozily", + "cosiness": "coziness", + "cosy": "cozy", + "councillor": "councilor", + "councillors": "councilors", + "counselled": "counseled", + "counselling": "counseling", + "counsellor": "counselor", + "counsellors": "counselors", + "crenelated": "crenellated", + "criminalise": "criminalize", + "criminalised": "criminalized", + "criminalises": "criminalizes", + "criminalising": "criminalizing", + "criticise": "criticize", + "criticised": 
"criticized", + "criticises": "criticizes", + "criticising": "criticizing", + "crueller": "crueler", + "cruellest": "cruelest", + "crystallisation": "crystallization", + "crystallise": "crystallize", + "crystallised": "crystallized", + "crystallises": "crystallizes", + "crystallising": "crystallizing", + "cudgelled": "cudgeled", + "cudgelling": "cudgeling", + "customise": "customize", + "customised": "customized", + "customises": "customizes", + "customising": "customizing", + "cypher": "cipher", + "cyphers": "ciphers", + "decentralisation": "decentralization", + "decentralise": "decentralize", + "decentralised": "decentralized", + "decentralises": "decentralizes", + "decentralising": "decentralizing", + "decriminalisation": "decriminalization", + "decriminalise": "decriminalize", + "decriminalised": "decriminalized", + "decriminalises": "decriminalizes", + "decriminalising": "decriminalizing", + "defence": "defense", + "defenceless": "defenseless", + "defences": "defenses", + "dehumanisation": "dehumanization", + "dehumanise": "dehumanize", + "dehumanised": "dehumanized", + "dehumanises": "dehumanizes", + "dehumanising": "dehumanizing", + "demeanour": "demeanor", + "demilitarisation": "demilitarization", + "demilitarise": "demilitarize", + "demilitarised": "demilitarized", + "demilitarises": "demilitarizes", + "demilitarising": "demilitarizing", + "demobilisation": "demobilization", + "demobilise": "demobilize", + "demobilised": "demobilized", + "demobilises": "demobilizes", + "demobilising": "demobilizing", + "democratisation": "democratization", + "democratise": "democratize", + "democratised": "democratized", + "democratises": "democratizes", + "democratising": "democratizing", + "demonise": "demonize", + "demonised": "demonized", + "demonises": "demonizes", + "demonising": "demonizing", + "demoralisation": "demoralization", + "demoralise": "demoralize", + "demoralised": "demoralized", + "demoralises": "demoralizes", + "demoralising": "demoralizing", + 
"denationalisation": "denationalization", + "denationalise": "denationalize", + "denationalised": "denationalized", + "denationalises": "denationalizes", + "denationalising": "denationalizing", + "deodorise": "deodorize", + "deodorised": "deodorized", + "deodorises": "deodorizes", + "deodorising": "deodorizing", + "depersonalise": "depersonalize", + "depersonalised": "depersonalized", + "depersonalises": "depersonalizes", + "depersonalising": "depersonalizing", + "deputise": "deputize", + "deputised": "deputized", + "deputises": "deputizes", + "deputising": "deputizing", + "desensitisation": "desensitization", + "desensitise": "desensitize", + "desensitised": "desensitized", + "desensitises": "desensitizes", + "desensitising": "desensitizing", + "destabilisation": "destabilization", + "destabilise": "destabilize", + "destabilised": "destabilized", + "destabilises": "destabilizes", + "destabilising": "destabilizing", + "dialled": "dialed", + "dialling": "dialing", + "dialogue": "dialog", + "dialogues": "dialogs", + "diarrhoea": "diarrhea", + "digitise": "digitize", + "digitised": "digitized", + "digitises": "digitizes", + "digitising": "digitizing", + "disc": "disk", + "discolour": "discolor", + "discoloured": "discolored", + "discolouring": "discoloring", + "discolours": "discolors", + "discs": "disks", + "disembowelled": "disemboweled", + "disembowelling": "disemboweling", + "disfavour": "disfavor", + "dishevelled": "disheveled", + "dishonour": "dishonor", + "dishonourable": "dishonorable", + "dishonourably": "dishonorably", + "dishonoured": "dishonored", + "dishonouring": "dishonoring", + "dishonours": "dishonors", + "disorganisation": "disorganization", + "disorganised": "disorganized", + "distil": "distill", + "distils": "distills", + "dramatisation": "dramatization", + "dramatisations": "dramatizations", + "dramatise": "dramatize", + "dramatised": "dramatized", + "dramatises": "dramatizes", + "dramatising": "dramatizing", + "draught": "draft", + 
"draughtboard": "draftboard", + "draughtboards": "draftboards", + "draughtier": "draftier", + "draughtiest": "draftiest", + "draughts": "drafts", + "draughtsman": "draftsman", + "draughtsmanship": "draftsmanship", + "draughtsmen": "draftsmen", + "draughtswoman": "draftswoman", + "draughtswomen": "draftswomen", + "draughty": "drafty", + "drivelled": "driveled", + "drivelling": "driveling", + "duelled": "dueled", + "duelling": "dueling", + "economise": "economize", + "economised": "economized", + "economises": "economizes", + "economising": "economizing", + "editorialise": "editorialize", + "editorialised": "editorialized", + "editorialises": "editorializes", + "editorialising": "editorializing", + "edoema": "edema", + "empathise": "empathize", + "empathised": "empathized", + "empathises": "empathizes", + "empathising": "empathizing", + "emphasise": "emphasize", + "emphasised": "emphasized", + "emphasises": "emphasizes", + "emphasising": "emphasizing", + "enamelled": "enameled", + "enamelling": "enameling", + "enamoured": "enamored", + "encyclopaedia": "encyclopedia", + "encyclopaedias": "encyclopedias", + "encyclopaedic": "encyclopedic", + "endeavour": "endeavor", + "endeavoured": "endeavored", + "endeavouring": "endeavoring", + "endeavours": "endeavors", + "energise": "energize", + "energised": "energized", + "energises": "energizes", + "energising": "energizing", + "enrol": "enroll", + "enrols": "enrolls", + "enthral": "enthrall", + "enthrals": "enthralls", + "epaulette": "epaulet", + "epaulettes": "epaulets", + "epicentre": "epicenter", + "epicentres": "epicenters", + "epilogue": "epilog", + "epilogues": "epilogs", + "epitomise": "epitomize", + "epitomised": "epitomized", + "epitomises": "epitomizes", + "epitomising": "epitomizing", + "equalisation": "equalization", + "equalise": "equalize", + "equalised": "equalized", + "equaliser": "equalizer", + "equalisers": "equalizers", + "equalises": "equalizes", + "equalising": "equalizing", + "eulogise": "eulogize", + 
"eulogised": "eulogized", + "eulogises": "eulogizes", + "eulogising": "eulogizing", + "evangelise": "evangelize", + "evangelised": "evangelized", + "evangelises": "evangelizes", + "evangelising": "evangelizing", + "exorcise": "exorcize", + "exorcised": "exorcized", + "exorcises": "exorcizes", + "exorcising": "exorcizing", + "extemporisation": "extemporization", + "extemporise": "extemporize", + "extemporised": "extemporized", + "extemporises": "extemporizes", + "extemporising": "extemporizing", + "externalisation": "externalization", + "externalisations": "externalizations", + "externalise": "externalize", + "externalised": "externalized", + "externalises": "externalizes", + "externalising": "externalizing", + "factorise": "factorize", + "factorised": "factorized", + "factorises": "factorizes", + "factorising": "factorizing", + "faecal": "fecal", + "faeces": "feces", + "familiarisation": "familiarization", + "familiarise": "familiarize", + "familiarised": "familiarized", + "familiarises": "familiarizes", + "familiarising": "familiarizing", + "fantasise": "fantasize", + "fantasised": "fantasized", + "fantasises": "fantasizes", + "fantasising": "fantasizing", + "favour": "favor", + "favourable": "favorable", + "favourably": "favorably", + "favoured": "favored", + "favouring": "favoring", + "favourite": "favorite", + "favourites": "favorites", + "favouritism": "favoritism", + "favours": "favors", + "feminise": "feminize", + "feminised": "feminized", + "feminises": "feminizes", + "feminising": "feminizing", + "fertilisation": "fertilization", + "fertilise": "fertilize", + "fertilised": "fertilized", + "fertiliser": "fertilizer", + "fertilisers": "fertilizers", + "fertilises": "fertilizes", + "fertilising": "fertilizing", + "fervour": "fervor", + "fibre": "fiber", + "fibreglass": "fiberglass", + "fibres": "fibers", + "fictionalisation": "fictionalization", + "fictionalisations": "fictionalizations", + "fictionalise": "fictionalize", + "fictionalised": "fictionalized", + 
"fictionalises": "fictionalizes", + "fictionalising": "fictionalizing", + "fillet": "filet", + "filleted": "fileted", + "filleting": "fileting", + "fillets": "filets", + "finalisation": "finalization", + "finalise": "finalize", + "finalised": "finalized", + "finalises": "finalizes", + "finalising": "finalizing", + "flautist": "flutist", + "flautists": "flutists", + "flavour": "flavor", + "flavoured": "flavored", + "flavouring": "flavoring", + "flavourings": "flavorings", + "flavourless": "flavorless", + "flavours": "flavors", + "flavoursome": "flavorsome", + "flyer / flier": "flier / flyer", + "foetal": "fetal", + "foetid": "fetid", + "foetus": "fetus", + "foetuses": "fetuses", + "formalisation": "formalization", + "formalise": "formalize", + "formalised": "formalized", + "formalises": "formalizes", + "formalising": "formalizing", + "fossilisation": "fossilization", + "fossilise": "fossilize", + "fossilised": "fossilized", + "fossilises": "fossilizes", + "fossilising": "fossilizing", + "fraternisation": "fraternization", + "fraternise": "fraternize", + "fraternised": "fraternized", + "fraternises": "fraternizes", + "fraternising": "fraternizing", + "fulfil": "fulfill", + "fulfilment": "fulfillment", + "fulfils": "fulfills", + "funnelled": "funneled", + "funnelling": "funneling", + "gage": "gauge", + "gaged": "gauged", + "gages": "gauges", + "gaging": "gauging", + "galvanise": "galvanize", + "galvanised": "galvanized", + "galvanises": "galvanizes", + "galvanising": "galvanizing", + "gambolled": "gamboled", + "gambolling": "gamboling", + "gaol": "jail", + "gaolbird": "jailbird", + "gaolbirds": "jailbirds", + "gaolbreak": "jailbreak", + "gaolbreaks": "jailbreaks", + "gaoled": "jailed", + "gaoler": "jailer", + "gaolers": "jailers", + "gaoling": "jailing", + "gaols": "jails", + "gasses": "gases", + "generalisation": "generalization", + "generalisations": "generalizations", + "generalise": "generalize", + "generalised": "generalized", + "generalises": "generalizes", + 
"generalising": "generalizing", + "ghettoise": "ghettoize", + "ghettoised": "ghettoized", + "ghettoises": "ghettoizes", + "ghettoising": "ghettoizing", + "gipsies": "gypsies", + "glamor": "glamour", + "glamorise": "glamorize", + "glamorised": "glamorized", + "glamorises": "glamorizes", + "glamorising": "glamorizing", + "globalisation": "globalization", + "globalise": "globalize", + "globalised": "globalized", + "globalises": "globalizes", + "globalising": "globalizing", + "glueing": "gluing", + "goitre": "goiter", + "goitres": "goiters", + "gonorrhoea": "gonorrhea", + "gramme": "gram", + "grammes": "grams", + "gravelled": "graveled", + "grey": "gray", + "greyed": "grayed", + "greying": "graying", + "greyish": "grayish", + "greyness": "grayness", + "greys": "grays", + "grovelled": "groveled", + "grovelling": "groveling", + "groyne": "groin", + "groynes": "groins", + "gruelling": "grueling", + "gruellingly": "gruelingly", + "gryphon": "griffin", + "gryphons": "griffins", + "gynaecological": "gynecological", + "gynaecologist": "gynecologist", + "gynaecologists": "gynecologists", + "gynaecology": "gynecology", + "haematological": "hematological", + "haematologist": "hematologist", + "haematologists": "hematologists", + "haematology": "hematology", + "haemoglobin": "hemoglobin", + "haemophilia": "hemophilia", + "haemophiliac": "hemophiliac", + "haemophiliacs": "hemophiliacs", + "haemorrhage": "hemorrhage", + "haemorrhaged": "hemorrhaged", + "haemorrhages": "hemorrhages", + "haemorrhaging": "hemorrhaging", + "haemorrhoids": "hemorrhoids", + "harbour": "harbor", + "harboured": "harbored", + "harbouring": "harboring", + "harbours": "harbors", + "harmonisation": "harmonization", + "harmonise": "harmonize", + "harmonised": "harmonized", + "harmonises": "harmonizes", + "harmonising": "harmonizing", + "homoeopath": "homeopath", + "homoeopathic": "homeopathic", + "homoeopaths": "homeopaths", + "homoeopathy": "homeopathy", + "homogenise": "homogenize", + "homogenised": 
"homogenized", + "homogenises": "homogenizes", + "homogenising": "homogenizing", + "honour": "honor", + "honourable": "honorable", + "honourably": "honorably", + "honoured": "honored", + "honouring": "honoring", + "honours": "honors", + "hospitalisation": "hospitalization", + "hospitalise": "hospitalize", + "hospitalised": "hospitalized", + "hospitalises": "hospitalizes", + "hospitalising": "hospitalizing", + "humanise": "humanize", + "humanised": "humanized", + "humanises": "humanizes", + "humanising": "humanizing", + "humour": "humor", + "humoured": "humored", + "humouring": "humoring", + "humourless": "humorless", + "humours": "humors", + "hybridise": "hybridize", + "hybridised": "hybridized", + "hybridises": "hybridizes", + "hybridising": "hybridizing", + "hypnotise": "hypnotize", + "hypnotised": "hypnotized", + "hypnotises": "hypnotizes", + "hypnotising": "hypnotizing", + "hypothesise": "hypothesize", + "hypothesised": "hypothesized", + "hypothesises": "hypothesizes", + "hypothesising": "hypothesizing", + "idealisation": "idealization", + "idealise": "idealize", + "idealised": "idealized", + "idealises": "idealizes", + "idealising": "idealizing", + "idolise": "idolize", + "idolised": "idolized", + "idolises": "idolizes", + "idolising": "idolizing", + "immobilisation": "immobilization", + "immobilise": "immobilize", + "immobilised": "immobilized", + "immobiliser": "immobilizer", + "immobilisers": "immobilizers", + "immobilises": "immobilizes", + "immobilising": "immobilizing", + "immortalise": "immortalize", + "immortalised": "immortalized", + "immortalises": "immortalizes", + "immortalising": "immortalizing", + "immunisation": "immunization", + "immunise": "immunize", + "immunised": "immunized", + "immunises": "immunizes", + "immunising": "immunizing", + "impanelled": "impaneled", + "impanelling": "impaneling", + "imperilled": "imperiled", + "imperilling": "imperiling", + "individualise": "individualize", + "individualised": "individualized", + 
"individualises": "individualizes", + "individualising": "individualizing", + "industrialise": "industrialize", + "industrialised": "industrialized", + "industrialises": "industrializes", + "industrialising": "industrializing", + "inflexion": "inflection", + "inflexions": "inflections", + "initialise": "initialize", + "initialised": "initialized", + "initialises": "initializes", + "initialising": "initializing", + "initialled": "initialed", + "initialling": "initialing", + "instal": "install", + "instalment": "installment", + "instalments": "installments", + "instals": "installs", + "instil": "instill", + "instils": "instills", + "institutionalisation": "institutionalization", + "institutionalise": "institutionalize", + "institutionalised": "institutionalized", + "institutionalises": "institutionalizes", + "institutionalising": "institutionalizing", + "intellectualise": "intellectualize", + "intellectualised": "intellectualized", + "intellectualises": "intellectualizes", + "intellectualising": "intellectualizing", + "internalisation": "internalization", + "internalise": "internalize", + "internalised": "internalized", + "internalises": "internalizes", + "internalising": "internalizing", + "internationalisation": "internationalization", + "internationalise": "internationalize", + "internationalised": "internationalized", + "internationalises": "internationalizes", + "internationalising": "internationalizing", + "ionisation": "ionization", + "ionise": "ionize", + "ionised": "ionized", + "ioniser": "ionizer", + "ionisers": "ionizers", + "ionises": "ionizes", + "ionising": "ionizing", + "italicise": "italicize", + "italicised": "italicized", + "italicises": "italicizes", + "italicising": "italicizing", + "itemise": "itemize", + "itemised": "itemized", + "itemises": "itemizes", + "itemising": "itemizing", + "jeopardise": "jeopardize", + "jeopardised": "jeopardized", + "jeopardises": "jeopardizes", + "jeopardising": "jeopardizing", + "jewelled": "jeweled", + "jeweller": 
"jeweler", + "jewellers": "jewelers", + "jewellery": "jewelry", + "judgement": "judgment", + "kilogramme": "kilogram", + "kilogrammes": "kilograms", + "kilometre": "kilometer", + "kilometres": "kilometers", + "labelled": "labeled", + "labelling": "labeling", + "labour": "labor", + "laboured": "labored", + "labourer": "laborer", + "labourers": "laborers", + "labouring": "laboring", + "labours": "labors", + "lacklustre": "lackluster", + "legalisation": "legalization", + "legalise": "legalize", + "legalised": "legalized", + "legalises": "legalizes", + "legalising": "legalizing", + "legitimise": "legitimize", + "legitimised": "legitimized", + "legitimises": "legitimizes", + "legitimising": "legitimizing", + "leukaemia": "leukemia", + "levelled": "leveled", + "leveller": "leveler", + "levellers": "levelers", + "levelling": "leveling", + "libelled": "libeled", + "libelling": "libeling", + "libellous": "libelous", + "liberalisation": "liberalization", + "liberalise": "liberalize", + "liberalised": "liberalized", + "liberalises": "liberalizes", + "liberalising": "liberalizing", + "licence": "license", + "licenced": "licensed", + "licences": "licenses", + "licencing": "licensing", + "likeable": "likable", + "lionisation": "lionization", + "lionise": "lionize", + "lionised": "lionized", + "lionises": "lionizes", + "lionising": "lionizing", + "liquidise": "liquidize", + "liquidised": "liquidized", + "liquidiser": "liquidizer", + "liquidisers": "liquidizers", + "liquidises": "liquidizes", + "liquidising": "liquidizing", + "litre": "liter", + "litres": "liters", + "localise": "localize", + "localised": "localized", + "localises": "localizes", + "localising": "localizing", + "louvre": "louver", + "louvred": "louvered", + "louvres": "louvers", + "lustre": "luster", + "magnetise": "magnetize", + "magnetised": "magnetized", + "magnetises": "magnetizes", + "magnetising": "magnetizing", + "manoeuvrability": "maneuverability", + "manoeuvrable": "maneuverable", + "manoeuvre": 
"maneuver", + "manoeuvred": "maneuvered", + "manoeuvres": "maneuvers", + "manoeuvring": "maneuvering", + "manoeuvrings": "maneuverings", + "marginalisation": "marginalization", + "marginalise": "marginalize", + "marginalised": "marginalized", + "marginalises": "marginalizes", + "marginalising": "marginalizing", + "marshalled": "marshaled", + "marshalling": "marshaling", + "marvelled": "marveled", + "marvelling": "marveling", + "marvellous": "marvelous", + "marvellously": "marvelously", + "materialisation": "materialization", + "materialise": "materialize", + "materialised": "materialized", + "materialises": "materializes", + "materialising": "materializing", + "maximisation": "maximization", + "maximise": "maximize", + "maximised": "maximized", + "maximises": "maximizes", + "maximising": "maximizing", + "meagre": "meager", + "mechanisation": "mechanization", + "mechanise": "mechanize", + "mechanised": "mechanized", + "mechanises": "mechanizes", + "mechanising": "mechanizing", + "mediaeval": "medieval", + "memorialise": "memorialize", + "memorialised": "memorialized", + "memorialises": "memorializes", + "memorialising": "memorializing", + "memorise": "memorize", + "memorised": "memorized", + "memorises": "memorizes", + "memorising": "memorizing", + "mesmerise": "mesmerize", + "mesmerised": "mesmerized", + "mesmerises": "mesmerizes", + "mesmerising": "mesmerizing", + "metabolise": "metabolize", + "metabolised": "metabolized", + "metabolises": "metabolizes", + "metabolising": "metabolizing", + "metre": "meter", + "metres": "meters", + "mhm": "hmm", + "micrometre": "micrometer", + "micrometres": "micrometers", + "militarise": "militarize", + "militarised": "militarized", + "militarises": "militarizes", + "militarising": "militarizing", + "milligramme": "milligram", + "milligrammes": "milligrams", + "millilitre": "milliliter", + "millilitres": "milliliters", + "millimetre": "millimeter", + "millimetres": "millimeters", + "miniaturisation": "miniaturization", + 
"miniaturise": "miniaturize", + "miniaturised": "miniaturized", + "miniaturises": "miniaturizes", + "miniaturising": "miniaturizing", + "minibusses": "minibuses", + "minimise": "minimize", + "minimised": "minimized", + "minimises": "minimizes", + "minimising": "minimizing", + "misbehaviour": "misbehavior", + "misdemeanour": "misdemeanor", + "misdemeanours": "misdemeanors", + "misspelt": "misspelled", + "mitre": "miter", + "mitres": "miters", + "mm": "hmm", + "mmm": "hmm", + "mobilisation": "mobilization", + "mobilise": "mobilize", + "mobilised": "mobilized", + "mobilises": "mobilizes", + "mobilising": "mobilizing", + "modelled": "modeled", + "modeller": "modeler", + "modellers": "modelers", + "modelling": "modeling", + "modernise": "modernize", + "modernised": "modernized", + "modernises": "modernizes", + "modernising": "modernizing", + "moisturise": "moisturize", + "moisturised": "moisturized", + "moisturiser": "moisturizer", + "moisturisers": "moisturizers", + "moisturises": "moisturizes", + "moisturising": "moisturizing", + "monologue": "monolog", + "monologues": "monologs", + "monopolisation": "monopolization", + "monopolise": "monopolize", + "monopolised": "monopolized", + "monopolises": "monopolizes", + "monopolising": "monopolizing", + "moralise": "moralize", + "moralised": "moralized", + "moralises": "moralizes", + "moralising": "moralizing", + "motorised": "motorized", + "mould": "mold", + "moulded": "molded", + "moulder": "molder", + "mouldered": "moldered", + "mouldering": "moldering", + "moulders": "molders", + "mouldier": "moldier", + "mouldiest": "moldiest", + "moulding": "molding", + "mouldings": "moldings", + "moulds": "molds", + "mouldy": "moldy", + "moult": "molt", + "moulted": "molted", + "moulting": "molting", + "moults": "molts", + "moustache": "mustache", + "moustached": "mustached", + "moustaches": "mustaches", + "moustachioed": "mustachioed", + "multicoloured": "multicolored", + "nationalisation": "nationalization", + "nationalisations": 
"nationalizations", + "nationalise": "nationalize", + "nationalised": "nationalized", + "nationalises": "nationalizes", + "nationalising": "nationalizing", + "naturalisation": "naturalization", + "naturalise": "naturalize", + "naturalised": "naturalized", + "naturalises": "naturalizes", + "naturalising": "naturalizing", + "neighbour": "neighbor", + "neighbourhood": "neighborhood", + "neighbourhoods": "neighborhoods", + "neighbouring": "neighboring", + "neighbourliness": "neighborliness", + "neighbourly": "neighborly", + "neighbours": "neighbors", + "neutralisation": "neutralization", + "neutralise": "neutralize", + "neutralised": "neutralized", + "neutralises": "neutralizes", + "neutralising": "neutralizing", + "normalisation": "normalization", + "normalise": "normalize", + "normalised": "normalized", + "normalises": "normalizes", + "normalising": "normalizing", + "odour": "odor", + "odourless": "odorless", + "odours": "odors", + "oesophagus": "esophagus", + "oesophaguses": "esophaguses", + "oestrogen": "estrogen", + "offence": "offense", + "offences": "offenses", + "omelette": "omelet", + "omelettes": "omelets", + "optimise": "optimize", + "optimised": "optimized", + "optimises": "optimizes", + "optimising": "optimizing", + "organisation": "organization", + "organisational": "organizational", + "organisations": "organizations", + "organise": "organize", + "organised": "organized", + "organiser": "organizer", + "organisers": "organizers", + "organises": "organizes", + "organising": "organizing", + "orthopaedic": "orthopedic", + "orthopaedics": "orthopedics", + "ostracise": "ostracize", + "ostracised": "ostracized", + "ostracises": "ostracizes", + "ostracising": "ostracizing", + "outmanoeuvre": "outmaneuver", + "outmanoeuvred": "outmaneuvered", + "outmanoeuvres": "outmaneuvers", + "outmanoeuvring": "outmaneuvering", + "overemphasise": "overemphasize", + "overemphasised": "overemphasized", + "overemphasises": "overemphasizes", + "overemphasising": "overemphasizing", 
+ "oxidisation": "oxidization", + "oxidise": "oxidize", + "oxidised": "oxidized", + "oxidises": "oxidizes", + "oxidising": "oxidizing", + "paederast": "pederast", + "paederasts": "pederasts", + "paediatric": "pediatric", + "paediatrician": "pediatrician", + "paediatricians": "pediatricians", + "paediatrics": "pediatrics", + "paedophile": "pedophile", + "paedophiles": "pedophiles", + "paedophilia": "pedophilia", + "palaeolithic": "paleolithic", + "palaeontologist": "paleontologist", + "palaeontologists": "paleontologists", + "palaeontology": "paleontology", + "panelled": "paneled", + "panelling": "paneling", + "panellist": "panelist", + "panellists": "panelists", + "paralyse": "paralyze", + "paralysed": "paralyzed", + "paralyses": "paralyzes", + "paralysing": "paralyzing", + "parcelled": "parceled", + "parcelling": "parceling", + "parlour": "parlor", + "parlours": "parlors", + "particularise": "particularize", + "particularised": "particularized", + "particularises": "particularizes", + "particularising": "particularizing", + "passivisation": "passivization", + "passivise": "passivize", + "passivised": "passivized", + "passivises": "passivizes", + "passivising": "passivizing", + "pasteurisation": "pasteurization", + "pasteurise": "pasteurize", + "pasteurised": "pasteurized", + "pasteurises": "pasteurizes", + "pasteurising": "pasteurizing", + "patronise": "patronize", + "patronised": "patronized", + "patronises": "patronizes", + "patronising": "patronizing", + "patronisingly": "patronizingly", + "pedalled": "pedaled", + "pedalling": "pedaling", + "pedestrianisation": "pedestrianization", + "pedestrianise": "pedestrianize", + "pedestrianised": "pedestrianized", + "pedestrianises": "pedestrianizes", + "pedestrianising": "pedestrianizing", + "penalise": "penalize", + "penalised": "penalized", + "penalises": "penalizes", + "penalising": "penalizing", + "pencilled": "penciled", + "pencilling": "penciling", + "personalise": "personalize", + "personalised": "personalized", 
+ "personalises": "personalizes", + "personalising": "personalizing", + "pharmacopoeia": "pharmacopeia", + "pharmacopoeias": "pharmacopeias", + "philosophise": "philosophize", + "philosophised": "philosophized", + "philosophises": "philosophizes", + "philosophising": "philosophizing", + "philtre": "filter", + "philtres": "filters", + "phoney": "phony", + "plagiarise": "plagiarize", + "plagiarised": "plagiarized", + "plagiarises": "plagiarizes", + "plagiarising": "plagiarizing", + "plough": "plow", + "ploughed": "plowed", + "ploughing": "plowing", + "ploughman": "plowman", + "ploughmen": "plowmen", + "ploughs": "plows", + "ploughshare": "plowshare", + "ploughshares": "plowshares", + "polarisation": "polarization", + "polarise": "polarize", + "polarised": "polarized", + "polarises": "polarizes", + "polarising": "polarizing", + "politicisation": "politicization", + "politicise": "politicize", + "politicised": "politicized", + "politicises": "politicizes", + "politicising": "politicizing", + "popularisation": "popularization", + "popularise": "popularize", + "popularised": "popularized", + "popularises": "popularizes", + "popularising": "popularizing", + "pouffe": "pouf", + "pouffes": "poufs", + "practise": "practice", + "practised": "practiced", + "practises": "practices", + "practising": "practicing", + "praesidium": "presidium", + "praesidiums": "presidiums", + "pressurisation": "pressurization", + "pressurise": "pressurize", + "pressurised": "pressurized", + "pressurises": "pressurizes", + "pressurising": "pressurizing", + "pretence": "pretense", + "pretences": "pretenses", + "primaeval": "primeval", + "prioritisation": "prioritization", + "prioritise": "prioritize", + "prioritised": "prioritized", + "prioritises": "prioritizes", + "prioritising": "prioritizing", + "privatisation": "privatization", + "privatisations": "privatizations", + "privatise": "privatize", + "privatised": "privatized", + "privatises": "privatizes", + "privatising": "privatizing", + 
"professionalisation": "professionalization", + "professionalise": "professionalize", + "professionalised": "professionalized", + "professionalises": "professionalizes", + "professionalising": "professionalizing", + "programme": "program", + "programmes": "programs", + "prologue": "prolog", + "prologues": "prologs", + "propagandise": "propagandize", + "propagandised": "propagandized", + "propagandises": "propagandizes", + "propagandising": "propagandizing", + "proselytise": "proselytize", + "proselytised": "proselytized", + "proselytiser": "proselytizer", + "proselytisers": "proselytizers", + "proselytises": "proselytizes", + "proselytising": "proselytizing", + "psychoanalyse": "psychoanalyze", + "psychoanalysed": "psychoanalyzed", + "psychoanalyses": "psychoanalyzes", + "psychoanalysing": "psychoanalyzing", + "publicise": "publicize", + "publicised": "publicized", + "publicises": "publicizes", + "publicising": "publicizing", + "pulverisation": "pulverization", + "pulverise": "pulverize", + "pulverised": "pulverized", + "pulverises": "pulverizes", + "pulverising": "pulverizing", + "pummelled": "pummel", + "pummelling": "pummeled", + "pyjama": "pajama", + "pyjamas": "pajamas", + "pzazz": "pizzazz", + "quarrelled": "quarreled", + "quarrelling": "quarreling", + "radicalise": "radicalize", + "radicalised": "radicalized", + "radicalises": "radicalizes", + "radicalising": "radicalizing", + "rancour": "rancor", + "randomise": "randomize", + "randomised": "randomized", + "randomises": "randomizes", + "randomising": "randomizing", + "rationalisation": "rationalization", + "rationalisations": "rationalizations", + "rationalise": "rationalize", + "rationalised": "rationalized", + "rationalises": "rationalizes", + "rationalising": "rationalizing", + "ravelled": "raveled", + "ravelling": "raveling", + "realisable": "realizable", + "realisation": "realization", + "realisations": "realizations", + "realise": "realize", + "realised": "realized", + "realises": "realizes", + 
"realising": "realizing", + "recognisable": "recognizable", + "recognisably": "recognizably", + "recognisance": "recognizance", + "recognise": "recognize", + "recognised": "recognized", + "recognises": "recognizes", + "recognising": "recognizing", + "reconnoitre": "reconnoiter", + "reconnoitred": "reconnoitered", + "reconnoitres": "reconnoiters", + "reconnoitring": "reconnoitering", + "refuelled": "refueled", + "refuelling": "refueling", + "regularisation": "regularization", + "regularise": "regularize", + "regularised": "regularized", + "regularises": "regularizes", + "regularising": "regularizing", + "remodelled": "remodeled", + "remodelling": "remodeling", + "remould": "remold", + "remoulded": "remolded", + "remoulding": "remolding", + "remoulds": "remolds", + "reorganisation": "reorganization", + "reorganisations": "reorganizations", + "reorganise": "reorganize", + "reorganised": "reorganized", + "reorganises": "reorganizes", + "reorganising": "reorganizing", + "revelled": "reveled", + "reveller": "reveler", + "revellers": "revelers", + "revelling": "reveling", + "revitalise": "revitalize", + "revitalised": "revitalized", + "revitalises": "revitalizes", + "revitalising": "revitalizing", + "revolutionise": "revolutionize", + "revolutionised": "revolutionized", + "revolutionises": "revolutionizes", + "revolutionising": "revolutionizing", + "rhapsodise": "rhapsodize", + "rhapsodised": "rhapsodized", + "rhapsodises": "rhapsodizes", + "rhapsodising": "rhapsodizing", + "rigour": "rigor", + "rigours": "rigors", + "ritualised": "ritualized", + "rivalled": "rivaled", + "rivalling": "rivaling", + "romanticise": "romanticize", + "romanticised": "romanticized", + "romanticises": "romanticizes", + "romanticising": "romanticizing", + "rumour": "rumor", + "rumoured": "rumored", + "rumours": "rumors", + "sabre": "saber", + "sabres": "sabers", + "saltpetre": "saltpeter", + "sanitise": "sanitize", + "sanitised": "sanitized", + "sanitises": "sanitizes", + "sanitising": 
"sanitizing", + "satirise": "satirize", + "satirised": "satirized", + "satirises": "satirizes", + "satirising": "satirizing", + "saviour": "savior", + "saviours": "saviors", + "savour": "savor", + "savoured": "savored", + "savouries": "savories", + "savouring": "savoring", + "savours": "savors", + "savoury": "savory", + "scandalise": "scandalize", + "scandalised": "scandalized", + "scandalises": "scandalizes", + "scandalising": "scandalizing", + "sceptic": "skeptic", + "sceptical": "skeptical", + "sceptically": "skeptically", + "scepticism": "skepticism", + "sceptics": "skeptics", + "sceptre": "scepter", + "sceptres": "scepters", + "scrutinise": "scrutinize", + "scrutinised": "scrutinized", + "scrutinises": "scrutinizes", + "scrutinising": "scrutinizing", + "secularisation": "secularization", + "secularise": "secularize", + "secularised": "secularized", + "secularises": "secularizes", + "secularising": "secularizing", + "sensationalise": "sensationalize", + "sensationalised": "sensationalized", + "sensationalises": "sensationalizes", + "sensationalising": "sensationalizing", + "sensitise": "sensitize", + "sensitised": "sensitized", + "sensitises": "sensitizes", + "sensitising": "sensitizing", + "sentimentalise": "sentimentalize", + "sentimentalised": "sentimentalized", + "sentimentalises": "sentimentalizes", + "sentimentalising": "sentimentalizing", + "sepulchre": "sepulcher", + "sepulchres": "sepulchers", + "serialisation": "serialization", + "serialisations": "serializations", + "serialise": "serialize", + "serialised": "serialized", + "serialises": "serializes", + "serialising": "serializing", + "sermonise": "sermonize", + "sermonised": "sermonized", + "sermonises": "sermonizes", + "sermonising": "sermonizing", + "sheikh": "sheik", + "shovelled": "shoveled", + "shovelling": "shoveling", + "shrivelled": "shriveled", + "shrivelling": "shriveling", + "signalise": "signalize", + "signalised": "signalized", + "signalises": "signalizes", + "signalising": 
"signalizing", + "signalled": "signaled", + "signalling": "signaling", + "smoulder": "smolder", + "smouldered": "smoldered", + "smouldering": "smoldering", + "smoulders": "smolders", + "snivelled": "sniveled", + "snivelling": "sniveling", + "snorkelled": "snorkeled", + "snorkelling": "snorkeling", + "snowplough": "snowplow", + "snowploughs": "snowplow", + "socialisation": "socialization", + "socialise": "socialize", + "socialised": "socialized", + "socialises": "socializes", + "socialising": "socializing", + "sodomise": "sodomize", + "sodomised": "sodomized", + "sodomises": "sodomizes", + "sodomising": "sodomizing", + "solemnise": "solemnize", + "solemnised": "solemnized", + "solemnises": "solemnizes", + "solemnising": "solemnizing", + "sombre": "somber", + "specialisation": "specialization", + "specialisations": "specializations", + "specialise": "specialize", + "specialised": "specialized", + "specialises": "specializes", + "specialising": "specializing", + "spectre": "specter", + "spectres": "specters", + "spiralled": "spiraled", + "spiralling": "spiraling", + "splendour": "splendor", + "splendours": "splendors", + "squirrelled": "squirreled", + "squirrelling": "squirreling", + "stabilisation": "stabilization", + "stabilise": "stabilize", + "stabilised": "stabilized", + "stabiliser": "stabilizer", + "stabilisers": "stabilizers", + "stabilises": "stabilizes", + "stabilising": "stabilizing", + "standardisation": "standardization", + "standardise": "standardize", + "standardised": "standardized", + "standardises": "standardizes", + "standardising": "standardizing", + "stencilled": "stenciled", + "stencilling": "stenciling", + "sterilisation": "sterilization", + "sterilisations": "sterilizations", + "sterilise": "sterilize", + "sterilised": "sterilized", + "steriliser": "sterilizer", + "sterilisers": "sterilizers", + "sterilises": "sterilizes", + "sterilising": "sterilizing", + "stigmatisation": "stigmatization", + "stigmatise": "stigmatize", + "stigmatised": 
"stigmatized", + "stigmatises": "stigmatizes", + "stigmatising": "stigmatizing", + "storey": "story", + "storeys": "stories", + "subsidisation": "subsidization", + "subsidise": "subsidize", + "subsidised": "subsidized", + "subsidiser": "subsidizer", + "subsidisers": "subsidizers", + "subsidises": "subsidizes", + "subsidising": "subsidizing", + "succour": "succor", + "succoured": "succored", + "succouring": "succoring", + "succours": "succors", + "sulphate": "sulfate", + "sulphates": "sulfates", + "sulphide": "sulfide", + "sulphides": "sulfides", + "sulphur": "sulfur", + "sulphurous": "sulfurous", + "summarise": "summarize", + "summarised": "summarized", + "summarises": "summarizes", + "summarising": "summarizing", + "swivelled": "swiveled", + "swivelling": "swiveling", + "symbolise": "symbolize", + "symbolised": "symbolized", + "symbolises": "symbolizes", + "symbolising": "symbolizing", + "sympathise": "sympathize", + "sympathised": "sympathized", + "sympathiser": "sympathizer", + "sympathisers": "sympathizers", + "sympathises": "sympathizes", + "sympathising": "sympathizing", + "synchronisation": "synchronization", + "synchronise": "synchronize", + "synchronised": "synchronized", + "synchronises": "synchronizes", + "synchronising": "synchronizing", + "synthesise": "synthesize", + "synthesised": "synthesized", + "synthesiser": "synthesizer", + "synthesisers": "synthesizers", + "synthesises": "synthesizes", + "synthesising": "synthesizing", + "syphon": "siphon", + "syphoned": "siphoned", + "syphoning": "siphoning", + "syphons": "siphons", + "systematisation": "systematization", + "systematise": "systematize", + "systematised": "systematized", + "systematises": "systematizes", + "systematising": "systematizing", + "tantalise": "tantalize", + "tantalised": "tantalized", + "tantalises": "tantalizes", + "tantalising": "tantalizing", + "tantalisingly": "tantalizingly", + "tasselled": "tasseled", + "technicolour": "technicolor", + "temporise": "temporize", + "temporised": 
"temporized", + "temporises": "temporizes", + "temporising": "temporizing", + "tenderise": "tenderize", + "tenderised": "tenderized", + "tenderises": "tenderizes", + "tenderising": "tenderizing", + "terrorise": "terrorize", + "terrorised": "terrorized", + "terrorises": "terrorizes", + "terrorising": "terrorizing", + "theatre": "theater", + "theatregoer": "theatergoer", + "theatregoers": "theatergoers", + "theatres": "theaters", + "theorise": "theorize", + "theorised": "theorized", + "theorises": "theorizes", + "theorising": "theorizing", + "tonne": "ton", + "tonnes": "tons", + "towelled": "toweled", + "towelling": "toweling", + "toxaemia": "toxemia", + "tranquillise": "tranquilize", + "tranquillised": "tranquilized", + "tranquilliser": "tranquilizer", + "tranquillisers": "tranquilizers", + "tranquillises": "tranquilizes", + "tranquillising": "tranquilizing", + "tranquillity": "tranquility", + "tranquillize": "tranquilize", + "tranquillized": "tranquilized", + "tranquillizer": "tranquilizer", + "tranquillizers": "tranquilizers", + "tranquillizes": "tranquilizes", + "tranquillizing": "tranquilizing", + "tranquilly": "tranquility", + "transistorised": "transistorized", + "traumatise": "traumatize", + "traumatised": "traumatized", + "traumatises": "traumatizes", + "traumatising": "traumatizing", + "travelled": "traveled", + "traveller": "traveler", + "travellers": "travelers", + "travelling": "traveling", + "travelog": "travelogue", + "travelogs": "travelogues", + "trialled": "trialed", + "trialling": "trialing", + "tricolour": "tricolor", + "tricolours": "tricolors", + "trivialise": "trivialize", + "trivialised": "trivialized", + "trivialises": "trivializes", + "trivialising": "trivializing", + "tumour": "tumor", + "tumours": "tumors", + "tunnelled": "tunneled", + "tunnelling": "tunneling", + "tyrannise": "tyrannize", + "tyrannised": "tyrannized", + "tyrannises": "tyrannizes", + "tyrannising": "tyrannizing", + "tyre": "tire", + "tyres": "tires", + "unauthorised": 
"unauthorized", + "uncivilised": "uncivilized", + "underutilised": "underutilized", + "unequalled": "unequaled", + "unfavourable": "unfavorable", + "unfavourably": "unfavorably", + "unionisation": "unionization", + "unionise": "unionize", + "unionised": "unionized", + "unionises": "unionizes", + "unionising": "unionizing", + "unorganised": "unorganized", + "unravelled": "unraveled", + "unravelling": "unraveling", + "unrecognisable": "unrecognizable", + "unrecognised": "unrecognized", + "unrivalled": "unrivaled", + "unsavoury": "unsavory", + "untrammelled": "untrammeled", + "urbanisation": "urbanization", + "urbanise": "urbanize", + "urbanised": "urbanized", + "urbanises": "urbanizes", + "urbanising": "urbanizing", + "utilisable": "utilizable", + "utilisation": "utilization", + "utilise": "utilize", + "utilised": "utilized", + "utilises": "utilizes", + "utilising": "utilizing", + "valour": "valor", + "vandalise": "vandalize", + "vandalised": "vandalized", + "vandalises": "vandalizes", + "vandalising": "vandalizing", + "vaporisation": "vaporization", + "vaporise": "vaporize", + "vaporised": "vaporized", + "vaporises": "vaporizes", + "vaporising": "vaporizing", + "vapour": "vapor", + "vapours": "vapors", + "verbalise": "verbalize", + "verbalised": "verbalized", + "verbalises": "verbalizes", + "verbalising": "verbalizing", + "victimisation": "victimization", + "victimise": "victimize", + "victimised": "victimized", + "victimises": "victimizes", + "victimising": "victimizing", + "videodisc": "videodisk", + "videodiscs": "videodisks", + "vigour": "vigor", + "visualisation": "visualization", + "visualisations": "visualizations", + "visualise": "visualize", + "visualised": "visualized", + "visualises": "visualizes", + "visualising": "visualizing", + "vocalisation": "vocalization", + "vocalisations": "vocalizations", + "vocalise": "vocalize", + "vocalised": "vocalized", + "vocalises": "vocalizes", + "vocalising": "vocalizing", + "vulcanised": "vulcanized", + 
"vulgarisation": "vulgarization", + "vulgarise": "vulgarize", + "vulgarised": "vulgarized", + "vulgarises": "vulgarizes", + "vulgarising": "vulgarizing", + "waggon": "wagon", + "waggons": "wagons", + "watercolour": "watercolor", + "watercolours": "watercolors", + "weaselled": "weaseled", + "weaselling": "weaseling", + "westernisation": "westernization", + "westernise": "westernize", + "westernised": "westernized", + "westernises": "westernizes", + "westernising": "westernizing", + "womanise": "womanize", + "womanised": "womanized", + "womaniser": "womanizer", + "womanisers": "womanizers", + "womanises": "womanizes", + "womanising": "womanizing", + "woollen": "woolen", + "woollens": "woolens", + "woollies": "woolies", + "woolly": "wooly", + "worshipped": "worshiped", + "worshipper": "worshiper", + "worshipping": "worshiping", + "yodelled": "yodeled", + "yodelling": "yodeling", + "yoghourt": "yogurt", + "yoghourts": "yogurts", + "yoghurt": "yogurt", + "yoghurts": "yogurts", +] diff --git a/Tests/WhisperKitTests/Evaluate/WERUtils.swift b/Tests/WhisperKitTests/Evaluate/WERUtils.swift new file mode 100644 index 0000000..6e20e98 --- /dev/null +++ b/Tests/WhisperKitTests/Evaluate/WERUtils.swift @@ -0,0 +1,126 @@ +// For licensing see accompanying LICENSE.md file. +// Copyright © 2024 Argmax, Inc. All rights reserved. + +import Foundation + +/// Return the operations needed to transform s1 into s2 using Wagner-Fischer algo. 
+/// "i" = insertion, "d" = deletion, "r" = replacement +enum EditOp: UInt8 { + case blank + case replace + case delete + case insert +} + +enum WERUtils { + static func wordsToChars(reference: [[String]], hypothesis: [[String]]) -> ([String], [String]) { + // tokenize each word into an integer + let vocabulary = Set((reference + hypothesis).flatMap { $0 }) + let word2char = Dictionary(uniqueKeysWithValues: vocabulary.enumerated().map { index, value in + (value, index) + }) + + let referenceCharsEfficient = reference.map { sentence in + String(sentence.lazy.compactMap { word in + if let charCode = word2char[word], let unicodeScalar = UnicodeScalar(charCode) { + return Character(unicodeScalar) + } + return nil + }) + } + + let hypothesisCharsEfficient = hypothesis.map { sentence in + String(sentence.lazy.compactMap { word in + if let charCode = word2char[word], let unicodeScalar = UnicodeScalar(charCode) { + return Character(unicodeScalar) + } + return nil + }) + } + + return (referenceCharsEfficient, hypothesisCharsEfficient) + } + + static func processWords(reference: [String], hypothesis: [String]) -> (Double, [[String?]]) { + var refTransformed = NormalizationUtils.removeMultipleSpaces(sentences: reference) + refTransformed = NormalizationUtils.strip(sentences: refTransformed) + let refTransformedReduced = NormalizationUtils.reduceToListOfListOfWordsWithSpaces(sentences: refTransformed) + + var hypTransformed = NormalizationUtils.removeMultipleSpaces(sentences: hypothesis) + hypTransformed = NormalizationUtils.strip(sentences: hypTransformed) + let hypTransformedReduced = NormalizationUtils.reduceToListOfListOfWordsWithSpaces(sentences: hypTransformed) + + let (refAsChars, hypAsChars) = WERUtils.wordsToChars(reference: refTransformedReduced, hypothesis: hypTransformedReduced) + + let refArrays = refAsChars.map { Array($0.unicodeScalars) } + let hypArrays = hypAsChars.map { Array($0.unicodeScalars) } + + var (numHits, numSubstitutions, numDeletions, 
numInsertions) = (0, 0, 0, 0) + var (numRfWords, numHypWords) = (0, 0) + var diffResult: [[String?]] = [] + + for (referenceSentence, hypothesisSentence) in zip(refArrays, hypArrays) { + let editOps = levenshtein(referenceSentence, hypothesisSentence) + + // count the number of edits of each type + var substitutions = 0 + var deletions = 0 + var insertions = 0 + + var referenceIndex = 0 + var hypothesisIndex = 0 + for op in editOps { + switch op { + case .replace: + diffResult.append([String(refTransformedReduced[0][referenceIndex]), "-"]) + diffResult.append([String(hypTransformedReduced[0][hypothesisIndex]), "+"]) + substitutions += 1 + referenceIndex += 1 + hypothesisIndex += 1 + case .delete: + diffResult.append([String(refTransformedReduced[0][referenceIndex]), "-"]) + deletions += 1 + referenceIndex += 1 + case .insert: + diffResult.append([String(hypTransformedReduced[0][hypothesisIndex]), "+"]) + insertions += 1 + hypothesisIndex += 1 + case .blank: + diffResult.append([String(refTransformedReduced[0][referenceIndex]), nil]) + referenceIndex += 1 + hypothesisIndex += 1 + } + } + + let hits: Int = referenceSentence.count - (substitutions + deletions) + + numHits += hits + numSubstitutions += substitutions + numDeletions += deletions + numInsertions += insertions + numRfWords += referenceSentence.count + numHypWords += hypothesisSentence.count + } + + let wer = Double(numSubstitutions + numDeletions + numInsertions) / Double(numHits + numSubstitutions + numDeletions) + + return (wer, diffResult) + } + + static func evaluate(originalTranscript: String, generatedTranscript: String, normalizeOriginal: Bool = true) -> (wer: Double, diff: [[String?]]) { + let normalizer = EnglishTextNormalizer() + let reference = normalizeOriginal ? 
normalizer.normalize(text: originalTranscript) : originalTranscript + let hypothesis = normalizer.normalize(text: generatedTranscript) + + let (wer, diff) = WERUtils.processWords( + reference: [reference], + hypothesis: [hypothesis] + ) + return (wer, diff) + } + + static func processDiff(originalTranscript: String, generatedTranscript: String) -> [[String?]] { + let (_, diff) = evaluate(originalTranscript: originalTranscript, generatedTranscript: generatedTranscript) + return diff + } +} diff --git a/Tests/WhisperKitTests/FunctionalTests.swift b/Tests/WhisperKitTests/FunctionalTests.swift index e1d74fa..2438364 100644 --- a/Tests/WhisperKitTests/FunctionalTests.swift +++ b/Tests/WhisperKitTests/FunctionalTests.swift @@ -22,7 +22,7 @@ final class FunctionalTests: XCTestCase { measureOptions.iterationCount = 5 let audioFilePath = try XCTUnwrap( - Bundle.module.path(forResource: "jfk", ofType: "wav"), + Bundle.current.path(forResource: "jfk", ofType: "wav"), "Audio file not found" ) @@ -49,7 +49,7 @@ final class FunctionalTests: XCTestCase { measureOptions.iterationCount = 5 let audioFilePath = try XCTUnwrap( - Bundle.module.path(forResource: "jfk", ofType: "wav"), + Bundle.current.path(forResource: "jfk", ofType: "wav"), "Audio file not found" ) @@ -68,7 +68,7 @@ final class FunctionalTests: XCTestCase { func testBaseImplementation() throws { let audioFilePath = try XCTUnwrap( - Bundle.module.path(forResource: "jfk", ofType: "wav"), + Bundle.current.path(forResource: "jfk", ofType: "wav"), "Audio file not found" ) @@ -86,7 +86,7 @@ final class FunctionalTests: XCTestCase { func testAsyncImplementation() async throws { let audioFilePath = try XCTUnwrap( - Bundle.module.path(forResource: "jfk", ofType: "wav"), + Bundle.current.path(forResource: "jfk", ofType: "wav"), "Audio file not found" ) let whisperKit = try await WhisperKit(WhisperKitConfig(model: "large-v3")) @@ -98,15 +98,15 @@ final class FunctionalTests: XCTestCase { func testBatchTranscribeAudioPaths() async 
throws { let audioPaths = try [ XCTUnwrap( - Bundle.module.path(forResource: "jfk", ofType: "wav"), + Bundle.current.path(forResource: "jfk", ofType: "wav"), "Audio file not found" ), XCTUnwrap( - Bundle.module.path(forResource: "es_test_clip", ofType: "wav"), + Bundle.current.path(forResource: "es_test_clip", ofType: "wav"), "Audio file not found" ), XCTUnwrap( - Bundle.module.path(forResource: "ja_test_clip", ofType: "wav"), + Bundle.current.path(forResource: "ja_test_clip", ofType: "wav"), "Audio file not found" ), ] @@ -133,7 +133,7 @@ final class FunctionalTests: XCTestCase { let audioPaths = try [ "/path/to/file1.wav", XCTUnwrap( - Bundle.module.path(forResource: "jfk", ofType: "wav"), + Bundle.current.path(forResource: "jfk", ofType: "wav"), "Audio file not found" ), "/path/to/file2.wav", @@ -159,15 +159,15 @@ final class FunctionalTests: XCTestCase { func testBatchTranscribeAudioArrays() async throws { let audioPaths = try [ XCTUnwrap( - Bundle.module.path(forResource: "jfk", ofType: "wav"), + Bundle.current.path(forResource: "jfk", ofType: "wav"), "Audio file not found" ), XCTUnwrap( - Bundle.module.path(forResource: "es_test_clip", ofType: "wav"), + Bundle.current.path(forResource: "es_test_clip", ofType: "wav"), "Audio file not found" ), XCTUnwrap( - Bundle.module.path(forResource: "ja_test_clip", ofType: "wav"), + Bundle.current.path(forResource: "ja_test_clip", ofType: "wav"), "Audio file not found" ), ] @@ -196,7 +196,7 @@ final class FunctionalTests: XCTestCase { func testModelSearchPathLarge() async throws { let audioFilePath = try XCTUnwrap( - Bundle.module.path(forResource: "jfk", ofType: "wav"), + Bundle.current.path(forResource: "jfk", ofType: "wav"), "Audio file not found" ) diff --git a/Tests/WhisperKitTests/MemoryTestUtils.swift b/Tests/WhisperKitTests/MemoryTestUtils.swift deleted file mode 100644 index 6a6f403..0000000 --- a/Tests/WhisperKitTests/MemoryTestUtils.swift +++ /dev/null @@ -1,182 +0,0 @@ -import Foundation -import WhisperKit - 
-// MARK: RegressionStats - -class RegressionStats: JSONCodable { - let testInfo: TestInfo - let memoryStats: MemoryStats - let latencyStats: LatencyStats - - init(testInfo: TestInfo, memoryStats: MemoryStats, latencyStats: LatencyStats) { - self.testInfo = testInfo - self.memoryStats = memoryStats - self.latencyStats = latencyStats - } - - func jsonData() throws -> Data { - return try JSONEncoder().encode(self) - } -} - -// MARK: TestInfo - -class TestInfo: JSONCodable { - let device, audioFile: String - let model: String - let date: String - let timeElapsedInSeconds: TimeInterval - let timings: TranscriptionTimings? - let transcript: String? - - init(device: String, audioFile: String, model: String, date: String, timeElapsedInSeconds: TimeInterval, timings: TranscriptionTimings?, transcript: String?) { - self.device = device - self.audioFile = audioFile - self.model = model - self.date = date - self.timeElapsedInSeconds = timeElapsedInSeconds - self.timings = timings - self.transcript = transcript - } -} - -// MARK: TestReport - -struct TestReport: JSONCodable { - let device: String - let modelsTested: [String] - let failureInfo: [String: String] - - init(device: String, modelsTested: [String], failureInfo: [String: String]) { - self.device = device - self.modelsTested = modelsTested - self.failureInfo = failureInfo - } -} - -// MARK: Stats - -class Stats: JSONCodable { - var measurements: [Measurement] - let units: String - var totalNumberOfMeasurements: Int - - init(measurements: [Measurement], units: String, totalNumberOfMeasurements: Int) { - self.measurements = measurements - self.units = units - self.totalNumberOfMeasurements = totalNumberOfMeasurements - } - - func measure(from values: [Float], timeElapsed: TimeInterval) { - var measurement: Measurement - if let min = values.min(), let max = values.max() { - measurement = Measurement( - min: min, - max: max, - average: values.reduce(0,+) / Float(values.count), - numberOfMeasurements: values.count, - 
timeElapsed: timeElapsed - ) - self.measurements.append(measurement) - self.totalNumberOfMeasurements += values.count - } - } -} - -// MARK: LatencyStats - -class LatencyStats: Stats { - override init(measurements: [Measurement] = [], units: String, totalNumberOfMeasurements: Int = 0) { - super.init(measurements: measurements, units: units, totalNumberOfMeasurements: totalNumberOfMeasurements) - } - - required init(from decoder: any Decoder) throws { - fatalError("init(from:) has not been implemented") - } - - func calculate(from total: Double, runs: Int) -> Double { - return runs > 0 ? total / Double(runs) : -1 - } -} - -class MemoryStats: Stats { - var preTranscribeMemory: Float - var postTranscribeMemory: Float - - init(measurements: [Measurement] = [], units: String, totalNumberOfMeasurements: Int = 0, preTranscribeMemory: Float, postTranscribeMemory: Float) { - self.preTranscribeMemory = preTranscribeMemory - self.postTranscribeMemory = postTranscribeMemory - super.init(measurements: measurements, units: units, totalNumberOfMeasurements: totalNumberOfMeasurements) - } - - required init(from decoder: any Decoder) throws { - fatalError("init(from:) has not been implemented") - } - - /// Implement the encode(to:) method - override func encode(to encoder: Encoder) throws { - var container = encoder.container(keyedBy: CodingKeys.self) - try super.encode(to: encoder) - try container.encode(preTranscribeMemory, forKey: .preTranscribeMemory) - try container.encode(postTranscribeMemory, forKey: .postTranscribeMemory) - } - - /// Coding keys for MemoryStats properties - enum CodingKeys: String, CodingKey { - case preTranscribeMemory - case postTranscribeMemory - } -} - -struct Measurement: JSONCodable { - let min, max, average: Float - let numberOfMeasurements: Int - let timeElapsed: TimeInterval -} - -protocol JSONCodable: Codable {} - -extension JSONCodable { - func jsonData() throws -> Data { - return try JSONEncoder().encode(self) - } -} - -extension Data { - var 
prettyPrintedJSONString: NSString? { // NSString gives us a nice sanitized debugDescription - guard let object = try? JSONSerialization.jsonObject(with: self, options: []), - let data = try? JSONSerialization.data(withJSONObject: object, options: [.prettyPrinted, .sortedKeys]), - let prettyPrintedString = NSString(data: data, encoding: String.Encoding.utf8.rawValue) else { return nil } - - return prettyPrintedString - } -} - -// MARK: - SystemMemoryChecker - -@available(macOS 13, iOS 16, watchOS 10, visionOS 1, *) -class SystemMemoryChecker: NSObject { - static func getMemoryUsed() -> UInt64 { - // The `TASK_VM_INFO_COUNT` and `TASK_VM_INFO_REV1_COUNT` macros are too - // complex for the Swift C importer, so we have to define them ourselves. - let TASK_VM_INFO_COUNT = mach_msg_type_number_t(MemoryLayout<task_vm_info_data_t>.size / MemoryLayout<integer_t>.size) - guard let offset = MemoryLayout.offset(of: \task_vm_info_data_t.min_address) else { return 0 } - let TASK_VM_INFO_REV1_COUNT = mach_msg_type_number_t(offset / MemoryLayout<integer_t>.size) - var info = task_vm_info_data_t() - var count = TASK_VM_INFO_COUNT - let kr = withUnsafeMutablePointer(to: &info) { infoPtr in - infoPtr.withMemoryRebound(to: integer_t.self, capacity: Int(count)) { intPtr in - task_info(mach_task_self_, task_flavor_t(TASK_VM_INFO), intPtr, &count) - } - } - guard - kr == KERN_SUCCESS, - count >= TASK_VM_INFO_REV1_COUNT - else { return 0 } - - let usedBytes = Float(info.phys_footprint) - let usedBytesInt = UInt64(usedBytes) - let usedMB = usedBytesInt / 1024 / 1024 - return usedMB - } -} diff --git a/Tests/WhisperKitTests/RegressionTestUtils.swift b/Tests/WhisperKitTests/RegressionTestUtils.swift new file mode 100644 index 0000000..07a25db --- /dev/null +++ b/Tests/WhisperKitTests/RegressionTestUtils.swift @@ -0,0 +1,568 @@ +// For licensing see accompanying LICENSE.md file. +// Copyright © 2024 Argmax, Inc. All rights reserved. 
+ +import CoreML +import Foundation +import MachO +import WhisperKit + +#if canImport(UIKit) +import UIKit +#endif + +#if canImport(IOKit) +import IOKit.ps +#endif + +#if os(watchOS) +import WatchKit +#endif + +// MARK: RegressionStats + +@available(macOS 13, iOS 16, watchOS 10, visionOS 1, *) +class RegressionStats: JSONCodable { + let testInfo: TestInfo + let memoryStats: MemoryStats + let latencyStats: LatencyStats + let staticAttributes: StaticAttributes + let systemMeasurements: SystemMeasurements + + init(testInfo: TestInfo, + memoryStats: MemoryStats, + latencyStats: LatencyStats, + staticAttributes: StaticAttributes, + systemMeasurements: SystemMeasurements) + { + self.testInfo = testInfo + self.memoryStats = memoryStats + self.latencyStats = latencyStats + self.staticAttributes = staticAttributes + self.systemMeasurements = systemMeasurements + } + + func jsonData() throws -> Data { + return try JSONEncoder().encode(self) + } +} + +// MARK: TestInfo + +class TestInfo: JSONCodable { + let device: String + let audioFile: String + let datasetDir: String + let datasetRepo: String + let model: String + let modelSizeMB: Double + let date: String + let timeElapsedInSeconds: TimeInterval + let timings: TranscriptionTimings? + let prediction: String? + let reference: String? 
+ let wer: Double + let diff: [[String?]] + + init( + device: String, + audioFile: String, + datasetDir: String, + datasetRepo: String, + model: String, + modelSizeMB: Double, + date: String, + timeElapsedInSeconds: TimeInterval, + timings: TranscriptionTimings?, + prediction: String?, + reference: String?, + wer: Double, + diff: [[String?]] + ) { + self.device = device + self.audioFile = audioFile + self.datasetDir = datasetDir + self.datasetRepo = datasetRepo + self.model = model + self.modelSizeMB = modelSizeMB + self.date = date + self.timeElapsedInSeconds = timeElapsedInSeconds + self.timings = timings + self.prediction = prediction + self.reference = reference + self.wer = wer + self.diff = diff + } +} + +// MARK: TestReport + +struct TestReport: JSONCodable { + let deviceModel: String + let osType: String + let osVersion: String + let modelsTested: [String] + let failureInfo: [String: String] + let attachments: [String: String] + + init( + deviceModel: String, + osType: String, + osVersion: String, + modelsTested: [String], + failureInfo: [String: String], + attachments: [String: String] + ) { + self.deviceModel = deviceModel + self.osType = osType + self.osVersion = osVersion + self.modelsTested = modelsTested + self.failureInfo = failureInfo + self.attachments = attachments + } +} + +// MARK: Stats + +class Stats: JSONCodable { + var measurements: [TestMeasurement] + let units: String + var totalNumberOfMeasurements: Int + + init(measurements: [TestMeasurement], units: String, totalNumberOfMeasurements: Int) { + self.measurements = measurements + self.units = units + self.totalNumberOfMeasurements = totalNumberOfMeasurements + } + + func measure(from values: [Float], tokenCount: Int, timeElapsed: TimeInterval) { + var measurement: TestMeasurement + if let min = values.min(), let max = values.max() { + measurement = TestMeasurement( + min: min, + max: max, + average: values.reduce(0,+) / Float(values.count), + numberOfMeasurements: values.count, + 
cumulativeTokens: tokenCount, + timeElapsed: timeElapsed + ) + self.measurements.append(measurement) + self.totalNumberOfMeasurements += values.count + } + } +} + +// MARK: StaticAttributes + +@available(macOS 13, iOS 16, watchOS 10, visionOS 1, *) +class StaticAttributes: Codable { + let osVersion: String + let isLowPowerMode: String + let encoderCompute: String + let decoderCompute: String + let decodingOptions: DecodingOptions + + init(encoderCompute: MLComputeUnits, decoderCompute: MLComputeUnits, decodingOptions: DecodingOptions) { + let version = ProcessInfo.processInfo.operatingSystemVersion + self.osVersion = "\(version.majorVersion).\(version.minorVersion).\(version.patchVersion)" + self.isLowPowerMode = ProcessInfo.processInfo.isLowPowerModeEnabled ? "Enabled" : "Disabled" + self.encoderCompute = encoderCompute.stringValue + self.decoderCompute = decoderCompute.stringValue + self.decodingOptions = decodingOptions + } +} + +class SystemMeasurements: Codable { + let systemMemory: [SystemMemoryUsage] + let diskSpace: [DiskSpace] + let batteryLevel: [Float] + let thermalState: [Int] + let timeElapsed: [TimeInterval] + + init(systemMemory: [SystemMemoryUsage], diskSpace: [DiskSpace], batteryLevel: [Float], thermalState: [Int], timeElapsed: [TimeInterval]) { + self.systemMemory = systemMemory + self.diskSpace = diskSpace + self.batteryLevel = batteryLevel + self.thermalState = thermalState + self.timeElapsed = timeElapsed + } +} + +// MARK: LatencyStats + +class LatencyStats: Stats { + override init(measurements: [TestMeasurement] = [], units: String, totalNumberOfMeasurements: Int = 0) { + super.init(measurements: measurements, units: units, totalNumberOfMeasurements: totalNumberOfMeasurements) + } + + required init(from decoder: any Decoder) throws { + fatalError("init(from:) has not been implemented") + } + + func calculate(from total: Double, runs: Int) -> Double { + return runs > 0 ? 
total / Double(runs) : -1 + } +} + +class MemoryStats: Stats { + var preTranscribeMemory: Float + var postTranscribeMemory: Float + + init(measurements: [TestMeasurement] = [], units: String, totalNumberOfMeasurements: Int = 0, preTranscribeMemory: Float, postTranscribeMemory: Float) { + self.preTranscribeMemory = preTranscribeMemory + self.postTranscribeMemory = postTranscribeMemory + super.init(measurements: measurements, units: units, totalNumberOfMeasurements: totalNumberOfMeasurements) + } + + required init(from decoder: any Decoder) throws { + fatalError("init(from:) has not been implemented") + } + + /// Implement the encode(to:) method + override func encode(to encoder: Encoder) throws { + var container = encoder.container(keyedBy: CodingKeys.self) + try super.encode(to: encoder) + try container.encode(preTranscribeMemory, forKey: .preTranscribeMemory) + try container.encode(postTranscribeMemory, forKey: .postTranscribeMemory) + } + + /// Coding keys for MemoryStats properties + enum CodingKeys: String, CodingKey { + case preTranscribeMemory + case postTranscribeMemory + } +} + +struct TestMeasurement: JSONCodable { + let min, max, average: Float + let numberOfMeasurements: Int + let cumulativeTokens: Int + let timeElapsed: TimeInterval +} + +protocol JSONCodable: Codable {} + +extension JSONCodable { + func jsonData() throws -> Data { + return try JSONEncoder().encode(self) + } +} + +extension Data { + var prettyPrintedJSONString: NSString? { // NSString gives us a nice sanitized debugDescription + guard let object = try? JSONSerialization.jsonObject(with: self, options: []), + let data = try? 
JSONSerialization.data(withJSONObject: object, options: [.prettyPrinted, .sortedKeys]), + let prettyPrintedString = NSString(data: data, encoding: String.Encoding.utf8.rawValue) else { return nil } + + return prettyPrintedString + } +} + +// MARK: - SystemMemoryChecker + +@available(macOS 13, iOS 16, watchOS 10, visionOS 1, *) +class AppMemoryChecker: NSObject { + static func getMemoryUsed() -> UInt64 { + // The `TASK_VM_INFO_COUNT` and `TASK_VM_INFO_REV1_COUNT` macros are too + // complex for the Swift C importer, so we have to define them ourselves. + let TASK_VM_INFO_COUNT = mach_msg_type_number_t(MemoryLayout<task_vm_info_data_t>.size / MemoryLayout<integer_t>.size) + guard let offset = MemoryLayout.offset(of: \task_vm_info_data_t.min_address) else { return 0 } + let TASK_VM_INFO_REV1_COUNT = mach_msg_type_number_t(offset / MemoryLayout<integer_t>.size) + var info = task_vm_info_data_t() + var count = TASK_VM_INFO_COUNT + let kr = withUnsafeMutablePointer(to: &info) { infoPtr in + infoPtr.withMemoryRebound(to: integer_t.self, capacity: Int(count)) { intPtr in + task_info(mach_task_self_, task_flavor_t(TASK_VM_INFO), intPtr, &count) + } + } + guard + kr == KERN_SUCCESS, + count >= TASK_VM_INFO_REV1_COUNT + else { return 0 } + + let usedBytes = Float(info.phys_footprint) + let usedBytesInt = UInt64(usedBytes) + let usedMB = usedBytesInt / 1024 / 1024 + return usedMB + } +} + +@available(macOS 13, iOS 16, watchOS 10, visionOS 1, *) +class SystemMemoryCheckerAdvanced: NSObject { + static func getMemoryUsage() -> SystemMemoryUsage { + // Get total and available memory using host_statistics64 + var stats = vm_statistics64() + var count = mach_msg_type_number_t(MemoryLayout.size(ofValue: stats) / MemoryLayout<integer_t>.size) + let hostPort = mach_host_self() + let result = withUnsafeMutablePointer(to: &stats) { statsPtr -> kern_return_t in + statsPtr.withMemoryRebound(to: integer_t.self, capacity: Int(count)) { intPtr in + host_statistics64(hostPort, HOST_VM_INFO64, intPtr, &count) + } + } + + guard result ==
KERN_SUCCESS else { + return SystemMemoryUsage(totalAvailableGB: 0, totalUsedGB: 0, totalActiveGB: 0, totalWiredGB: 0, appAllocatedGB: 0, appUsedGB: 0, swapUsedGB: 0) + } + + let pageSize = UInt64(vm_kernel_page_size) + let totalMemory = Float(ProcessInfo.processInfo.physicalMemory) / 1024 / 1024 / 1024 + let freeMemory = Float(stats.free_count) * Float(pageSize) / 1024 / 1024 / 1024 + let inactiveMemory = Float(stats.inactive_count) * Float(pageSize) / 1024 / 1024 / 1024 + let availableMemory = freeMemory + inactiveMemory + let activeMemory = Float(stats.active_count) * Float(pageSize) / 1024 / 1024 / 1024 + let wiredMemory = Float(stats.wire_count) * Float(pageSize) / 1024 / 1024 / 1024 + let usedMemory = totalMemory - availableMemory + + // Get task-specific memory footprint using task_info + let TASK_VM_INFO_COUNT = mach_msg_type_number_t(MemoryLayout<task_vm_info_data_t>.size / MemoryLayout<integer_t>.size) + guard let offset = MemoryLayout.offset(of: \task_vm_info_data_t.min_address) else { + return SystemMemoryUsage(totalAvailableGB: 0, totalUsedGB: 0, totalActiveGB: 0, totalWiredGB: 0, appAllocatedGB: 0, appUsedGB: 0, swapUsedGB: 0) + } + let TASK_VM_INFO_REV1_COUNT = mach_msg_type_number_t(offset / MemoryLayout<integer_t>.size) + var info = task_vm_info_data_t() + var countInfo = TASK_VM_INFO_COUNT + let kr = withUnsafeMutablePointer(to: &info) { infoPtr in + infoPtr.withMemoryRebound(to: integer_t.self, capacity: Int(countInfo)) { intPtr in + task_info(mach_task_self_, task_flavor_t(TASK_VM_INFO), intPtr, &countInfo) + } + } + + guard + kr == KERN_SUCCESS, + countInfo >= TASK_VM_INFO_REV1_COUNT + else { + return SystemMemoryUsage(totalAvailableGB: 0, totalUsedGB: 0, totalActiveGB: 0, totalWiredGB: 0, appAllocatedGB: 0, appUsedGB: 0, swapUsedGB: 0) + } + + let appAllocatedBytes = UInt64(info.phys_footprint) + let appAllocatedGB = Float(appAllocatedBytes) / 1024 / 1024 / 1024 + + let appUsedBytes = UInt64(info.resident_size) + let appUsedGB = Float(appUsedBytes) / 1024 / 1024 / 1024 + + // Get swap
memory usage + let swapUsedBytes = UInt64(stats.swapouts) * pageSize + let swapUsedGB = Float(swapUsedBytes) / 1024 / 1024 / 1024 + + return SystemMemoryUsage( + totalAvailableGB: availableMemory, + totalUsedGB: usedMemory, + totalActiveGB: activeMemory, + totalWiredGB: wiredMemory, + appAllocatedGB: appAllocatedGB, + appUsedGB: appUsedGB, + swapUsedGB: swapUsedGB + ) + } +} + +class BatteryLevelChecker: NSObject { + static func getBatteryLevel() -> Float? { + #if os(iOS) || os(visionOS) + UIDevice.current.isBatteryMonitoringEnabled = true + let batteryLevel = UIDevice.current.batteryLevel + UIDevice.current.isBatteryMonitoringEnabled = false + return batteryLevel >= 0 ? batteryLevel * 100 : nil + #elseif os(watchOS) + let batteryLevel = WKInterfaceDevice.current().batteryLevel + return batteryLevel >= 0 ? batteryLevel * 100 : nil + #elseif os(macOS) + return getMacOSBatteryLevel() + #else + return nil + #endif + } + + #if os(macOS) + private static func getMacOSBatteryLevel() -> Float? { + let snapshot = IOPSCopyPowerSourcesInfo().takeRetainedValue() + let sources = IOPSCopyPowerSourcesList(snapshot).takeRetainedValue() as [CFTypeRef] + for ps in sources { + if let description = IOPSGetPowerSourceDescription(snapshot, ps).takeUnretainedValue() as? [String: Any] { + if let currentCapacity = description[kIOPSCurrentCapacityKey] as? Int, + let maxCapacity = description[kIOPSMaxCapacityKey] as? Int + { + return (Float(currentCapacity) / Float(maxCapacity)) * 100 + } + } + } + return nil + } + #endif +} + +class ThermalStateChecker: NSObject { + static func getThermalState() -> Int { + ProcessInfo.processInfo.thermalState.rawValue + } +} + +struct DiskSpace: Codable { + let totalSpaceGB: Float? + let freeSpaceGB: Float? 
+} + +struct SystemMemoryUsage: Codable { + let totalAvailableGB: Float + let totalUsedGB: Float + let totalActiveGB: Float + let totalWiredGB: Float + let appAllocatedGB: Float + let appUsedGB: Float + let swapUsedGB: Float +} + +class DiskSpaceChecker: NSObject { + static func getDiskSpace() -> DiskSpace { + #if os(iOS) || os(watchOS) || os(visionOS) + return getMobileOSDiskSpace() + #elseif os(macOS) + return getMacOSDiskSpace() + #else + return DiskSpace(totalSpaceGB: nil, freeSpaceGB: nil) + #endif + } + + #if os(iOS) || os(watchOS) || os(visionOS) + private static func getMobileOSDiskSpace() -> DiskSpace { + let fileManager = FileManager.default + do { + let attributes = try fileManager.attributesOfFileSystem(forPath: NSHomeDirectory()) + if let totalSpace = attributes[.systemSize] as? NSNumber, + let freeSpace = attributes[.systemFreeSize] as? NSNumber + { + return DiskSpace( + totalSpaceGB: Float(truncating: totalSpace) / 1024 / 1024 / 1024, + freeSpaceGB: Float(truncating: freeSpace) / 1024 / 1024 / 1024 + ) + } + } catch { + print("Error retrieving file system attributes: \(error)") + } + return DiskSpace(totalSpaceGB: nil, freeSpaceGB: nil) + } + #endif + + #if os(macOS) + private static func getMacOSDiskSpace() -> DiskSpace { + let fileManager = FileManager.default + do { + let homeDirectory = fileManager.homeDirectoryForCurrentUser + let attributes = try fileManager.attributesOfFileSystem(forPath: homeDirectory.path) + if let totalSpace = attributes[.systemSize] as? NSNumber, + let freeSpace = attributes[.systemFreeSize] as? 
NSNumber + { + return DiskSpace( + totalSpaceGB: Float(truncating: totalSpace) / 1024 / 1024 / 1024, + freeSpaceGB: Float(truncating: freeSpace) / 1024 / 1024 / 1024 + ) + } + } catch { + print("Error retrieving file system attributes: \(error)") + } + return DiskSpace(totalSpaceGB: nil, freeSpaceGB: nil) + } + #endif +} + +private extension MLComputeUnits { + var stringValue: String { + switch self { + case .cpuOnly: + return "CPU Only" + case .cpuAndGPU: + return "CPU and GPU" + case .all: + return "All" + case .cpuAndNeuralEngine: + return "CPU and Neural Engine" + @unknown default: + return "Unknown" + } + } +} + +@available(macOS 13, iOS 16, watchOS 10, visionOS 1, *) +actor TranscriptionTestState { + private var aggregatedCount: Double = 0 + private var cumulativeTokenCount: Double = 0 + private var currentAppMemoryValues: [Float] = [] + private var currentTPSValues: [Float] = [] + private var currentTPS: Double = 0 + private let startTime: Date + private let startTimeStamp: CFAbsoluteTime + private let memoryStats: MemoryStats + private let latencyStats: LatencyStats + + init(startTime: Date, startTimeStamp: CFAbsoluteTime, memoryStats: MemoryStats, latencyStats: LatencyStats) { + self.startTime = startTime + self.startTimeStamp = startTimeStamp + self.memoryStats = memoryStats + self.latencyStats = latencyStats + } + + func update(with result: TranscriptionProgress) async { + aggregatedCount += 1 + cumulativeTokenCount += 1 + + let currentMemory = AppMemoryChecker.getMemoryUsed() + let timeTaken = CFAbsoluteTimeGetCurrent() - startTimeStamp + currentTPS = Double(cumulativeTokenCount / timeTaken) + + if currentMemory != 0 { + currentAppMemoryValues.append(Float(currentMemory)) + } + + if !currentTPS.isNaN { + currentTPSValues.append(Float(currentTPS)) + } + + if aggregatedCount >= 100 { + let timeElapsed = Date().timeIntervalSince(startTime) + memoryStats.measure( + from: currentAppMemoryValues, + tokenCount: Int(cumulativeTokenCount), + timeElapsed: 
timeElapsed + ) + latencyStats.measure( + from: currentTPSValues, + tokenCount: Int(cumulativeTokenCount), + timeElapsed: timeElapsed + ) + currentAppMemoryValues.removeAll() + currentTPSValues.removeAll() + aggregatedCount = 0 + } + } + + func getCurrentTPS() -> Double { + return currentTPS + } + + func processFinalMeasurements() async -> ( + memoryStats: MemoryStats, + latencyStats: LatencyStats + ) { + let timeElapsed = Date().timeIntervalSince(startTime) + + if !currentAppMemoryValues.isEmpty { + memoryStats.measure( + from: currentAppMemoryValues, + tokenCount: Int(cumulativeTokenCount), + timeElapsed: timeElapsed + ) + } + + if !currentTPSValues.isEmpty { + latencyStats.measure( + from: currentTPSValues, + tokenCount: Int(cumulativeTokenCount), + timeElapsed: timeElapsed + ) + } + + return ( + memoryStats, + latencyStats + ) + } +} diff --git a/Tests/WhisperKitTests/RegressionTests.swift b/Tests/WhisperKitTests/RegressionTests.swift index 724eec0..29b8c33 100644 --- a/Tests/WhisperKitTests/RegressionTests.swift +++ b/Tests/WhisperKitTests/RegressionTests.swift @@ -1,173 +1,568 @@ +// For licensing see accompanying LICENSE.md file. +// Copyright © 2024 Argmax, Inc. All rights reserved. + import CoreML +import Foundation import Hub +import UniformTypeIdentifiers import WhisperKit import XCTest +#if os(watchOS) +import WatchKit +#endif + @available(macOS 13, iOS 16, watchOS 10, visionOS 1, *) final class RegressionTests: XCTestCase { - var audioFileURL: URL? + var audioFileURLs: [URL]? + var remoteFileURLs: [URL]? + var metadataURL: URL? + var testWERURLs: [URL]? 
+ var modelsToTest: [String] = [] + var modelsTested: [String] = [] + var optionsToTest: [DecodingOptions] = [DecodingOptions()] + + struct TestConfig { + let dataset: String + let modelComputeOptions: ModelComputeOptions + var model: String + let decodingOptions: DecodingOptions + } + + /// Located on HF https://huggingface.co/datasets/argmaxinc/whisperkit-test-data/tree/main + let datasetRepo = "argmaxinc/whisperkit-test-data" + var datasets = ["librispeech-10mins", "earnings22-10mins"] + let debugDataset = ["earnings22-10mins"] + let debugModels = ["tiny"] + + var computeOptions: [ModelComputeOptions] = [ + ModelComputeOptions(audioEncoderCompute: .cpuAndNeuralEngine, textDecoderCompute: .cpuAndNeuralEngine), + ] + + let defaultDecodingOptions = DecodingOptions( + verbose: true, + task: .transcribe + ) + + let vadDecodingOptions = DecodingOptions( + verbose: true, + task: .transcribe, + concurrentWorkerCount: 4, + chunkingStrategy: .vad + ) override func setUp() { super.setUp() + #if canImport(UIApplication) + NotificationCenter.default.addObserver( + self, + selector: #selector(didReceiveMemoryWarning), + name: UIApplication.didReceiveMemoryWarningNotification, + object: nil + ) + #endif + } - if self.audioFileURL == nil { - let expectation = XCTestExpectation(description: "Download test audio") - downloadTestAudio { success in - if success { - expectation.fulfill() - } else { - XCTFail("Downloading audio file for testing failed") - } + @objc func didReceiveMemoryWarning() { + Logging.debug("Received memory warning") + + // TODO: Record this data in the test results + let maxMemory = SystemMemoryCheckerAdvanced.getMemoryUsage() + Logging.debug("Max memory before warning: \(maxMemory)") + } + + func testEnvConfigurations(defaultModels: [String]? 
= nil) { + if let modelSizeEnv = ProcessInfo.processInfo.environment["MODEL_NAME"], !modelSizeEnv.isEmpty { + modelsToTest = [modelSizeEnv] + Logging.debug("Model size: \(modelSizeEnv)") + XCTAssertTrue(modelsToTest.count > 0, "Invalid model size: \(modelSizeEnv)") + if modelSizeEnv == "crash_test" { + fatalError("Crash test triggered") } - // Wait for the expectation with a timeout - wait(for: [expectation], timeout: 30) + } else { + modelsToTest = defaultModels ?? debugModels + Logging.debug("Model size not set by env") } } - func downloadTestAudio(completion: @escaping (Bool) -> Void) { - Task { + func testModelPerformanceWithDebugConfig() async throws { + testEnvConfigurations() + + // Debug test matrix + datasets = debugDataset + optionsToTest = [vadDecodingOptions] + computeOptions = [computeOptions.first!] + + let debugTestMatrix: [TestConfig] = getTestMatrix() + Logging.debug("Running \(debugTestMatrix.count) regression tests for models: \(modelsToTest)") + + // Run the tests + try await runRegressionTests(with: debugTestMatrix) + } + + func testModelPerformance() async throws { + testEnvConfigurations(defaultModels: WhisperKit.recommendedModels().supported) + + // Setup test matrix + optionsToTest = [vadDecodingOptions] + computeOptions = [computeOptions.first!] 
+ + let testMatrix: [TestConfig] = getTestMatrix() + Logging.debug("Running \(testMatrix.count) regression tests for models: \(modelsToTest)") + + // Run the tests + try await runRegressionTests(with: testMatrix) + } + + // MARK: - Test Pipeline + + private func runRegressionTests(with testMatrix: [TestConfig]) async throws { + var failureInfo: [String: String] = [:] + var attachments: [String: String] = [:] + let device = getCurrentDevice() + for (i, config) in testMatrix.enumerated() { do { - let earnings22CompressedDataset = Hub.Repo(id: "argmaxinc/whisperkit-test-data", type: .datasets) - let tempPath = FileManager.default.temporaryDirectory - let downloadBase = tempPath.appending(component: "huggingface") - let hubApi = HubApi(downloadBase: downloadBase) - let fileURL = try await hubApi.snapshot(from: earnings22CompressedDataset, matching: ["4484146.mp3"]) - self.audioFileURL = fileURL.appending(component: "4484146.mp3") - completion(true) + Logging.debug("Running test \(i + 1)/\(testMatrix.count) for \(config.model) with \(config.dataset) on \(device) using encoder compute: \(config.modelComputeOptions.audioEncoderCompute.description) and decoder compute: \(config.modelComputeOptions.textDecoderCompute.description)") + let expectation = XCTestExpectation(description: "Download test audio files for \(config.dataset) dataset") + downloadTestData(forDataset: config.dataset) { success in + if success { + expectation.fulfill() + } else { + XCTFail("Downloading audio file for testing failed") + } + } + await fulfillment(of: [expectation], timeout: 300) + let attachmentName = try await testAndMeasureModelPerformance(config: config, device: device) + attachments[config.dataset] = attachmentName + try await Task.sleep(nanoseconds: 1_000_000_000) } catch { - XCTFail("Async setup failed with error: \(error)") - completion(false) + Logging.debug("Failed to test \(config.model): \(error)") + failureInfo[config.model] = error.localizedDescription } } + + // Save summary + 
saveSummary(failureInfo: failureInfo, attachments: attachments) } - func testAndMeasureModelPerformance(model: String, device: String) async throws { - let audioFilePath = try XCTUnwrap( - self.audioFileURL?.path(), - "Audio file not found" + func testAndMeasureModelPerformance(config: TestConfig, device: String) async throws -> String? { + var config = config + var resultJSON: [RegressionStats] = [] + let audioFilePaths = try XCTUnwrap( + self.audioFileURLs, + "Audio files not found" + ).map { $0.path() } + + if WhisperKit.recommendedModels().disabled.contains(where: { $0.range(of: config.model) != nil }) { + throw WhisperError.modelsUnavailable("Skipping model \(config.model), disabled for \(device).") + } + + // Create WhisperKit instance with checks for memory usage + let whisperKit = try await createWithMemoryCheck( + model: config.model, + computeOptions: config.modelComputeOptions, + verbose: true, + logLevel: .debug ) - let startTime = Date() - let iso8601DateTimeString = ISO8601DateFormatter().string(from: Date()) + if let modelFile = whisperKit.modelFolder?.lastPathComponent { + config.model = modelFile + modelsTested.append(modelFile) + modelsTested = Array(Set(modelsTested)) + } - var currentMemoryValues = [Float]() - var currentTPSValues = [Float]() + for audioFilePath in audioFilePaths { + // Process each audio file + try await processAudioFile( + audioFilePath: audioFilePath, + whisperKit: whisperKit, + config: config, + device: device, + resultJSON: &resultJSON + ) + } - let memoryStats = MemoryStats( + do { + let jsonData = try JSONEncoder().encode(resultJSON) + let attachment = XCTAttachment(data: jsonData, uniformTypeIdentifier: UTType.json.identifier) + let attachmentName = "\(device)_\(config.model)_\(Date().formatted(Date.ISO8601FormatStyle().dateSeparator(.dash).timeSeparator(.omitted)))_\(config.dataset)".replacingOccurrences(of: ".", with: "_") + attachment.name = attachmentName + ".json" + attachment.lifetime = .keepAlways + 
add(attachment) + return attachmentName + } catch { + XCTFail("Failed with error: \(error)") + return nil + } + } + + func processAudioFile( + audioFilePath: String, + whisperKit: WhisperKit, + config: TestConfig, + device: String, + resultJSON: inout [RegressionStats] + ) async throws { + let startTime = Date() + + // Initialize test state + var memoryStats = MemoryStats( measurements: [], units: "MB", totalNumberOfMeasurements: 0, preTranscribeMemory: -1, postTranscribeMemory: -1 ) - let latencyStats = LatencyStats( + var latencyStats = LatencyStats( measurements: [], units: "Tokens/Sec", totalNumberOfMeasurements: 0 ) - var count = 0 - - let callback = { - (result: TranscriptionProgress) -> Bool in - count += 1 - let currentMemory = SystemMemoryChecker.getMemoryUsed() - let currentTPS = result.timings.tokensPerSecond - if currentMemory != 0 { - currentMemoryValues.append(Float(currentMemory)) - } - if !currentTPS.isNaN { - currentTPSValues.append(Float(currentTPS)) - } - if count % 100 == 1 { - let timeElapsed = Date().timeIntervalSince(startTime) - memoryStats.measure(from: currentMemoryValues, timeElapsed: timeElapsed) - latencyStats.measure(from: currentTPSValues, timeElapsed: timeElapsed) - currentMemoryValues = [] - currentTPSValues = [] + + let startTimeStamp = CFAbsoluteTimeGetCurrent() + let testState = TranscriptionTestState( + startTime: startTime, + startTimeStamp: startTimeStamp, + memoryStats: memoryStats, + latencyStats: latencyStats + ) + + let callback = { (result: TranscriptionProgress) -> Bool in + Task { + await testState.update(with: result) } return true } - let whisperKit = try await WhisperKit(WhisperKitConfig(model: model)) - memoryStats.preTranscribeMemory = Float(SystemMemoryChecker.getMemoryUsed()) + memoryStats.preTranscribeMemory = Float(AppMemoryChecker.getMemoryUsed()) + + var systemMemory: [SystemMemoryUsage] = [] + var diskSpace: [DiskSpace] = [] + var batteryLevel: [Float] = [] + var thermalState: [Int] = [] + var 
timerTimeElapsed: [TimeInterval] = [] + + // Start your timer + let timerQueue = DispatchQueue(label: "RegressionTimerQueue") + let timer = DispatchSource.makeTimerSource(queue: timerQueue) + timer.schedule(deadline: .now(), repeating: 1.0) + timer.setEventHandler { + systemMemory.append(SystemMemoryCheckerAdvanced.getMemoryUsage()) + diskSpace.append(DiskSpaceChecker.getDiskSpace()) + batteryLevel.append(BatteryLevelChecker.getBatteryLevel() ?? -1) + thermalState.append(ThermalStateChecker.getThermalState()) + timerTimeElapsed.append(Date().timeIntervalSince(startTime)) + } + timer.resume() + + // Perform transcription + let transcriptionResults = try await whisperKit.transcribe( + audioPath: audioFilePath, + decodeOptions: config.decodingOptions, + callback: callback + ) + + let tpsThreshold = 4.0 + let currentTPS = await testState.getCurrentTPS() + if !(currentTPS != 0 && currentTPS > tpsThreshold) { + XCTFail("Tokens per second below expected for compute unit \(currentTPS), potential CPU fallback") + } + + let transcriptionResult = mergeTranscriptionResults(transcriptionResults) - let transcriptionResult = try await XCTUnwrapAsync( - await whisperKit.transcribe(audioPath: audioFilePath, callback: callback).first, - "Transcription failed" + // Store final measurements + let (finalMemoryStats, finalLatencyStats) = await testState.processFinalMeasurements() + + memoryStats = finalMemoryStats + latencyStats = finalLatencyStats + + memoryStats.postTranscribeMemory = Float(AppMemoryChecker.getMemoryUsed()) + + let filename = String(audioFilePath.split(separator: "/").last!) 
+ guard let reference = getTranscript(filename: filename) else { + Logging.debug("Reference transcript not found for \(filename)") + return + } + + let (wer, diff) = WERUtils.evaluate( + originalTranscript: reference, + generatedTranscript: transcriptionResult.text ) - XCTAssert(transcriptionResult.text.isEmpty == false, "Transcription failed") - memoryStats.postTranscribeMemory = Float(SystemMemoryChecker.getMemoryUsed()) + let modelSizeMB = try? getFolderSize(atUrl: whisperKit.modelFolder) + let testInfo = TestInfo( device: device, - audioFile: audioFilePath, - model: model, + audioFile: URL(fileURLWithPath: audioFilePath).lastPathComponent, + datasetDir: config.dataset, + datasetRepo: datasetRepo, + model: config.model, + modelSizeMB: modelSizeMB ?? -1, date: startTime.formatted(Date.ISO8601FormatStyle().dateSeparator(.dash)), timeElapsedInSeconds: Date().timeIntervalSince(startTime), timings: transcriptionResult.timings, - transcript: transcriptionResult.text + prediction: transcriptionResult.text, + reference: reference, + wer: wer, + diff: diff ) - let json = RegressionStats(testInfo: testInfo, memoryStats: memoryStats, latencyStats: latencyStats) - do { - let attachment = try XCTAttachment(data: json.jsonData(), uniformTypeIdentifier: "json") - attachment.lifetime = .keepAlways - attachment.name = "\(device)_\(model)_\(iso8601DateTimeString).json" - add(attachment) - } catch { - XCTFail("Failed with error: \(error)") + let staticAttributes = StaticAttributes( + encoderCompute: whisperKit.modelCompute.audioEncoderCompute, + decoderCompute: whisperKit.modelCompute.textDecoderCompute, + decodingOptions: config.decodingOptions + ) + let systemMeasurements = SystemMeasurements( + systemMemory: systemMemory, + diskSpace: diskSpace, + batteryLevel: batteryLevel, + thermalState: thermalState, + timeElapsed: timerTimeElapsed + ) + let json = RegressionStats( + testInfo: testInfo, + memoryStats: memoryStats, + latencyStats: latencyStats, + staticAttributes: 
staticAttributes, + systemMeasurements: systemMeasurements + ) + resultJSON.append(json) + } + + // MARK: - Pipeline Tests + + func testLargeWER() async { + let texts = await getWERTestData() + + let (simpleWER, simpleDiff) = WERUtils.evaluate(originalTranscript: "This is some basic text", generatedTranscript: "This is edited text with some words added replaced and deleted") + XCTAssertEqual(simpleWER, 1.7, accuracy: 0.1, "Expected wer: 1.7 but computed \(simpleWER)") + XCTAssertEqual(simpleDiff.count, 23) + if let originalText = texts.0, let generatedText = texts.1 { + let (werNormalized, _) = WERUtils.evaluate(originalTranscript: originalText, generatedTranscript: generatedText) + XCTAssertEqual(werNormalized, 0.1852080123266564, accuracy: 0.001, "Expected wer: 0.1852080123266564 but computed \(werNormalized)") + let (wer, _) = WERUtils.evaluate(originalTranscript: originalText, generatedTranscript: generatedText, normalizeOriginal: false) + XCTAssertEqual(wer, 0.42448103078024335, accuracy: 0.001, "Expected wer: 0.42448103078024335 but computed \(wer)") + } else { + XCTFail("Fetching WER test data failed.") } } - func testOutputAll() async throws { - let modelPaths = try allModelPaths() + func testNormalizer() { + let normalizer = EnglishTextNormalizer() + let jsonText = "hello\\u2026 this is a test over GH\\u20b5 94 million in fees in H\\u00f8rsholm and Basel grew 10% to one billions, 370 millions" + let swiftText = "hello\u{2026} this is a test over GH\u{20b5} 94 million in fees in H\u{00f8}rsholm and Basel grew 10% to one billions, 370 millions" + let resultJson = normalizer.normalize(text: jsonText) + let resultSwift = normalizer.normalize(text: swiftText) + XCTAssertEqual(resultSwift, "hello . this is a test over gh 94000000 in fees in horsholm and basel grew 10% to 1000000000s 370000000s") + XCTAssertEqual(resultJson, resultSwift) + } - for modelPath in modelPaths { - let modelName = modelPath.split(separator: "/").last! 
- print("[Integration] Testing model \(modelName)") - let audioFilePath = try XCTUnwrap( - Bundle.module.path(forResource: "jfk", ofType: "wav"), - "Audio file not found" - ) + func testHirschberg() { + let s1 = "With a rumble that echoed through the night, thunder crashed overhead, its raw power shaking the earth beneath it, leaving in its wake an exhilarating sense of awe. As rain poured down in torrents, the thunder boomed with a rhythm that seemed to speak a secret language, intertwining nature's symphony with an innovative melody that captivated all who listened." + let s2 = "In the midst of a summer storm, thunder erupted with a booming chorus, shaking the earth beneath our feet and electrifying the air with its powerful presence. The crackling symphony of thunderbolts danced across the darkened sky, illuminating the clouds with an innovative display of nature's raw energy." + let ops = hirschberg(Array(s1.unicodeScalars), Array(s2.unicodeScalars)) + XCTAssertEqual(ops.count, 228) + } - let config = WhisperKitConfig(modelFolder: modelPath, verbose: true, logLevel: .debug) - let whisperKit = try await WhisperKit(config) + func testLevenshtein() { + let s1 = "With a rumble that echoed through the night, thunder crashed overhead, its raw power shaking the earth beneath it, leaving in its wake an exhilarating sense of awe. As rain poured down in torrents, the thunder boomed with a rhythm that seemed to speak a secret language, intertwining nature's symphony with an innovative melody that captivated all who listened." + let s2 = "In the midst of a summer storm, thunder erupted with a booming chorus, shaking the earth beneath our feet and electrifying the air with its powerful presence. The crackling symphony of thunderbolts danced across the darkened sky, illuminating the clouds with an innovative display of nature's raw energy." 
+ let ops = levenshtein(Array(s1.unicodeScalars), Array(s2.unicodeScalars)) + XCTAssertEqual(ops.count, 495) + } - let transcriptionResult: [TranscriptionResult] = try await whisperKit.transcribe(audioPath: audioFilePath) - let transcriptionResultText = transcriptionResult.text + func testInMemoryAndDiskUsage() async throws { + // Choose a model to test + let modelToTest = "openai_whisper-tiny" - print("[Integration] \(transcriptionResultText)") - XCTAssertEqual( - transcriptionResultText.normalized, - " And so my fellow Americans ask not what your country can do for you, ask what you can do for your country.".normalized, - "Transcription result does not match expected result for model \(modelName)" - ) - } + // Get initial measurements + let initialMemory = AppMemoryChecker.getMemoryUsed() + let initialDiskSpace = DiskSpaceChecker.getDiskSpace() + let initialCacheSize = try getCacheSize() + + // Create WhisperKit instance + let whisperKit = try await WhisperKit(WhisperKitConfig( + model: modelToTest, + computeOptions: ModelComputeOptions(audioEncoderCompute: .cpuAndNeuralEngine, textDecoderCompute: .cpuAndNeuralEngine), + verbose: true, + logLevel: .debug, + load: true + )) + + // Get final measurements + let finalMemory = AppMemoryChecker.getMemoryUsed() + let finalDiskSpace = DiskSpaceChecker.getDiskSpace() + let finalCacheSize = try getCacheSize() + + // Calculate differences + let memoryUsed = finalMemory - initialMemory + let diskSpaceUsed = initialDiskSpace.freeSpaceGB! - finalDiskSpace.freeSpaceGB! 
+ let cacheSpaceUsed = finalCacheSize - initialCacheSize + + // Log results + Logging.debug("Memory used by \(modelToTest): \(memoryUsed) MB") + Logging.debug("Disk space used by \(modelToTest): \(diskSpaceUsed) GB") + Logging.debug("Cache space used by \(modelToTest): \(cacheSpaceUsed) MB") + + // Assert that the measurements are within expected ranges + XCTAssertGreaterThan(memoryUsed, 0, "Model should use some memory") + XCTAssertLessThan(memoryUsed, 1000, "Model should use less than 1GB of memory") + + XCTAssertGreaterThanOrEqual(diskSpaceUsed, 0, "Model should use some disk space unless already downloaded") + XCTAssertLessThan(diskSpaceUsed, 5, "Model should use less than 5GB of disk space") + + XCTAssertGreaterThanOrEqual(cacheSpaceUsed, 0, "Cache usage should not be negative") + + // Clean up + await whisperKit.unloadModels() } - func testRegressionAndLatencyForAllModels() async throws { - var allModels: [String] = [] - var failureInfo: [String: String] = [:] - var currentDevice = WhisperKit.deviceName() - let iso8601DateTimeString = ISO8601DateFormatter().string(from: Date()) + // MARK: - Helper Methods - #if os(macOS) && arch(arm64) - currentDevice = ProcessInfo.processor - #endif + private func downloadTestDataIfNeeded() { + guard audioFileURLs == nil || metadataURL == nil || testWERURLs == nil else { return } - do { - allModels = try await WhisperKit.fetchAvailableModels() - } catch { - XCTFail("Failed to fetch available models: \(error.localizedDescription)") + for dataset in datasets { + let expectation = XCTestExpectation(description: "Download test audio files for \(dataset) dataset") + downloadTestData(forDataset: dataset) { success in + if success { + expectation.fulfill() + } else { + XCTFail("Downloading audio file for testing failed") + } + } + wait(for: [expectation], timeout: 300) + } + } + + private func getTestMatrix() -> [TestConfig] { + var regressionTestConfigMatrix: [TestConfig] = [] + for dataset in datasets { + for computeOption in
computeOptions { + for options in optionsToTest { + for model in modelsToTest { + regressionTestConfigMatrix.append( + TestConfig( + dataset: dataset, + modelComputeOptions: computeOption, + model: model, + decodingOptions: options + ) + ) + } + } + } } - for model in allModels { + return regressionTestConfigMatrix + } + + private func downloadTestData(forDataset dataset: String, completion: @escaping (Bool) -> Void) { + Task { do { - try await testAndMeasureModelPerformance(model: model, device: currentDevice) + Logging.debug("Available models: \(modelsToTest)") + + let testDatasetRepo = Hub.Repo(id: datasetRepo, type: .datasets) + let tempPath = FileManager.default.temporaryDirectory + let downloadBase = tempPath.appending(component: "huggingface") + let hubApi = HubApi(downloadBase: downloadBase) + let repoURL = try await hubApi.snapshot(from: testDatasetRepo, matching: ["\(dataset)/*"]) { progress in + Logging.debug("Downloading \(dataset) dataset: \(progress)") + }.appending(path: dataset) + + let downloadedFiles = try FileManager.default.contentsOfDirectory(atPath: repoURL.path()) + var audioFileURLs: [URL] = [] + for file in downloadedFiles { + if file.hasSuffix(".mp3") { + audioFileURLs.append(repoURL.appending(component: file)) + } else if file.hasSuffix(".json") { + self.metadataURL = repoURL.appending(component: file) + } + } + self.audioFileURLs = audioFileURLs + + Logging.debug("Downloaded \(audioFileURLs.count) audio files.") + + completion(true) } catch { - failureInfo[model] = error.localizedDescription + XCTFail("Async setup failed with error: \(error)") + completion(false) + } + } + } + + private func getTranscript(filename: String) -> String? { + // Ensure we can access and parse the metadata + guard let data = try? Data(contentsOf: self.metadataURL!), + let json = try? JSONSerialization.jsonObject(with: data) as? 
[[String: Any]] + else { + return nil + } + + // Search for the matching audio item + for item in json { + // Check if the current item's audio matches the filename + let audioFileName = filename.split(separator: ".").first! + if let referenceFilename = item["audio"] as? String, + referenceFilename.contains(audioFileName) + { + // If found, return the reference text + return item["text"] as? String + } + } + + // If no matching item was found, return nil + return nil + } + + private func getWERTestData() async -> (String?, String?) { + do { + let testDataset = Hub.Repo(id: datasetRepo, type: .datasets) + let tempPath = FileManager.default.temporaryDirectory + let downloadBase = tempPath.appending(component: "huggingface") + let hubApi = HubApi(downloadBase: downloadBase) + let testWERRepoURL = try await hubApi.snapshot(from: testDataset, matching: ["*.txt"]) + let testWERTextURLs = try FileManager.default.contentsOfDirectory(atPath: testWERRepoURL.path()).filter { $0.hasSuffix(".txt") } + self.testWERURLs = testWERTextURLs.map { testWERRepoURL.appending(component: $0) } + + Logging.debug("Downloaded \(testWERTextURLs.count) test WER files.") + + let testFileURLs = try XCTUnwrap( + self.testWERURLs, + "Test files for WER verification not found" + ) + var generatedText: String? + var originalText: String? + for file in testFileURLs { + switch file.lastPathComponent { + case "test_generated_transcript.txt": + generatedText = try? String(contentsOf: file) + case "test_original_transcript.txt": + originalText = try? 
String(contentsOf: file) + default: + continue + } } + return (originalText, generatedText) + } catch { + XCTFail("Fetching test data for WER verification failed: \(error)") } - let testReport = TestReport(device: currentDevice, modelsTested: allModels, failureInfo: failureInfo) + return (nil, nil) + } + + private func saveSummary(failureInfo: [String: String], attachments: [String: String]) { + let currentDevice = getCurrentDevice() + let osDetails = getOSDetails() + let testReport = TestReport( + deviceModel: currentDevice, + osType: osDetails.osType, + osVersion: osDetails.osVersion, + modelsTested: modelsTested, + failureInfo: failureInfo, + attachments: attachments + ) + do { - let attachment = try XCTAttachment(data: testReport.jsonData(), uniformTypeIdentifier: "json") + let iso8601DateTimeString = ISO8601DateFormatter().string(from: Date()) + let jsonData = try testReport.jsonData() + let attachment = XCTAttachment(data: jsonData, uniformTypeIdentifier: UTType.json.identifier) attachment.lifetime = .keepAlways attachment.name = "\(currentDevice)_summary_\(iso8601DateTimeString).json" add(attachment) @@ -175,4 +570,118 @@ final class RegressionTests: XCTestCase { XCTFail("Failed with error: \(error)") } } + + private func getCurrentDevice() -> String { + var currentDevice = WhisperKit.deviceName() + + currentDevice = currentDevice.trimmingCharacters(in: .whitespacesAndNewlines) + currentDevice = currentDevice.replacingOccurrences(of: " ", with: "_") + + return currentDevice + } + + private func getOSDetails() -> (osType: String, osVersion: String) { + #if os(iOS) + return (UIDevice.current.systemName, UIDevice.current.systemVersion) + #elseif os(macOS) + let version = ProcessInfo.processInfo.operatingSystemVersion + return ("macOS", "\(version.majorVersion).\(version.minorVersion).\(version.patchVersion)") + #elseif os(watchOS) + return ("watchOS", WKInterfaceDevice.current().systemVersion) + #else + return ("Unknown", "Unknown") + #endif + } + + /// Helper 
function to get cache size + private func getCacheSize() throws -> Int64 { + let fileManager = FileManager.default + let cacheURL = fileManager.urls(for: .cachesDirectory, in: .userDomainMask) + let cacheSize = try fileManager.allocatedSizeOfDirectory(at: cacheURL.first!) + return cacheSize / (1024 * 1024) // Convert to MB + } + + private func getFolderSize(atUrl folder: URL?) throws -> Double { + guard let folder = folder else { + return -1 + } + let fileManager = FileManager.default + let modelSize = try fileManager.allocatedSizeOfDirectory(at: folder) + return Double(modelSize / (1024 * 1024)) // Convert to MB + } + + func createWithMemoryCheck( + model: String, + computeOptions: ModelComputeOptions, + verbose: Bool, + logLevel: Logging.LogLevel + ) async throws -> WhisperKit { + // Create the initialization task + let initializationTask = Task { () -> WhisperKit in + let whisperKit = try await WhisperKit(WhisperKitConfig( + model: model, + computeOptions: computeOptions, + verbose: verbose, + logLevel: logLevel, + prewarm: true, + load: true + )) + try Task.checkCancellation() + return whisperKit + } + + // Start the memory monitoring task + let monitorTask = Task { + while true { + let remainingMemory = SystemMemoryCheckerAdvanced.getMemoryUsage().totalAvailableGB + Logging.debug(remainingMemory, "GB of memory left") + + if remainingMemory <= 0.1 { // Cancel with 100MB remaining + Logging.debug("Cancelling due to oom") + // Cancel the initialization task + initializationTask.cancel() + + // Throw an error to stop the monitor task + throw WhisperError.modelsUnavailable("Memory limit exceeded during initialization") + } + + try await Task.sleep(nanoseconds: 1_000_000_000) // 1 second + } + } + + // Create a timeout task + let timeoutTask = Task { + try await Task.sleep(nanoseconds: 300_000_000_000) // 5 minutes + initializationTask.cancel() + monitorTask.cancel() + Logging.debug("Cancelling due to timeout") + throw WhisperError.modelsUnavailable("Initialization 
timed out") + } + + do { + // Use withTaskCancellationHandler to ensure proper cleanup + return try await withTaskCancellationHandler( + operation: { + // Await the initialization task + let whisperKit = try await initializationTask.value + + // Cancel the monitor tasks after successful initialization + monitorTask.cancel() + timeoutTask.cancel() + return whisperKit + }, + onCancel: { + initializationTask.cancel() + monitorTask.cancel() + timeoutTask.cancel() + } + ) + } catch { + initializationTask.cancel() + monitorTask.cancel() + timeoutTask.cancel() + Logging.debug(error) + throw error + } + } } diff --git a/Tests/WhisperKitTests/TestUtils.swift b/Tests/WhisperKitTests/TestUtils.swift index a2155c4..2e4201c 100644 --- a/Tests/WhisperKitTests/TestUtils.swift +++ b/Tests/WhisperKitTests/TestUtils.swift @@ -1,5 +1,8 @@ -import CoreML +// For licensing see accompanying LICENSE.md file. +// Copyright © 2024 Argmax, Inc. All rights reserved. + import Combine +import CoreML import Foundation @testable import WhisperKit import XCTest @@ -72,6 +75,33 @@ func XCTAssertNoThrowAsync( // MARK: Helpers +@available(macOS 13, iOS 16, watchOS 10, visionOS 1, *) +extension Bundle { + static var current: Bundle { + #if SWIFT_PACKAGE + return Bundle.module + #else + return Bundle.main + #endif + } +} + +@available(macOS 13, iOS 16, watchOS 10, visionOS 1, *) +extension FileManager { + func allocatedSizeOfDirectory(at url: URL) throws -> Int64 { + guard let enumerator = enumerator(at: url, includingPropertiesForKeys: [.totalFileAllocatedSizeKey, .fileAllocatedSizeKey]) else { + throw NSError(domain: NSCocoaErrorDomain, code: NSFileReadUnknownError, userInfo: nil) + } + + var accumulatedSize: Int64 = 0 + for case let fileURL as URL in enumerator { + let resourceValues = try fileURL.resourceValues(forKeys: [.totalFileAllocatedSizeKey, .fileAllocatedSizeKey]) + accumulatedSize += Int64(resourceValues.totalFileAllocatedSize ?? resourceValues.fileAllocatedSize ?? 
0) + } + return accumulatedSize + } +} + @available(macOS 13, iOS 16, watchOS 10, visionOS 1, *) extension MLMultiArray { /// Create `MLMultiArray` of shape [1, 1, arr.count] and fill up the last @@ -128,7 +158,7 @@ extension XCTestCase { trackForMemoryLeaks(on: whisperKit, file: file, line: line) let audioComponents = audioFile.components(separatedBy: ".") - guard let audioFileURL = Bundle.module.path(forResource: audioComponents.first, ofType: audioComponents.last) else { + guard let audioFileURL = Bundle.current.path(forResource: audioComponents.first, ofType: audioComponents.last) else { throw TestError.missingFile("Missing audio file") } return try await whisperKit.transcribe(audioPath: audioFileURL, decodeOptions: options, callback: callback) @@ -136,7 +166,7 @@ extension XCTestCase { func tinyModelPath() throws -> String { let modelDir = "whisperkit-coreml/openai_whisper-tiny" - guard let modelPath = Bundle.module.urls(forResourcesWithExtension: "mlmodelc", subdirectory: modelDir)?.first?.deletingLastPathComponent().path else { + guard let modelPath = Bundle.current.urls(forResourcesWithExtension: "mlmodelc", subdirectory: modelDir)?.first?.deletingLastPathComponent().path else { throw TestError.missingFile("Failed to load model, ensure \"Models/\(modelDir)\" exists via Makefile command: `make download-models`") } return modelPath @@ -144,7 +174,7 @@ extension XCTestCase { func largev3ModelPath() throws -> String { let modelDir = "whisperkit-coreml/openai_whisper-large-v3" // use faster to compile model for tests - guard let modelPath = Bundle.module.urls(forResourcesWithExtension: "mlmodelc", subdirectory: modelDir)?.first?.deletingLastPathComponent().path else { + guard let modelPath = Bundle.current.urls(forResourcesWithExtension: "mlmodelc", subdirectory: modelDir)?.first?.deletingLastPathComponent().path else { throw TestError.missingFile("Failed to load model, ensure \"Models/\(modelDir)\" exists via Makefile command: `make download-models`") } return 
modelPath @@ -152,7 +182,7 @@ extension XCTestCase { func largev3TurboModelPath() throws -> String { let modelDir = "whisperkit-coreml/openai_whisper-large-v3_turbo" - guard let modelPath = Bundle.module.urls(forResourcesWithExtension: "mlmodelc", subdirectory: modelDir)?.first?.deletingLastPathComponent().path else { + guard let modelPath = Bundle.current.urls(forResourcesWithExtension: "mlmodelc", subdirectory: modelDir)?.first?.deletingLastPathComponent().path else { throw TestError.missingFile("Failed to load model, ensure \"Models/\(modelDir)\" exists via Makefile command: `make download-models`") } return modelPath @@ -163,7 +193,7 @@ extension XCTestCase { var modelPaths: [String] = [] let directory = "whisperkit-coreml" let resourceKeys: [URLResourceKey] = [.isDirectoryKey] - guard let baseurl = Bundle.module.resourceURL?.appendingPathComponent(directory) else { + guard let baseurl = Bundle.current.resourceURL?.appendingPathComponent(directory) else { throw TestError.missingDirectory("Base URL for directory \(directory) not found.") } let directoryContents = try fileManager.contentsOfDirectory(at: baseurl, includingPropertiesForKeys: resourceKeys, options: .skipsHiddenFiles) @@ -277,8 +307,8 @@ extension Collection where Element == TranscriptionResult { } } -extension Publisher { - public func withPrevious() -> AnyPublisher<(previous: Output?, current: Output), Failure> { +public extension Publisher { + func withPrevious() -> AnyPublisher<(previous: Output?, current: Output), Failure> { scan((Output?, Output)?.none) { ($0?.1, $1) } .compactMap { $0 } .eraseToAnyPublisher() diff --git a/Tests/WhisperKitTests/UnitTests.swift b/Tests/WhisperKitTests/UnitTests.swift index cabebe9..e633558 100644 --- a/Tests/WhisperKitTests/UnitTests.swift +++ b/Tests/WhisperKitTests/UnitTests.swift @@ -58,7 +58,7 @@ final class UnitTests: XCTestCase { func testModelSupportConfigFromJson() throws { let configFilePath = try XCTUnwrap( - Bundle.module.path(forResource: "config", 
ofType: "json"), + Bundle.current.path(forResource: "config", ofType: "json"), "Config file not found" ) @@ -190,7 +190,7 @@ final class UnitTests: XCTestCase { func testAudioFileLoading() throws { let audioFilePath = try XCTUnwrap( - Bundle.module.path(forResource: "jfk", ofType: "wav"), + Bundle.current.path(forResource: "jfk", ofType: "wav"), "Audio file not found" ) let audioBuffer = try AudioProcessor.loadAudio(fromPath: audioFilePath) @@ -211,7 +211,7 @@ final class UnitTests: XCTestCase { func testAudioFileLoadingWithResampling() throws { let audioFilePath = try XCTUnwrap( - Bundle.module.path(forResource: "jfk_441khz", ofType: "m4a"), + Bundle.current.path(forResource: "jfk_441khz", ofType: "m4a"), "Audio file not found" ) let audioBuffer = try AudioProcessor.loadAudio(fromPath: audioFilePath) @@ -256,7 +256,7 @@ final class UnitTests: XCTestCase { func testAudioResample() throws { let audioFileURL = try XCTUnwrap( - Bundle.module.url(forResource: "jfk", withExtension: "wav"), + Bundle.current.url(forResource: "jfk", withExtension: "wav"), "Audio file not found" ) let audioFile = try AVAudioFile(forReading: audioFileURL) @@ -275,7 +275,7 @@ final class UnitTests: XCTestCase { func testAudioResampleFromFile() throws { let audioFileURL = try XCTUnwrap( - Bundle.module.url(forResource: "jfk", withExtension: "wav"), + Bundle.current.url(forResource: "jfk", withExtension: "wav"), "Audio file not found" ) let audioFile = try AVAudioFile(forReading: audioFileURL) @@ -552,7 +552,7 @@ final class UnitTests: XCTestCase { let options = DecodingOptions() let continuationCallback: TranscriptionCallback = { (progress: TranscriptionProgress) -> Bool? 
in // Stop after only 10 tokens (full test audio contains 16) - return progress.tokens.count <= earlyStopTokenCount + progress.tokens.count <= earlyStopTokenCount } let result = try await XCTUnwrapAsync( @@ -650,7 +650,7 @@ final class UnitTests: XCTestCase { let whisperKit = try await WhisperKit(config) let audioFilePath = try XCTUnwrap( - Bundle.module.path(forResource: "jfk", ofType: "wav"), + Bundle.current.path(forResource: "jfk", ofType: "wav"), "Audio file not found" ) let audioBuffer = try AudioProcessor.loadAudio(fromPath: audioFilePath) @@ -777,7 +777,7 @@ final class UnitTests: XCTestCase { let whisperKit = try await WhisperKit(config) let audioFilePath = try XCTUnwrap( - Bundle.module.path(forResource: "es_test_clip", ofType: "wav"), + Bundle.current.path(forResource: "es_test_clip", ofType: "wav"), "Audio file not found" ) @@ -852,7 +852,7 @@ final class UnitTests: XCTestCase { let whisperKit = try await WhisperKit(config) let audioFilePath = try XCTUnwrap( - Bundle.module.path(forResource: "ja_test_clip", ofType: "wav"), + Bundle.current.path(forResource: "ja_test_clip", ofType: "wav"), "Audio file not found" ) @@ -903,7 +903,7 @@ final class UnitTests: XCTestCase { for language in targetLanguages { let audioFilePath = try XCTUnwrap( - Bundle.module.path(forResource: "\(language)_test_clip", ofType: "wav"), + Bundle.current.path(forResource: "\(language)_test_clip", ofType: "wav"), "Audio file not found" ) @@ -1281,7 +1281,7 @@ final class UnitTests: XCTestCase { XCTAssertTrue(vad.voiceActivity(in: []).isEmpty) let audioFilePath = try XCTUnwrap( - Bundle.module.path(forResource: "jfk", ofType: "wav"), + Bundle.current.path(forResource: "jfk", ofType: "wav"), "Audio file not found" ) let audioBuffer = try AudioProcessor.loadAudio(fromPath: audioFilePath) @@ -1385,7 +1385,7 @@ final class UnitTests: XCTestCase { let chunker = VADAudioChunker() let singleChunkPath = try XCTUnwrap( - Bundle.module.path(forResource: "jfk", ofType: "wav"), + 
Bundle.current.path(forResource: "jfk", ofType: "wav"), "Audio file not found" ) var audioBuffer = try AudioProcessor.loadAudio(fromPath: singleChunkPath) @@ -1400,7 +1400,7 @@ final class UnitTests: XCTestCase { XCTAssertEqual(audioChunks.count, 1) let multiChunkPath = try XCTUnwrap( - Bundle.module.path(forResource: "ted_60", ofType: "m4a"), + Bundle.current.path(forResource: "ted_60", ofType: "m4a"), "Audio file not found" ) audioBuffer = try AudioProcessor.loadAudio(fromPath: multiChunkPath) @@ -1456,7 +1456,7 @@ final class UnitTests: XCTestCase { } } _ = try await pipe.transcribe( - audioPath: Bundle.module.path(forResource: "ted_60", ofType: "m4a")!, + audioPath: Bundle.current.path(forResource: "ted_60", ofType: "m4a")!, decodeOptions: .init(chunkingStrategy: .vad) ) cancellable?.cancel() @@ -1788,7 +1788,7 @@ final class UnitTests: XCTestCase { let startTime = Date() let audioComponents = audioFile.components(separatedBy: ".") - guard let audioFileURL = Bundle.module.path(forResource: audioComponents.first, ofType: audioComponents.last) else { + guard let audioFileURL = Bundle.current.path(forResource: audioComponents.first, ofType: audioComponents.last) else { XCTFail("Audio file not found") return } diff --git a/fastlane/Fastfile b/fastlane/Fastfile new file mode 100644 index 0000000..2811e1b --- /dev/null +++ b/fastlane/Fastfile @@ -0,0 +1,447 @@ +# For licensing see accompanying LICENSE.md file. +# Copyright © 2024 Argmax, Inc. All rights reserved. 
+ +# This file contains the fastlane.tools configuration +# You can find the documentation at https://docs.fastlane.tools +# +# For a list of all available actions, check out +# https://docs.fastlane.tools/actions +# +# For a list of all available plugins, check out +# https://docs.fastlane.tools/plugins/available-plugins + +require 'date' +require 'fileutils' +require 'json' +require 'pathname' + +COMMIT_HASH = `git rev-parse --short HEAD`.strip +COMMIT_EPOCH = `git log -1 --format=%ct`.strip +COMMIT_TIMESTAMP = Time.at(COMMIT_EPOCH.to_i).utc.strftime('%Y-%m-%dT%H%M%S') +XCODE_TEAM_ID = `defaults read com.apple.dt.Xcode IDEProvisioningTeamManagerLastSelectedTeamID`.strip +WORKING_DIR = Dir.pwd +BASE_BENCHMARK_PATH = "#{WORKING_DIR}/benchmark_data".freeze +BASE_UPLOAD_PATH = "#{WORKING_DIR}/upload_folder".freeze +XCRESULT_PATH = File.expand_path("#{BASE_BENCHMARK_PATH}/#{COMMIT_TIMESTAMP}_#{COMMIT_HASH}/") +BENCHMARK_REPO = 'argmaxinc/whisperkit-evals-dataset'.freeze +BENCHMARK_CONFIGS = { + full: { + test_identifier: 'WhisperAXTests/RegressionTests/testModelPerformance', + name: 'full', + models: [ + 'openai_whisper-tiny', + 'openai_whisper-tiny.en', + 'openai_whisper-base', + 'openai_whisper-base.en', + 'openai_whisper-small', + 'openai_whisper-small.en', + 'openai_whisper-large-v2', + 'openai_whisper-large-v2_949MB', + 'openai_whisper-large-v2_turbo', + 'openai_whisper-large-v2_turbo_955MB', + 'openai_whisper-large-v3', + 'openai_whisper-large-v3_947MB', + 'openai_whisper-large-v3_turbo', + 'openai_whisper-large-v3_turbo_954MB', + 'distil-whisper_distil-large-v3', + 'distil-whisper_distil-large-v3_594MB', + 'distil-whisper_distil-large-v3_turbo', + 'distil-whisper_distil-large-v3_turbo_600MB', + 'openai_whisper-large-v3-v20240930', + 'openai_whisper-large-v3-v20240930_turbo', + 'openai_whisper-large-v3-v20240930_626MB', + 'openai_whisper-large-v3-v20240930_turbo_632MB' + ] + }, + debug: { + test_identifier: 
'WhisperAXTests/RegressionTests/testModelPerformanceWithDebugConfig', + name: 'debug', + models: ['tiny', 'crash_test', 'unknown_model', 'small.en'] + } +}.freeze + +default_platform(:ios) + +platform :ios do + desc 'List all connected devices' + lane :list_devices do + devices = available_devices + UI.message 'Connected devices:' + devices.each do |device| + UI.message device + end + end + + desc 'Benchmark devices with options' + lane :benchmark do |options| + devices = options[:devices] + config = options[:debug] ? :debug : :full + + if devices + devices = devices.split(',').map(&:strip) if devices.is_a?(String) + benchmark_specific_devices(devices: devices, config: config) + else + benchmark_connected_devices(config: config) + end + end + + # Update the extract_results lane to match + desc 'Extract benchmark results' + lane :extract_results do |options| + # CLI Comment: To use a specific result bundle path, pass it as an option like this: `fastlane extract_results result_bundle_path:/path/to/your/xcresult` + # CLI Comment: If no path is provided, the default path will be used based on the commit hash and configuration. + devices = options[:devices] + config = options[:debug] ? :debug : :full + + # Use the provided result bundle path if available, otherwise use the default + xcresult_bundle = options[:result_bundle_path] || "WhisperAX_#{COMMIT_HASH}_#{BENCHMARK_CONFIGS[config][:name]}.xcresult" + # Ensure the path is expanded to an absolute path + xcresult_bundle = File.expand_path(xcresult_bundle) + + devices = devices.split(',').map(&:strip) if devices && devices.is_a?(String) + + extract_xcresult_attachments(xcresult_bundle, devices: devices, config: config) + end + + desc 'Upload benchmark results' + lane :upload_results do |_options| + UI.message 'Uploading benchmark results to Hugging Face dataset...' 
+ upload_results + end +end + +def available_devices + # Run the devicectl command, capturing only stdout (JSON output) + devices_json = sh('xcrun devicectl list devices --json-output - 2>/dev/null', log: false) + + # Read and parse the JSON file + devices_data = JSON.parse(devices_json) + + # Extract device information + devices = devices_data['result']['devices'].map do |device| + device_name = device['deviceProperties']['name'] + device_type = device['hardwareProperties']['marketingName'] + platform = device['hardwareProperties']['platform'] + os_version = device['deviceProperties']['osVersionNumber'] + device_product = device['hardwareProperties']['productType'] + udid = device['hardwareProperties']['udid'] + state = device['connectionProperties']['tunnelState'] + + unless device_type && (device_type.include?('iPhone') || device_type.include?('iPad')) && !state.include?('unavailable') + next + end + + { + name: device_name, + type: device_type, + platform: platform, + os_version: os_version, + product: device_product, + id: udid, + state: state + } + end.compact + + # Add the current Mac + mac_system_info = JSON.parse(sh('system_profiler SPHardwareDataType -json', log: false)) + mac_type = mac_system_info['SPHardwareDataType'][0]['chip_type'] + mac_name = mac_system_info['SPHardwareDataType'][0]['machine_model'] + mac_udid = mac_system_info['SPHardwareDataType'][0]['platform_UUID'] + mac_info = `sw_vers` + mac_version = mac_info.match(/ProductVersion:\s+(.+)/)[1] + mac_platform = mac_info.match(/ProductName:\s+(.+)/)[1] + + devices << { + name: 'My Mac', + type: mac_type, + platform: mac_platform, + os_version: mac_version, + product: mac_name, + id: mac_udid, + state: 'connected' + } + + devices +end + +def benchmark_connected_devices(config:) + run_benchmarks(devices: available_devices, config: config) +end + +desc 'Benchmark specific devices' +def benchmark_specific_devices(devices:, config:) + all_devices = available_devices + selected_devices = 
all_devices.select { |device| devices.include?(device[:name]) } + + UI.user_error!("No matching devices found for the names provided: #{devices}") if selected_devices.empty? + + run_benchmarks(devices: selected_devices, config: config) +end + +def run_benchmarks(devices:, config:) + UI.user_error!('No matching devices found.') if devices.empty? + + UI.message "Devices to benchmark (#{BENCHMARK_CONFIGS[config][:name]} mode):" + devices.each { |device| UI.message device } + + # Remove existing xcresults that start with device[:product] + devices.each do |device| + product_pattern = File.join(XCRESULT_PATH, "#{device[:product]}*") + UI.message "Removing existing xcresults for device: #{device[:product]}" + # Check if the file exists + if Dir.glob(product_pattern).any? + sh("trash #{product_pattern}") + else + UI.message "No xcresults found for device: #{device[:product]}" + end + end + + team_id = XCODE_TEAM_ID + if team_id.empty? + UI.user_error!('Development Team ID not found. Please log into Xcode with your Apple ID.') + else + UI.message("Using Development Team ID: #{team_id}") + end + + run_benchmark(devices, config) +end + +def run_benchmark(devices, config) + summaries = [] + BENCHMARK_CONFIGS[config][:models].each do |model| + begin + # Sanitize device name for use in file path + devices_to_test = devices.map { |device_info| device_info[:name] }.compact + destinations = devices.map do |device_info| + "platform=#{device_info[:platform]},name=#{device_info[:name]}" + end.compact + + UI.message "Output path: #{XCRESULT_PATH}" + + # Ensure the directory exists + FileUtils.mkdir_p(XCRESULT_PATH) + + # Generate a unique name for the xcresult bundle + result_bundle_name = "WhisperAX_#{COMMIT_HASH}_#{BENCHMARK_CONFIGS[config][:name]}.xcresult" + + # Safely remove any existing xcresult bundle + xcresult_bundle = File.join(XCRESULT_PATH, result_bundle_name) + + if File.exist?(xcresult_bundle) + UI.message "Removing existing xcresult bundle: #{xcresult_bundle}" + 
sh("trash #{xcresult_bundle}") + end + + UI.message "Running scan with result bundle path: #{xcresult_bundle}" + UI.message "Running in #{BENCHMARK_CONFIGS[config][:name]} mode" + + UI.message "Running benchmark for model: #{model}" + xcargs = [ + "MODEL_NAME=#{model}", + '-allowProvisioningUpdates', + "DEVELOPMENT_TEAM=#{XCODE_TEAM_ID}" + ].join(' ') + + scan_result = scan( + project: 'Examples/WhisperAX/WhisperAX.xcodeproj', + scheme: 'WhisperAX', + clean: false, + devices: devices_to_test, + skip_detect_devices: true, + only_testing: [BENCHMARK_CONFIGS[config][:test_identifier]], + xcargs: xcargs, + destination: destinations, + result_bundle_path: xcresult_bundle, + output_directory: XCRESULT_PATH, + suppress_xcode_output: false, + result_bundle: true, + # show_xcode_test_logs: true + buildlog_path: XCRESULT_PATH, + # include_simulator_logs: true, + output_style: 'raw', + fail_build: false + ) + extract_xcresult_attachments(xcresult_bundle, devices: devices, config: config) + summaries << { model: model, success: scan_result } + rescue StandardError => e + UI.error('Model failed. Continuing with next model') + UI.message(e.message) + summaries << { model: model, success: false, error: e.message } + end + end + + merge_all_summaries(summaries, devices, config) +end + +def extract_xcresult_attachments(xcresult_bundle, devices:, config:) + UI.message "Starting extraction of attachments from #{xcresult_bundle}..." + + # Check if the file exists + if File.exist?(xcresult_bundle) + UI.success "xcresult file found at: #{xcresult_bundle}" + else + UI.error "xcresult file does not exist at: #{xcresult_bundle}" + UI.message "Current directory contents of #{xcresult_bundle}:" + Dir.glob(File.join(File.dirname(xcresult_bundle), '*')).each do |file| + UI.message " #{file}" + end + return + end + + # Get all ActionTestSummary IDs + UI.message 'Fetching ActionTestSummary IDs...' 
+ xcode_version = `xcodebuild -version | grep Xcode`.gsub('Xcode ', '') + legacy_flag = xcode_version.to_f >= 16 ? '--legacy' : '' + UI.message "Legacy flag: Xcode version - #{xcode_version.to_i}, flag - #{legacy_flag}" + + graph_output = sh("xcrun xcresulttool graph #{legacy_flag} --path '#{xcresult_bundle}' | grep ActionTestSummary -A1 | grep Id", + log: true) + action_test_summary_ids = graph_output.split("\n").map { |line| line.split.last } + UI.message "Found #{action_test_summary_ids.count} ActionTestSummary IDs" + + action_test_summary_ids.each_with_index do |xcid, index| + UI.message "Processing ActionTestSummary ID #{index + 1} of #{action_test_summary_ids.count}: #{xcid}" + + json_output = sh("xcrun xcresulttool get #{legacy_flag} --format json --path '#{xcresult_bundle}' --id '#{xcid}'", + log: false) + parsed_json = JSON.parse(json_output) + + attachments = parsed_json.dig('activitySummaries', '_values')&.flat_map do |summary| + summary.dig('attachments', '_values') + end&.compact || [] + + UI.message "Found #{attachments.count} attachments for this summary" + + attachments.each_with_index do |attachment, att_index| + ref = attachment.dig('payloadRef', 'id', '_value') + filename = attachment.dig('filename', '_value') + + if ref && filename + UI.message "Extracting attachment #{att_index + 1} of #{attachments.count}: #{filename}" + + output_path = File.join(File.dirname(xcresult_bundle), filename) + sh("xcrun xcresulttool get #{legacy_flag} --path '#{xcresult_bundle}' --id '#{ref}' > '#{output_path}'") + + if File.exist?(output_path) + UI.success "Successfully extracted: #{output_path}" + else + UI.error "Failed to extract: #{output_path}" + end + else + UI.error "Invalid attachment data for attachment #{att_index + 1}" + end + end + end + + UI.success "Extraction complete. 
Total ActionTestSummaries processed: #{action_test_summary_ids.count}" +end + +def merge_all_summaries(summaries, devices, _config) + files_to_upload = [] + devices.each do |device| + merged_data = { + osType: device[:platform], + failureInfo: {}, + osVersion: device[:os_version], + modelsTested: [], + deviceModel: device[:type], + deviceIdentifier: device[:product], + testResults: {}, + commitHash: COMMIT_HASH, + commitTimestamp: COMMIT_TIMESTAMP + } + + summaries.each do |result| + UI.message "Test result from fastlane: #{result}" + model = result[:model] + + unless result[:success] + merged_data[:modelsTested] << model + merged_data[:failureInfo][model] = result[:error] || 'Test failed' + end + end + + # Merge data from extracted xcresult attachments + summary_pattern = File.join(XCRESULT_PATH, "#{device[:product]}_summary_*.json".gsub(/\s+/, '_')) + Dir.glob(summary_pattern).each do |file| + attachment_data = JSON.parse(File.read(file)) + merged_data[:failureInfo].merge!(attachment_data['failureInfo']) if attachment_data['failureInfo'] + merged_data[:modelsTested] |= attachment_data['modelsTested'] if attachment_data['modelsTested'] + UI.message "Merging data from: #{file} #{attachment_data}" + end + + merged_data[:modelsTested].each do |model| + # Store the test result file path + result_pattern = File.join(XCRESULT_PATH, "#{device[:product]}_#{model.gsub('.', '_')}_20*.json".gsub(/\s+/, '_')) + merged_data[:testResults][model] = [] + Dir.glob(result_pattern).each do |file| + merged_data[:testResults][model] << File.basename(file) + files_to_upload << file + end + end + timestamp = Time.now.strftime('%Y-%m-%dT%H%M%S') + filename = "#{device[:product]}_summary_#{timestamp}.json".gsub(/\s+/, '_') + file_path = File.join(XCRESULT_PATH, filename) + + FileUtils.mkdir_p(File.dirname(file_path)) + File.write(file_path, JSON.pretty_generate(merged_data)) + UI.message "Created merged summary: #{file_path}" + files_to_upload << file_path + + 
prepare_upload(files_to_upload) + end +end + +def prepare_upload(files) + UI.message 'Preparing upload folder...' + upload_folder = File.expand_path(BASE_UPLOAD_PATH) + + # Clear out the existing upload folder + if Dir.exist?(upload_folder) && !Dir.glob("#{upload_folder}/*").empty? + UI.message "Clearing existing upload folder: #{upload_folder}" + sh("trash #{upload_folder}/*") + else + UI.message "Upload folder does not exist or is empty, creating: #{upload_folder}" + FileUtils.mkdir_p(upload_folder) + end + + # Copy the new data to the upload folder + files.each do |file| + relative_path = Pathname.new(file).relative_path_from(Pathname.new(WORKING_DIR)).to_s + destination = File.join(upload_folder, relative_path) + + UI.message "Copying #{file} to #{upload_folder}/#{relative_path}" + + # Ensure the destination directory exists + FileUtils.mkdir_p(File.dirname(destination)) + FileUtils.cp(file, destination) + end +end + +def upload_results + upload_folder = File.expand_path(BASE_UPLOAD_PATH) + + # Ensure the upload folder exists + unless Dir.exist?(upload_folder) + UI.user_error!("Upload folder does not exist: #{upload_folder}") + return + end + + # Get the git hash and timestamp for the PR branch name + timestamp = Time.now.strftime('%Y%m%d_%H%M%S') + branch_name = "benchmark_results_#{timestamp}" + + # Construct the huggingface-cli command + cmd = "huggingface-cli upload #{BENCHMARK_REPO} '#{upload_folder}' --repo-type dataset --create-pr" + + UI.message "Executing command: #{cmd}" + + # Execute the command + begin + result = sh(cmd) + UI.success 'Successfully uploaded benchmark results and created a pull request.' 
+ UI.message "Command output: #{result}" + rescue StandardError => e + UI.error "Failed to upload benchmark results: #{e.message}" + end +end diff --git a/fastlane/README.md b/fastlane/README.md new file mode 100644 index 0000000..d850eb6 --- /dev/null +++ b/fastlane/README.md @@ -0,0 +1,56 @@ +fastlane documentation +---- + +# Installation + +Make sure you have the latest version of the Xcode command line tools installed: + +```sh +xcode-select --install +``` + +For _fastlane_ installation instructions, see [Installing _fastlane_](https://docs.fastlane.tools/#installing-fastlane) + +# Available Actions + +## iOS + +### ios list_devices + +```sh +[bundle exec] fastlane ios list_devices +``` + +List all connected devices + +### ios benchmark + +```sh +[bundle exec] fastlane ios benchmark +``` + +Benchmark devices with options + +### ios extract_results + +```sh +[bundle exec] fastlane ios extract_results +``` + +Extract benchmark results + +### ios upload_results + +```sh +[bundle exec] fastlane ios upload_results +``` + +Upload benchmark results + +---- + +This README.md is auto-generated and will be re-generated every time [_fastlane_](https://fastlane.tools) is run. + +More information about _fastlane_ can be found on [fastlane.tools](https://fastlane.tools). + +The documentation of _fastlane_ can be found on [docs.fastlane.tools](https://docs.fastlane.tools).